<?xml version="1.0" standalone="yes"?>
<Paper uid="H92-1055">
  <Title>MULTIPLE APPROACHES TO ROBUST SPEECH RECOGNITION</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1. INTRODUCTION
</SectionTitle>
    <Paragraph position="0"> The need for speech recognition systems and spoken language systems to be robust with respect to their acoustical environment has become more widely appreciated in recent years (e.g. [1]).</Paragraph>
    <Paragraph position="1"> Results of several studies have demonstrated that even automatic speech recognition systems that are designed to be speaker independent can perform very poorly when they are tested using a different type of microphone or acoustical environment from the one with which they were trained (e.g. [2, 3]), even in a relatively quiet office environment. Applications such as speech recognition over telephones, in automobiles, on a factory floor, or outdoors demand an even greater degree of environmental robustness. The CMU speech group is committed to the development of speech recognition systems that are robust with respect to environmental variation, just as it has been an early proponent of speaker-independent recognition. While most of our work presented to date has described new acoustical pre-processing algorithms (e.g. [2, 4, 5]), we have always regarded pre-processing as one of several approaches that must be developed in concert to achieve robust recognition. The purpose of this paper is twofold. First, we describe our results for the DARPA benchmark evaluation for robust speech recognition for the ATIS task, discussing the effectiveness of our methods of acoustical pre-processing in the context of this task. Second, we describe and compare the effectiveness of three complementary methods of signal processing for robust speech recognition: acoustical pre-processing, microphone array processing, and the use of physiologically-motivated models of peripheral signal processing.</Paragraph>
  </Section>
</Paper>