<?xml version="1.0" standalone="yes"?>
<Paper uid="H92-1057">
  <Title>Experimental Results for Baseline Speech Recognition Performance using Input Acquired from a Linear Microphone Array</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
ABSTRACT
</SectionTitle>
    <Paragraph position="0"> In this paper, baseline speech recognition performance is determined both for a single remote microphone and for a signal derived from a delay-and-sum beamformer using an eight-microphone linear array.</Paragraph>
    <Paragraph position="1"> An HMM-based, connected-speech, 38-word vocabulary (alphabet, digits, 'space', 'period'), talker-independent speech recognition system is used for testing performance. Normal performance with no language model (i.e., raw word-level performance) is currently about 81% for a set of talkers not in the training set and about 91% for training set data. The system has been trained and tested using a close-talking head-mounted microphone. Since a meaningful comparison requires using the same speech, the existing speech database was appropriately pre-filtered, played out through a transducer (speaker) in the room environment, picked up by the microphone array, and re-stored as a digital file. The resulting file was post-processed and used as input to the recognizer; the recognition performance indicates the effect of the input device. The baseline experiment showed that both a single remote microphone and the beamformed signal reduced performance by 12% in a room with no other talkers. For the array tested, the error is generally attributable to reverberation off the floor and ceiling.</Paragraph>
  </Section>
</Paper>