<?xml version="1.0" standalone="yes"?>
<Paper uid="H90-1062">
  <Title>IMPROVED ACOUSTIC MODELING FOR CONTINUOUS SPEECH RECOGNITION</Title>
  <Section position="9" start_page="324" end_page="325" type="concl">
    <SectionTitle>
6. SUMMARY
</SectionTitle>
    <Paragraph position="0"> We have reported on several improvements to one of the speaker-independent, continuous speech recognition systems developed at AT&amp;T Bell Laboratories. The improved acoustic modeling, including incorporation of inter-word units and an improved feature analysis, provided high word accuracies for all three DARPA evaluation sets using the word pair grammar. We have also developed a unit selection rule for selecting intra-word and inter-word units independently. We anticipate that with the proposed unit expansion rule, an even better set of units can be obtained which will further improve acoustic modeling techniques for continuous speech recognition.</Paragraph>
    <Paragraph position="1"> Based on current recognition performance, it seems fair to say that when task-specific training data are provided for acoustic modeling of the set of basic speech units using HMM's, high performance can be achieved for a large vocabulary, speaker-independent, continuous speech recognition task with a perplexity of about 60. However, there are still some open issues that need to be addressed. The reader is referred to a recent paper [15] for a discussion of some of those issues related to using HMM's for speech recognition. We list, in the following, a number of acoustic modeling issues which we believe to be essential for expanding the capabilities of our current continuous speech recognition system. They are: (1) Speech unit selection and modeling for task-independent applications; (2) Improved word discrimination based on some form of corrective training for continuous density HMM parameters (e.g. [16]); (3) Lexical modeling to deal with lexical variability in baseform pronunciation; and (4) Improved feature selection so that only those features useful to discrimination are included in the feature vector.</Paragraph>
    <Paragraph position="2"> Our work so far has focused mainly on speech recognition. We observed that short function words (e.g. &quot;a&quot;, &quot;the&quot;) are a major source of recognition errors. However, most of those errors can be corrected using a set of simple syntactic and semantic rules operating in a post-processing mode. For example, for the resource management task, we have developed a language decoder (decoupled from the acoustic decoder) that incorporates a set of simple rules. Our preliminary results [17] indicate that sentence accuracy for the FEB89 test set improved from 70% to close to 90% (with 98% word accuracy) when the top candidate string decoded using the word-pair grammar is used as input to this language decoder. When no grammatical constraints were used in speech decoding, the sentence accuracy improved from 24% to 67% (with 90% word accuracy). Except in cases where some key content words were misrecognized, the simple language analyzer properly decoded the noisy strings provided by the speech decoder without going back to the acoustic domain to request mismatch information.</Paragraph>
    <Paragraph position="3"> We believe the language decoder can be more effective if the speech decoder can provide more word and string hypotheses. One way to get more information is to use N-best string search strategies. Another way is to construct, in acoustic decoding, a phone lattice and a word lattice that contain more word hypotheses, and then generate recognized strings according to the language constraints. The effectiveness of such approaches in real spoken language tasks, such as the DARPA Air Travel Information System (ATIS) task, has yet to be evaluated.</Paragraph>
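    <Paragraph position="4"> To make the N-best idea concrete, the following minimal Python sketch selects, from a list of scored string hypotheses, the best-scoring one that a word-pair grammar accepts. It is an illustration only: the toy grammar, the scores, and the names is_grammatical and rescore_nbest are hypothetical assumptions and do not describe the language decoder or search procedure used in the system reported here.

```python
# Hypothetical sketch: pick the best-scoring N-best hypothesis that a
# word-pair grammar accepts. All names and data below are illustrative
# assumptions, not part of the system described in the paper.
from typing import Dict, List, Optional, Set, Tuple

# Maps each word to the set of words allowed to follow it.
WordPairGrammar = Dict[str, Set[str]]


def is_grammatical(words: List[str], grammar: WordPairGrammar) -> bool:
    """Return True if every adjacent word pair is allowed by the grammar."""
    return all(nxt in grammar.get(cur, set())
               for cur, nxt in zip(words, words[1:]))


def rescore_nbest(nbest: List[Tuple[float, List[str]]],
                  grammar: WordPairGrammar) -> Optional[List[str]]:
    """Return the highest-scoring grammatical hypothesis, or None."""
    # Higher score means a better acoustic match (e.g. a log-likelihood).
    for _score, words in sorted(nbest, key=lambda h: h[0], reverse=True):
        if is_grammatical(words, grammar):
            return words
    return None


# Toy word-pair grammar (assumed for illustration).
toy_grammar = {
    "show": {"the", "all"},
    "the": {"ships"},
    "all": {"ships"},
    "ships": set(),
}

# Toy N-best list: the top acoustic hypothesis has dropped the function
# word "the", a typical error for short function words.
toy_nbest = [
    (-10.0, ["show", "ships"]),
    (-11.5, ["show", "the", "ships"]),
    (-12.0, ["show", "all", "ships"]),
]

best = rescore_nbest(toy_nbest, toy_grammar)
print(best)
```

In this toy case the top acoustic hypothesis violates the grammar, so the second-ranked string is returned. A lattice-based decoder would apply the same kind of constraint over a compact graph of word hypotheses rather than an explicit list.</Paragraph>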
  </Section>
</Paper>