<?xml version="1.0" standalone="yes"?>
<Paper uid="H89-2038">
  <Title>Large-Vocabulary Speaker-Independent Continuous Speech Recognition with Semi-Continuous Hidden Markov Models</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
USA
</SectionTitle>
    <Paragraph position="0"> probability density functions make more assumptions than the discrete HMM, especially when the diagonal covariance Gaussian probability density is used for simplicity [15]. To obtain better recognition accuracy, acoustic parameters must be well chosen according to the assumptions of the continuous probability density functions used. The semi-continuous hidden Markov model (SCHMM) has been proposed to extend the discrete HMM by replacing discrete output probability distributions with a combination of the original discrete output probability distributions and continuous probability density functions of a Gaussian codebook [6]. In the SCHMM,</Paragraph>
    <Paragraph position="1"> each VQ codeword is regarded as a Gaussian probability density. Intuitively, from the discrete HMM point of view, the SCHMM tries to smooth the discrete output probabilities with multiple codeword candidates in the VQ procedure. From the continuous mixture HMM point of view, the SCHMM ties all the continuous output probability densities across each individual HMM to form a shared Gaussian codebook, i.e. a mixture of Gaussian probability densities. With the SCHMM, the codebook and HMM can be jointly re-estimated to achieve an optimal codebook-model combination in the sense of the maximum likelihood criterion. Such tying can also substantially reduce the number of free parameters and the computational complexity in comparison to the continuous mixture HMM, while reasonably maintaining the modeling power of a mixture of a large number of probability density functions. The SCHMM has been shown to offer improved recognition accuracy in several speech recognition experiments [6, 8, 14, 2].</Paragraph>
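The smoothing described above can be illustrated with a minimal sketch. It assumes the semi-continuous output probability of a state is the sum, over codewords, of the Gaussian density of the observation under each codeword times that state's discrete output probability for the codeword. The function names, the array layout, and the optional `top_m` pruning of codeword candidates are hypothetical choices made for illustration and are not taken from the paper.

```python
import numpy as np

def gaussian_diag_pdf(x, means, variances):
    """Density of x under each diagonal-covariance Gaussian codeword.

    x: shape (D,); means and variances: shape (L, D) for L codewords.
    """
    diff = x - means
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)
    log_exp = -0.5 * np.sum(diff * diff / variances, axis=1)
    return np.exp(log_norm + log_exp)

def schmm_output_prob(x, means, variances, b_state, top_m=None):
    """Semi-continuous output probability: sum_j f(x | v_j) * b_state[j].

    b_state holds the discrete output probabilities of one HMM state.
    top_m (assumed to be a positive integer when given) keeps only the
    top_m most likely codewords, a hypothetical pruning step.
    """
    dens = gaussian_diag_pdf(x, means, variances)
    if top_m is not None:
        keep = np.argsort(dens)[-top_m:]  # indices of the largest densities
        return float(np.dot(dens[keep], b_state[keep]))
    return float(np.dot(dens, b_state))
```

Because `means` and `variances` are shared by every state, only `b_state` differs from state to state, which is where the reduction in free parameters relative to a continuous mixture HMM comes from.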
    <Paragraph position="2"> In this study, the SCHMM is applied to Sphinx, a speaker-independent continuous speech recognition system. Sphinx uses multiple VQ codebooks for each acoustic observation [12]. To apply the SCHMM to Sphinx, the SCHMM algorithm must be modified to accommodate multiple codebooks and multiple codeword combinations. For the SCHMM re-estimation algorithm, a modified unified re-estimation algorithm for multiple VQ codebooks and hidden Markov models is proposed in this paper. The applicability of the SCHMM to speaker-independent continuous speech is explored based on 200 generalized triphone models [12]. In the 1000-word speaker-independent continuous speech recognition task using word-pair grammar, the error rate was reduced by more than 29% and 41% in comparison to the corresponding discrete HMM and continuous mixture HMM respectively.</Paragraph>
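One way to picture the multiple-codebook case is sketched below. It assumes, purely for illustration, that the codebooks are treated as statistically independent, so the per-codebook semi-continuous output probabilities of a state multiply (and their logarithms add). This section does not state how the codebooks are combined, so the independence assumption, the function name `multi_codebook_log_prob`, and the input layout are all hypothetical.

```python
import numpy as np

def multi_codebook_log_prob(densities, weights):
    """Log output probability of one state over several codebooks.

    densities[c][j]: density of the c-th feature vector under codeword j
                     of codebook c (computed elsewhere).
    weights[c][j]:   the state's discrete output probability for codeword j
                     of codebook c.
    Assumption: codebooks are independent, so per-codebook terms multiply.
    """
    total = 0.0
    for dens_c, b_c in zip(densities, weights):
        # Semi-continuous probability for this codebook, then accumulate logs.
        total += np.log(np.dot(dens_c, b_c))
    return float(total)
```

Working in the log domain avoids underflow when many small probabilities are multiplied along an utterance.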
  </Section>
</Paper>