File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/91/h91-1057_abstr.xml
Size: 1,338 bytes
Last Modified: 2025-10-06 13:47:11
<?xml version="1.0" standalone="yes"?> <Paper uid="H91-1057"> <Title>A Dynamic Language Model for Speech Recognition</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> ABSTRACT </SectionTitle> <Paragraph position="0"> In the case of a trigram language model, the probability of the next word conditioned on the previous two words is estimated from a large corpus of text. The resulting static trigram language model (STLM) has fixed probabilities that are independent of the document being dictated. To improve the language model (LM), one can adapt the probabilities of the trigram language model to match the current document more closely. The partially dictated document provides significant clues about what words are more likely to be used next. Of many methods that can be used to adapt the LM, we describe in this paper a simple model based on the trigram frequencies estimated from the partially dictated document. We call this model a cache trigram language model (CTLM) since we are caching the recent history of words. We have found that the CTLM reduces the perplexity of a dictated document by 23%. The error rate of a 20,000-word isolated word recognizer decreases by about 5% at the beginning of a document and by about 24% after a few hundred words.</Paragraph> </Section> </Paper>
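<!--
Editorial sketch, not part of the original record. The abstract describes estimating trigram frequencies from the partially dictated document and using them to adapt a static model. The Python below is one minimal way this could look. The class name CacheTrigramLM, the function interpolate, and the cache_weight value of 0.2 are all hypothetical; the abstract does not say how the cache and static estimates are combined, so the linear interpolation here is an assumption.

```python
from collections import Counter


class CacheTrigramLM:
    """Toy cache model: relative trigram frequencies over the words dictated so far."""

    def __init__(self):
        self.words = []                  # partially dictated document
        self.context_counts = Counter()  # times (w1, w2) was followed by some word
        self.trigram_counts = Counter()  # times (w1, w2, w3) occurred

    def observe(self, word):
        """Add the next dictated word and update the cached counts."""
        if len(self.words) >= 2:
            w1, w2 = self.words[-2], self.words[-1]
            self.context_counts[(w1, w2)] += 1
            self.trigram_counts[(w1, w2, word)] += 1
        self.words.append(word)

    def prob(self, w1, w2, w3):
        """Relative frequency of w3 after (w1, w2); 0.0 for an unseen context."""
        total = self.context_counts[(w1, w2)]
        if total == 0:
            return 0.0
        return self.trigram_counts[(w1, w2, w3)] / total


def interpolate(static_prob, cache_prob, cache_weight=0.2):
    """Assumed combination rule: linear mix of static and cache estimates."""
    return cache_weight * cache_prob + (1.0 - cache_weight) * static_prob


# Usage: feed the dictated words, then query the adapted probability.
cache = CacheTrigramLM()
for w in "the cache model and the cache size".split():
    cache.observe(w)

# The static probability of 0.01 is a made-up placeholder value.
adapted = interpolate(0.01, cache.prob("the", "cache", "model"))
```

Because the cache holds few counts early in a document, a small cache_weight keeps the static model dominant until more words have been dictated.
-->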