<?xml version="1.0" standalone="yes"?> <Paper uid="H92-1021"> <Title>IMPROVEMENTS IN STOCHASTIC LANGUAGE MODELING</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 1. INTRODUCTION </SectionTitle> <Paragraph position="0"> Linguistic constraints are an important factor in human comprehension of speech. Their effect on automatic speech recognition is similar, in that they provide both a pruning method and a means of ordering likely candidates. As vocabularies for speech recognition systems increase in size, more accurate modeling of linguistic constraints becomes essential.</Paragraph> <Paragraph position="1"> Two fundamental issues in language modeling are smoothing and adaptation. Smoothing allows a model to assign reasonable probabilities to events that have never been observed before. Adaptation takes advantage of recently gained knowledge -- the text seen so far -- to adjust the model's expectations.</Paragraph> <Paragraph position="2"> In what follows, we discuss two attempts at improving our current stochastic language modeling techniques. In the first, we try to improve smoothing by correcting a deficiency in a successful and well-known smoothing method, the backoff model. In the second, we propose a novel kind of adaptation, one that is based on correlation among word sequences occurring in the same document.</Paragraph> </Section> </Paper>