<?xml version="1.0" standalone="yes"?> <Paper uid="C04-1014"> <Title>Modeling of Long Distance Context Dependency</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Language modeling is the attempt to characterize, capture and exploit the regularities and constraints in a natural language, and it has been successfully applied to many domains. Among all the language modeling approaches, ngram models have been the most widely used in speech recognition (Jelinek 1990; Gale and Church 1990; Brown et al. 1992; Yang et al. 1996) and other applications. Although ngram models are simple and have been used successfully in these tasks, they have obvious deficiencies.</Paragraph> <Paragraph position="1"> For instance, an ngram model can capture only the short-distance dependency within an n-word window, and the largest practical n for a natural language is currently three.</Paragraph> <Paragraph position="2"> Meanwhile, many preferred relationships have been found to exist between words. Two examples of highly associated word pairs are &quot;not only/but also&quot; and &quot;doctor/nurse&quot;. Psychological experiments in Meyer D. et al. (1975) indicated that humans react more strongly and more quickly to a highly associated word pair than to a poorly associated one. Such preference information is very useful for natural language processing (Church K.W. et al. 1990; Hiddle D. et al. 1993; Rosenfeld R. 1994; Zhou G.D. et al. 1998). These preference relationships between words can extend from a short distance to a long distance. 
While conventional ngram models can capture the short-distance dependency, the long-distance dependency should also be exploited properly.</Paragraph> <Paragraph position="3"> The purpose of this paper is to propose a new modeling approach that captures the context dependency over both short and long distances, and to apply it to Mandarin speech recognition.</Paragraph> <Paragraph position="4"> This paper is organized as follows. Section 2 presents conventional ngram modeling, and Section 3 proposes a new modeling approach, named MI-ngram modeling. Section 4 describes its use in our Mandarin speech recognition system. Finally, we give a summary of this paper.</Paragraph> </Section> </Paper>