<?xml version="1.0" standalone="yes"?> <Paper uid="P97-1022"> <Title>Fertility Models for Statistical Natural Language Understanding</Title> <Section position="3" start_page="0" end_page="168" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> The goal of a natural language understanding (NLU) system is to interpret a user's request and respond with an appropriate action. We view this interpretation as translation from a natural language expression, E, into an equivalent expression, F, in an unambiguous formal language. Typically, this formal language will be hand-crafted to enhance performance on some task-specific domain. A statistical NLU system translates a request E as the most likely formal expression F̂ according to a probability model p: F̂ = argmax_F p(F|E) = argmax_F p(F, E).</Paragraph> <Paragraph position="1"> We have previously built a fully automatic statistical NLU system (Epstein et al., 1996) based on the source-channel factorization of the joint distribution p(F, E): p(F, E) = p(F)p(E|F).</Paragraph> <Paragraph position="2"> This factorization, which has proven effective in speech recognition (Bahl, Jelinek, and Mercer, 1983), partitions the joint probability into an a priori intention model p(F), and a translation model p(E|F) which models how a user might phrase a request F in English.</Paragraph> <Paragraph position="3"> For the ATIS task, our formal language is a minor variant of the NL-Parse (Hemphill, Godfrey, and Doddington, 1990) used by ARPA to annotate the ATIS corpus. An example of a formal and natural language pair is: * F: List flights from New Orleans to Memphis flying on Monday departing early_morning * E: do you have any flights going to Memphis leaving New Orleans early Monday morning Here, the evidence for the formal language concept 'early_morning' resides in the two disjoint clumps of English 'early' and 'morning'. 
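As an illustrative aside (not from the paper), the decoding rule F̂ = argmax_F p(F)p(E|F) can be sketched as a search over candidate formal expressions. All names and probabilities below are invented toy values, not trained models.

```python
import math

def decode(english, candidates, prior, trans):
    """Return the formal expression F maximizing p(F) * p(E|F),
    which equals argmax over F of p(F, E)."""
    best_f, best_logp = None, -math.inf
    for f in candidates:
        # Score in log space: log p(F) + log p(E|F).
        logp = math.log(prior[f]) + math.log(trans[(english, f)])
        if logp > best_logp:
            best_f, best_logp = f, logp
    return best_f

# Toy intention model p(F) and translation model p(E|F).
prior = {"LIST_FLIGHTS": 0.7, "LIST_FARES": 0.3}
trans = {("show me flights", "LIST_FLIGHTS"): 0.4,
         ("show me flights", "LIST_FARES"): 0.05}
print(decode("show me flights", list(prior), prior, trans))  # → LIST_FLIGHTS
```

In a real system the candidate set would be produced by a search procedure rather than enumerated, but the scoring is the same source-channel product.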
In this paper, we introduce the notion of concept fertility into our translation models p(E|F) to capture this effect and the more general linguistic phenomenon of embedded clauses. Basically, this entails augmenting the translation model with terms of the form p(n|f), where n is the number of clumps generated by the formal language word f. The resulting model can be trained automatically from a bilingual corpus of English and formal language sentence pairs.</Paragraph> <Paragraph position="4"> Other attempts at statistical NLU systems have used various meaning representations such as concepts in the AT&T system (Levin and Pieraccini, 1995) or initial semantic structure in the BBN system (Miller et al., 1995). Both of these systems require significant rule-based transformations to produce disambiguated interpretations, which are then used to generate the SQL query for ATIS. More recently, BBN has replaced handwritten rules with decision trees (Miller et al., 1996). Moreover, both systems were trained using English annotated by hand with segmentation and labeling, and both systems produce a semantic representation which is forced to preserve the time order expressed in the English. Interestingly, both the AT&T and BBN systems generate words within a clump according to bigram models. Other statistical approaches to NLU include decision trees (Kuhn and Mori, 1995) and neural nets (Gorin et al., 1991).</Paragraph> <Paragraph position="5"> In earlier IBM translation systems (Brown et al., 1993) each English word would be generated by, or &quot;aligned to&quot;, exactly one formal language word. This mapping between the English and formal language expressions is called the &quot;alignment&quot;. In the simplest case, the translation model is simply proportional to the product of word-pair translation probabilities, one per element in the alignment. In these models, the alignment provides all of the structure in the translation model. 
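To make the fertility terms concrete, here is a hedged sketch (invented toy numbers, not the paper's trained model) of scoring one alignment: the usual product of word-pair translation probabilities, multiplied by a fertility term p(n|f) for the number of clumps each formal word generates. For simplicity, each aligned English word stands in for one clump.

```python
import math
from collections import Counter

def alignment_logprob(alignment, t_table, fert_table):
    """Log-score one alignment: word-pair translation probabilities
    t(e|f) plus fertility terms p(n|f), all in log space."""
    logp = 0.0
    for e_word, f_word in alignment:
        logp += math.log(t_table[(e_word, f_word)])      # translation terms
    for f_word, n in Counter(f for _, f in alignment).items():
        logp += math.log(fert_table[(n, f_word)])        # fertility p(n|f)
    return logp

# 'early_morning' generates the two disjoint clumps 'early' and 'morning'.
t_table = {("early", "early_morning"): 0.3,
           ("morning", "early_morning"): 0.3}
fert_table = {(2, "early_morning"): 0.5}
align = [("early", "early_morning"), ("morning", "early_morning")]
score = alignment_logprob(align, t_table, fert_table)
```

Training would sum such scores over all hidden alignments and clumpings via EM rather than score a single fixed alignment.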
The alignment is a &quot;hidden&quot; quantity which is not annotated in the training data and must be inferred indirectly. The EM algorithm (Dempster, Laird, and Rubin, 1977) used to train such &quot;hidden&quot; models requires us to sum an expression over all possible alignments.</Paragraph> <Paragraph position="6"> These early models were developed for French to English translation. However, in NLU there is a fundamental asymmetry between the natural language and the unambiguous formal language. Most notably, one formal language word may frequently correspond to whole English phrases. We added the &quot;clump&quot;, an extra layer of structure, to accommodate this phenomenon (Epstein et al., 1996). In this paradigm, formal language words first generate a clumping, or partition, of the word slots of the English expression. Then, each clump is filled in according to a translation model as before. The alignment is defined between the formal language words and the clumps. Thus, both the alignment and the clumping are hidden structures which must be summed over to train the models.</Paragraph> <Paragraph position="7"> These models already represent significant progress. They learn automatically from a bilingual corpus of English and formal language sentences. They do not require linguistically knowledgeable experts to tediously annotate a training corpus. Rather, they rely upon a group of translators with significantly less linguistic knowledge to produce a bilingual training corpus. The fertility models introduced below maintain these benefits while slightly improving performance.</Paragraph> </Section> </Paper>