File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/00/c00-2163_intro.xml
Size: 2,494 bytes
Last Modified: 2025-10-06 14:00:54
<?xml version="1.0" standalone="yes"?> <Paper uid="C00-2163"> <Title>A Comparison of Alignment Models for Statistical Machine Translation</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> In statistical machine translation (SMT) it is necessary to model the translation probability $\Pr(f_1^J \mid e_1^I)$.</Paragraph> <Paragraph position="1"> Here $f_1^J = f$ denotes the (French) source string and $e_1^I = e$ denotes the (English) target string. Most SMT models (Brown et al., 1993; Vogel et al., 1996) try to model word-to-word correspondences between source and target words using an alignment mapping from source position $j$ to target position $i = a_j$.</Paragraph> <Paragraph position="2"> We can rewrite the probability $\Pr(f_1^J \mid e_1^I)$ by introducing the 'hidden' alignments $a_1^J := a_1 \ldots a_j \ldots a_J$: $\Pr(f_1^J \mid e_1^I) = \sum_{a_1^J} \Pr(f_1^J, a_1^J \mid e_1^I)$.</Paragraph> <Paragraph position="4"> To allow for French words which do not directly correspond to any English word, an artificial 'empty' word $e_0$ is added to the target sentence at position $i = 0$.</Paragraph> <Paragraph position="5"> The different alignment models we present provide different decompositions of $\Pr(f_1^J, a_1^J \mid e_1^I)$. An alignment $\hat{a}_1^J$ for which $\hat{a}_1^J = \arg\max_{a_1^J} \Pr(f_1^J, a_1^J \mid e_1^I)$ holds</Paragraph> <Paragraph position="7"> for a specific model is called the Viterbi alignment of this model.</Paragraph> <Paragraph position="8"> In this paper we describe extensions to the Hidden-Markov alignment model from (Vogel et al., 1996) and compare these to Models 1-4 of (Brown et al., 1993). We propose to measure the quality of an alignment model using the quality of the Viterbi alignment compared to a manually produced alignment. This has the advantage that, once a reference alignment has been produced, the evaluation itself can be performed automatically.
In addition, it results in a very precise and reliable evaluation criterion which is well suited to assess various design decisions in modeling and training of statistical alignment models. It is well known that manually performing a word alignment is a complicated and ambiguous task (Melamed, 1998). Therefore, to produce the reference alignment we use a refined annotation scheme which reduces the complications and ambiguities occurring in the manual construction of a word alignment. As we use the alignment models for machine translation purposes, we also evaluate the resulting translation quality of different models.</Paragraph> </Section> </Paper>