<?xml version="1.0" standalone="yes"?>
<Paper uid="P04-1066">
  <Title>Improving IBM Word-Alignment Model 1</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
2 Definition of Model 1
</SectionTitle>
    <Paragraph position="0"> Model 1 is a probabilistic generative model within a framework that assumes a source sentence S of length l translates as a target sentence T, according to the following stochastic process:  * A length m for sentence T is generated.</Paragraph>
    <Paragraph position="1"> * For each target sentence position j [?] {1,...,m}: - A generating word s i in S (including a null word s 0 ) is selected, and - The target word t j at position j is generated depending on s</Paragraph>
    <Paragraph position="3"> Model 1 is defined as a particularly simple instance of this framework, by assuming all possible lengths for T (less than some arbitrary upper bound) have a uniform probability epsilon1, all possible choices of source sentence generating words are equally likely, and the translation probability tr(t</Paragraph>
    <Paragraph position="5"> ) of the generated target language word depends only on the generating source language word--which Brown et al. (1993a) show yields the following equation:</Paragraph>
    <Paragraph position="7"> Equation 1 gives the Model 1 estimate for the probability of a target sentence, given a source sentence. We may also be interested in the question of what is the most likely alignment of a source sentence and a target sentence, given an instance of Model 1; where, by an alignment, we mean a specification of which source words generated which target words according to the generative model. Since Model 1, like many other word-alignment models, requires each target word to be generated by exactly one source word (including the null word), an alignment a can be represented by a vector a</Paragraph>
    <Paragraph position="9"> is the sentence position of the source word generating t j according to the alignment. It is easy to show that for Model 1, the most likely alignment ^a of S and T is given by this equation:</Paragraph>
    <Paragraph position="11"> Since in applying Model 1, there are no dependencies between any of the a j s, we can find the most likely aligment simply by choosing, for each j, the value for a j that leads to the highest value for</Paragraph>
    <Paragraph position="13"> The parameters of Model 1 for a given pair of languages are normally estimated using EM, taking as training data a corpus of paired sentences of the two languages, such that each pair consists of sentence in one language and a possible translation in the other language. The training is normally initialized by setting all translation probability distributions to the uniform distribution over the target language vocabulary.</Paragraph>
  </Section>
class="xml-element"></Paper>