
<?xml version="1.0" standalone="yes"?>
<Paper uid="P97-1047">
  <Title>Decoding Algorithm in Statistical Machine Translation</Title>
  <Section position="3" start_page="0" end_page="366" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="366" type="sub_section">
      <SectionTitle>
1.1 Statistical Machine Translation
</SectionTitle>
      <Paragraph position="0"> Statistical machine translation is based on a channel model. Given a sentence T in one language (German) to be translated into another language (English), it considers T as the target of a communication channel, and its translation S as the source of the channel. Hence the machine translation task becomes to recover the source from the target. Basically every English sentence is a possible source for a German target sentence. If we assign a probability P(S I T) to each pair of sentences (S, T), then the problem of translation is to find the source S for a given target T, such that P(S \[ T) is the maximum.</Paragraph>
      <Paragraph position="1"> According to Bayes rule,</Paragraph>
      <Paragraph position="3"> Since the denominator is independent of S, we have</Paragraph>
      <Paragraph position="5"> Therefore a statistical machine translation system must deal with the following three problems: * Modeling Problem: How to depict the process of generating a sentence in a source language, and the process used by a channel to generate a target sentence upon receiving a source sentence? The former is the problem of language modeling, and the later is the problem of translation modeling. They provide a framework for calculating P(S) and P(W I S) in (2).</Paragraph>
      <Paragraph position="6"> * Learning Problem: Given a statistical language model P(S) and a statistical translation model P(T I S), how to estimate the parameters in these models from a bilingual corpus of sentences? null * Decoding Problem: With a fully specified (framework and parameters) language and translation model, given a target sentence T, how to efficiently search for the source sentence that satisfies (2).</Paragraph>
      <Paragraph position="7"> The modeling and learning issues have been discussed in (Brown et ah, 1993), where ngram model was used for language modeling, and five different translation models were introduced for the translation process. We briefly introduce the model 2 here, for which we built our decoder.</Paragraph>
      <Paragraph position="8"> In model 2, upon receiving a source English sentence e = el,. * -, el, the channel generates a German sentence g = gl, * * &amp;quot;, g,n at the target end in the fol- null lowing way: 1. With a distribution P(m I e), randomly choose the length m of the German translation g. In model 2, the distribution is independent of m and e:</Paragraph>
      <Paragraph position="10"> where e is a small, fixed number.</Paragraph>
      <Paragraph position="11"> 2. For each position i (0 &lt; i &lt; m) in g, find the corresponding position ai in e according to an alignment distribution P(ai I i, a~ -1, m, e). In model 2, the distribution only depends on i, ai and the length of the English and German sentences: null P(ai l i, a~-l,m,e) = a(ai l i, m,l)  3. Generate the word gl at the position i of the German sentence from the English word ea~ at  the aligned position ai of gi, according to a translation distribution P(gi t ~t~'~, st~i-t, e) = t(gl I ea~). The distribution here only depends on gi and eai.</Paragraph>
      <Paragraph position="12"> Therefore, P(g l e) is the sum of the probabilities of generating g from e over all possible alignments A, in which the position i in the target sentence g is aligned to the position ai in the source sentence e:</Paragraph>
      <Paragraph position="14"> (Brown et al., 1993) also described how to use the EM algorithm to estimate the parameters a(i I j,l, m) and $(g I e) in the aforementioned model.</Paragraph>
    </Section>
    <Section position="2" start_page="366" end_page="366" type="sub_section">
      <SectionTitle>
1.2 Decoding in Statistical Machine
Translation
</SectionTitle>
      <Paragraph position="0"> (Brown et al., 1993) and (Vogel, Ney, and Tillman, 1996) have discussed the first two of the three problems in statistical machine translation. Although the authors of (Brown et al., 1993) stated that they would discuss the search problem in a follow-up arti* cle, so far there have no publications devoted to the decoding issue for statistical machine translation.</Paragraph>
      <Paragraph position="1"> On the other side, decoding algorithm is a crucial part in statistical machine translation. Its performance directly affects the quality and efficiency of translation. Without a good and efficient decoding algorithm, a statistical machine translation system may miss the best translation of an input sentence even if it is perfectly predicted by the model.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>