<?xml version="1.0" standalone="yes"?> <Paper uid="P06-1067"> <Title>Distortion Models For Statistical Machine Translation</Title>
<Section position="7" start_page="531" end_page="533" type="evalu"> <SectionTitle> 5 Experimental Results </SectionTitle>
<Paragraph position="0"> The phrase-based decoder we use is inspired by the decoder described in (Tillmann and Ney, 2003) and similar to that described in (Koehn, 2004). It is a multistack, multi-beam search decoder with n stacks (where n is the length of the source sentence being decoded) and a beam associated with each stack, as described in (Al-Onaizan, 2005). The search is done in n time steps. In time step i, only hypotheses that cover exactly i source words are extended. The beam search algorithm attempts to find the translation (i.e., the hypothesis that covers all source words) with the minimum cost, as in (Tillmann and Ney, 2003) and (Koehn, 2004). The distortion cost is added to the log-linear mixture of the hypothesis extension in a fashion similar to the language model cost.</Paragraph>
<Paragraph position="1"> A hypothesis covers a subset of the source words.</Paragraph>
<Paragraph position="2"> The final translation is a hypothesis that covers all source words and has the minimum cost among all possible hypotheses that cover all source words. (Exploring all possible hypotheses with all possible word permutations is computationally intractable; the search algorithm therefore yields an approximation to the optimal solution, and "all possible hypotheses" here refers to all hypotheses that were explored by the decoder.) A hypothesis h is extended by matching the phrase dictionary against source word sequences in the input sentence that are not covered in h. The cost of the new hypothesis is C(h_new) = C(h) + C(e), where C(e) is the cost of this extension. The main components of the cost of extension e can be defined by the following equation:</Paragraph>
<Paragraph position="3"> C(e) = C_LM(e) + C_TM(e) + C_D(e) </Paragraph>
<Paragraph position="4"> where C_LM(e) is the language model cost, C_TM(e) is the translation model cost, and C_D(e) is the distortion cost. The extension cost depends on the hypothesis being extended, the phrase being used in the extension, and the source word positions being covered.</Paragraph>
<Paragraph position="5"> The word reorderings that are explored by the search algorithm are controlled by two parameters s and w, as described in (Tillmann and Ney, 2003). The first parameter s denotes the number of source words that are temporarily skipped (i.e., temporarily left uncovered) during the search to cover a source word to the right of the skipped words. The second parameter is the window width w, which is defined as the distance (in number of source words) between the left-most uncovered source word and the right-most covered source word.</Paragraph>
<Paragraph position="6"> To illustrate these restrictions, let us assume the input sentence consists of the sequence (f1, f2, f3, f4). For s=1 and w=2, the permissible permutations are (f1, f2, f3, f4), (f2, f1, f3, f4), (f2, f3, f1, f4), (f1, f3, f2, f4), (f1, f3, f4, f2), and (f1, f2, f4, f3).</Paragraph>
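The six permutations above can be checked mechanically. The sketch below is one plain reading of the two restrictions: it bounds the number of distinct source words ever skipped by s, and the window distance by w. It reproduces exactly the six orders listed, though the precise bookkeeping in (Tillmann and Ney, 2003) may differ, so treat it as illustrative.

```python
from itertools import permutations

def is_permissible(order, s, w):
    """order: the sequence in which 1-indexed source positions are covered."""
    n = len(order)
    covered, skipped = set(), set()
    for pos in order:
        covered.add(pos)
        rightmost = max(covered)
        # every uncovered position left of the rightmost covered one is skipped
        skipped.update(p for p in range(1, rightmost) if p not in covered)
        if len(skipped) > s:
            return False
        uncovered = [p for p in range(1, n + 1) if p not in covered]
        if uncovered and rightmost - min(uncovered) > w:
            return False
    return True

print([p for p in permutations(range(1, 5)) if is_permissible(p, 1, 2)])
# -> the six orders from the text: (1,2,3,4), (1,2,4,3), (1,3,2,4),
#    (1,3,4,2), (2,1,3,4), (2,3,1,4)
```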
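To make the stack organization and the additive extension cost concrete, here is a minimal, runnable sketch of a multistack beam search. Everything in it is invented for illustration (the Hyp record, the phrase-table format, the lm_cost and d_cost callbacks, and the toy data at the end); it omits the s/w restrictions, hypothesis recombination, and future-cost estimation, so it is a sketch of the control flow rather than the authors' decoder.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Hyp:
    cost: float                              # accumulated cost C(h)
    coverage: frozenset = field(compare=False)  # covered source positions
    output: tuple = field(compare=False)        # target phrases emitted so far

def decode(src, phrase_table, lm_cost, d_cost, beam=10):
    n = len(src)
    # stacks[i] holds hypotheses covering exactly i source words
    stacks = [[] for _ in range(n + 1)]
    stacks[0].append(Hyp(0.0, frozenset(), ()))
    for i in range(n):                       # n time steps
        for hyp in heapq.nsmallest(beam, stacks[i]):
            for j in range(n):               # try any uncovered span [j, k)
                for k in range(j + 1, n + 1):
                    span = frozenset(range(j, k))
                    if span & hyp.coverage:
                        continue
                    for tgt, tm in phrase_table.get(src[j:k], []):
                        # C(h_new) = C(h) + C_TM(e) + C_LM(e) + C_D(e)
                        c_e = tm + lm_cost(hyp.output, tgt) + d_cost(hyp.coverage, span)
                        stacks[i + len(span)].append(
                            Hyp(hyp.cost + c_e, hyp.coverage | span, hyp.output + (tgt,)))
    return min(stacks[n]) if stacks[n] else None

# toy usage with a hypothetical two-word source and stand-in cost functions
pt = {("f1",): [("e1", 1.0)], ("f2",): [("e2", 1.0)], ("f1", "f2"): [("e12", 1.5)]}
lm = lambda ctx, tgt: 0.1 * len(ctx)
d = lambda cov, span: 0.0 if min(span) == (max(cov) + 1 if cov else 0) else 1.0
best = decode(("f1", "f2"), pt, lm, d)
print(best.output, best.cost)  # ('e12',) 1.5
```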
<Section position="1" start_page="532" end_page="532" type="sub_section"> <SectionTitle> 5.1 Experimental Setup </SectionTitle>
<Paragraph position="0"> The experiments reported in this section are in the context of SMT from Arabic into English. The training data is a 500K sentence-pair subsample of the 2005 Large Track Arabic-English Data for the NIST MT Evaluation. The language model used is an interpolated trigram model described in (Bahl et al., 1983). The language model is trained on the LDC English GigaWord Corpus. The test set used in the experiments in this section is the 2003 NIST MT Evaluation test set (which is not part of the training data).</Paragraph> </Section>
<Section position="2" start_page="532" end_page="532" type="sub_section"> <SectionTitle> 5.2 Reordering with Perfect Translations </SectionTitle>
<Paragraph position="0"> In the experiments in this section, we show the utility of a trigram language model in restoring the correct word order for English. The task is a simplified translation task, where the input is reordered English (English written in Arabic word order) and the output is English in the correct order. The source sentence is an English sentence reordered in the same manner we described in Section 3. The objective of the decoder is to recover the correct English order.</Paragraph>
<Paragraph position="1"> We use the same phrase-based decoder we use for our SMT experiments, except that only the language model cost is used here. Also, the phrase dictionary used is a one-to-one function that maps every English word in our vocabulary to itself. The language model we use for the experiments reported here is the same as the one used for the other experiments reported in this paper.</Paragraph>
<Paragraph position="2"> The results in Table 3 illustrate how the language model performs reasonably well for local reorderings (e.g., for s = 3 and w = 4), but its performance deteriorates as we relax the reordering restrictions by increasing the reordering window size (w). [Table 3 caption: The input is the reordered English in the reference. The 95% confidence ranges from 0.011 to 0.016. s is the number of words temporarily skipped, and w is the word permutation window size.]</Paragraph>
<Paragraph position="3"> Table 4 shows some examples of original English, English in Arabic order, and the decoder output for two different sets of reordering parameters. [Table 4 caption: English in its original order (Orig. Eng.) and decoding with two different parameter settings. Output1 is decoding with (s=3, w=4); Output2 is decoding with (s=4, w=12). The sentence lengths of the examples presented here are much shorter than the average in our test set (28.5).]</Paragraph> </Section>
<Section position="3" start_page="532" end_page="533" type="sub_section"> <SectionTitle> 5.3 SMT Experiments </SectionTitle>
<Paragraph position="0"> The phrases in the phrase dictionary we use in the experiments reported here are a combination of phrases automatically extracted from maximum-posterior alignments and maximum entropy alignments. Only phrases that conform to the so-called consistent alignment restrictions (Och et al., 1999) are extracted.</Paragraph>
<Paragraph position="1"> Table 5 shows BLEU scores for our SMT decoder with different parameter settings for skip s and window width w, with and without our distortion model. The BLEU scores reported in this table are based on 4 reference translations. The language model, phrase dictionary, and other decoder tuning parameters remain the same in all experiments reported in this table. [Table 5 caption (fragment): ... to 0.0176. s is the number of words temporarily skipped, and w is the word permutation window size.]</Paragraph>
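Since the Table 5 scores are computed against 4 reference translations, multi-reference BLEU of this kind can be computed with NLTK's corpus_bleu, where each hypothesis is paired with its list of references. The tokens below are invented toy data, not the paper's test material.

```python
from nltk.translate.bleu_score import corpus_bleu

# hypothetical toy data: one system output and its 4 reference translations
hyps = [["the", "cat", "sat", "on", "the", "mat"]]
refs = [[
    ["the", "cat", "sat", "on", "the", "mat"],
    ["a", "cat", "was", "sitting", "on", "the", "mat"],
    ["the", "cat", "was", "on", "the", "mat"],
    ["there", "was", "a", "cat", "on", "the", "mat"],
]]
print(corpus_bleu(refs, hyps))  # references come first; uniform 4-gram weights
```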
<Paragraph position="2"> Table 5 clearly shows that, as we open up the search and consider a wider range of word reorderings, the BLEU score decreases in the absence of our distortion model, when we rely solely on the language model. Wrong reorderings look attractive to the decoder via the language model, which suggests that we need a richer model with more parameters. In the absence of richer models such as the proposed distortion model, our results suggest that it is best to decode monotonically and allow only the local reorderings that are captured in our phrase dictionary.</Paragraph>
<Paragraph position="3"> However, when the distortion model is used, we see statistically significant increases in the BLEU score as we consider more word reorderings. The best BLEU score achieved when using the distortion model is 0.4792, compared to a best BLEU score of 0.4468 when the distortion model is not used.</Paragraph>
<Paragraph position="4"> Our results on the 2004 and 2005 NIST MT Evaluation test sets using the distortion model are 0.4497 and 0.4646, respectively.</Paragraph>
<Paragraph position="5"> Table 6 shows some Arabic-English translation examples using our decoder with and without the distortion model.</Paragraph> </Section> </Section> </Paper>