<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-3126">
  <Title>The LDV-COMBO system for SMT</Title>
  <Section position="4" start_page="166" end_page="167" type="relat">
    <SectionTitle>
3 Experimental Work
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="166" end_page="167" type="sub_section">
      <SectionTitle>
3.1 Setting
</SectionTitle>
      <Paragraph position="0"> We used only the data sets and language model provided by the organizers. For evaluation we selected a set of eight metric variants corresponding to seven different families: BLEU (n = 4) (Papineni et al., 2001), NIST (n = 5) (Lin and Hovy, 2002), GTM F1-measure (e = 1, 2) (Melamed et al., 2003), 1-WER (Niessen et al., 2000), 1-PER (Leusch et al., 2003), ROUGE (ROUGE-S*) (Lin and Och, 2004) and METEOR (Banerjee and Lavie, 2005; see footnote 3).</Paragraph>
      <Paragraph position="1"> Optimization of the decoding parameters (λtm, λlm, λw) is performed by means of the Downhill Simplex Method in Multidimensions (Press et al., 2002) over the BLEU metric.</Paragraph>
      <Paragraph position="2"> Footnote 3: For Spanish-to-English we applied all available METEOR modules: exact + stemming + WordNet stemming + WordNet synonymy lookup. However, for English-to-Spanish we were forced to use the exact module alone.</Paragraph>
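      <Paragraph position="3"> As an illustration of the parameter search described in this section, the following is a minimal sketch of a downhill simplex (Nelder-Mead) minimizer applied to three weights. It is not the system's actual tuning code. The function names, the simplified contraction rule, and the toy objective (a smooth bowl with its minimum at the invented weights 0.6, 0.3, 0.1) are all assumptions made for illustration. In a real setting the objective would decode a development set with the candidate weights and return the negated BLEU score.

```python
def nelder_mead(f, start, step=0.5, max_iter=500, tol=1e-10):
    """Minimise f with a simplified downhill simplex search (illustrative)."""
    n = len(start)
    # Initial simplex: the start point plus one point displaced along each axis.
    simplex = [list(start)]
    for i in range(n):
        point = list(start)
        point[i] += step
        simplex.append(point)

    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        # Stop once the best and worst vertices score almost the same.
        if tol >= abs(f(worst) - f(best)):
            break
        # Centroid of every vertex except the worst one.
        centroid = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]

        def along(t):
            # Point on the line through the centroid, moving away from the
            # worst vertex for positive t and towards it for negative t.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        if f(best) > f(reflected):
            # Reflection beat the best vertex: try stretching further.
            expanded = along(2.0)
            simplex[-1] = expanded if f(reflected) > f(expanded) else reflected
        elif f(simplex[-2]) > f(reflected):
            # Reflection beats the second-worst vertex: accept it.
            simplex[-1] = reflected
        else:
            # Otherwise contract towards the centroid, or shrink the simplex.
            contracted = along(-0.5)
            if f(worst) > f(contracted):
                simplex[-1] = contracted
            else:
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, p)]
                    for p in simplex[1:]
                ]
    return min(simplex, key=f)


def toy_objective(weights):
    """Hypothetical stand-in for negated BLEU; lowest at (0.6, 0.3, 0.1)."""
    target = (0.6, 0.3, 0.1)
    return sum((w - t) ** 2 for w, t in zip(weights, target))


# Search over (lambda_tm, lambda_lm, lambda_w), starting from all zeros.
best_weights = nelder_mead(toy_objective, [0.0, 0.0, 0.0])
print([round(w, 3) for w in best_weights])
```

The search needs no gradients, only repeated evaluations of the objective. That suits a metric such as BLEU, which is not differentiable with respect to the decoder weights.</Paragraph>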
    </Section>
    <Section position="2" start_page="167" end_page="167" type="sub_section">
      <SectionTitle>
3.2 Results
</SectionTitle>
      <Paragraph position="0"> Table 1 presents MT results for the test set for both the Spanish-to-English and English-to-Spanish tasks. The variant of the LDV-COMBO system described in Section 2 is compared to a baseline variant based only on lexical items. In the case of Spanish-to-English, performance varies from metric to metric, so which metric should be trusted remains an open issue. In any case, the differences are minor. However, in the case of English-to-Spanish, all metrics except 1-WER agree that the LDV-COMBO system significantly outperforms the baseline. We suspect this may be due to the richer morphology of Spanish. To test this hypothesis, we performed an error analysis at the sentence level based on the GTM F-measure. We found many cases where the LDV-COMBO system outperforms the baseline system by choosing a more accurate translation. For instance, Table 2 shows a fragment of sentence 2176 from the test set. A better translation for &quot;consider&quot; is provided, &quot;pensemos&quot;, which corresponds to the right verb and verbal form (instead of &quot;estiman&quot;). By inspecting the translation models, we confirmed that the probabilities are better adjusted.</Paragraph>
      <Paragraph position="1"> Interestingly, LDV-COMBO translation models are between 30% and 40% smaller than the models based on lexical items alone. The reason is that we are working with the union of alignments from different data views, thus adding more constraints into the phrase extraction step. Fewer phrase pairs are extracted, and as a consequence we are also effectively eliminating noise from translation models.</Paragraph>
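      <Paragraph position="2"> The effect of the alignment union on phrase extraction can be illustrated with a small sketch. It assumes the standard consistency criterion for phrase extraction (a phrase pair is kept only if it contains at least one alignment link and no link crosses its boundary). This section does not spell out the extraction procedure, so that criterion is an assumption here. The function name and the two three-word toy alignments are invented for illustration.

```python
def extract_phrase_pairs(links, src_len, tgt_len):
    """Return every (source span, target span) consistent with the links."""
    pairs = set()
    for s1 in range(src_len):
        for s2 in range(s1, src_len):
            for t1 in range(tgt_len):
                for t2 in range(t1, tgt_len):
                    inside = 0
                    consistent = True
                    for s, t in links:
                        src_in = s2 >= s >= s1
                        tgt_in = t2 >= t >= t1
                        if src_in and tgt_in:
                            inside += 1
                        elif src_in or tgt_in:
                            # A link crosses the phrase boundary.
                            consistent = False
                            break
                    if consistent and inside > 0:
                        pairs.add(((s1, s2), (t1, t2)))
    return pairs


# Hypothetical word alignments of one sentence pair from two data views,
# given as (source index, target index) links.
view_a = {(0, 0), (1, 1), (2, 2)}
view_b = {(0, 0), (1, 2), (2, 1)}
union = view_a | view_b

pairs_a = extract_phrase_pairs(view_a, 3, 3)
pairs_b = extract_phrase_pairs(view_b, 3, 3)
pairs_union = extract_phrase_pairs(union, 3, 3)
print(len(pairs_a), len(pairs_b), len(pairs_union))
```

On this toy input the union contains more links than either view, so more candidate spans are crossed by a link and rejected. The union therefore yields fewer phrase pairs than either view alone, which mirrors the reduction in model size reported above.</Paragraph>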
    </Section>
  </Section>
</Paper>