
<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-1122">
  <Title>Modelling lexical redundancy for machine translation</Title>
  <Section position="7" start_page="973" end_page="974" type="evalu">
    <SectionTitle>
5 Results
</SectionTitle>
    <Paragraph position="0"> Table 2 shows the changes in BLEU when we incorporate the lexicon mappings during the word-alignment process. The standard SMT lexicon model is not optimal, as measured by BLEU, for any of the languages or training set sizes considered. Increases over this baseline, however, diminish with more training data. For both Czech and Welsh, the explicit model selection procedure that we have proposed results in better translations than all of the baseline models when the MRF prior is used; again these increases diminish with larger training sets. We note that the stemming baseline models appear to be more effective for Czech than for Welsh. The impact of the MRF prior is also greater for smaller training sets.</Paragraph>
    <Paragraph position="1"> Table 3 shows the results of using these models to smooth the phrase translation table.5 With the exception of Czech, the improvements are smaller than for Exp 1. For all source languages and models we found that it was optimal to leave the target lexicon unmapped when smoothing the phrase translation model.</Paragraph>
    <Paragraph position="2"> Using lemmatize for word-alignment on the Czech corpus gave BLEU scores of 32.71 and 37.21 for the 10K and 21K training sets respectively; usedtosmooththephrasetranslationmodel it gave scores of 33.96 and 37.18.</Paragraph>
    <Section position="1" start_page="973" end_page="974" type="sub_section">
      <SectionTitle>
5.1 Discussion
</SectionTitle>
      <Paragraph position="0"> Model selection had the largest impact for smaller data sets suggesting that the complexity of the standard model is most excessive in sparse data conditions. The larger improvements seen for Czech and Welsh suggest that these languages encode more redundant information in the lexicon with respect to English. Potential sources could be grammatical case markings (Czech) and mutation patterns (Welsh). The impact of the MRF prior for smaller data sets suggests it overcomes sparsity in the bilingual statistics during model selection.</Paragraph>
      <Paragraph position="1"> The location of redundancies, in the form of case markings, at the ends of words in Czech as assumed by the stemming algorithms may explain why these performed better on this language than  Src ehangu o ffilm i deledu.</Paragraph>
      <Paragraph position="2"> Ref an expansion from film into television.</Paragraph>
      <Paragraph position="3"> standard expansion of footage to deledu.</Paragraph>
      <Paragraph position="4"> max-pref expansion of ffilm to television.</Paragraph>
      <Paragraph position="5"> src+mrf expansion of film to television.</Paragraph>
      <Paragraph position="6"> Src yw gwarchod cymru fel gwlad brydferth Ref safeguarding wales as a picturesque country standard protection of wales as a country brydferth max-pref protection of wales as a country brydferth src+mrf protecting wales as a beautiful country Src cynhyrchu canlyniadau llai na pherffaith Ref produces results that are less than perfect standard produce results less than pherffaith max-pref produce results less than pherffaith src+mrf generates less than perfect results Src y dynodiad o graidd y broblem Ref the identification of the nub of the problem standard the dynodiad of the heart of the problem max-pref the dynodiad of the heart of the problem src+mrf the identified crux of the problem on Welsh. The highest scoring features in the MRF (see Table 5) show that Welsh redundancies, on the other hand, are primarily between initial characters. Inspection of system output confirms that OOV types could be mapped to known Welsh words with the MRF prior but not via stemming (see Table 4). For each language pair the MRF learned features that capture intuitively redundant patterns: adjectivalendingsforFrench, casemarkings for Czech, and mutation patterns for Welsh. The greater improvements in Exp. 1 were mirrored by higher compression rates for these lexicons (see Table. 6) supporting the conjecture thatword-alignmentrequireslessinformationthan full-blown translation. The results of the lemma- null  tizemodelonCzechshowthemodelselectionprocedure improving on a simple supervised baseline.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>