<?xml version="1.0" standalone="yes"?> <Paper uid="W05-0822"> <Title>PORTAGE: A Phrase-based Machine Translation System</Title> <Section position="5" start_page="129" end_page="131" type="metho"> <SectionTitle> 3 Experiments on the Shared Task </SectionTitle> <Paragraph position="0"> We conducted experiments and evaluations on Portage using the different language pairs of the shared task. The training data was provided for the task as follows: training data of 688,031 sentences in French and English. A similarly sized corpus is provided for Finnish, Spanish and German with matched English translations.</Paragraph> <Paragraph position="1"> - Development test data of 2,000 sentences in the four languages.</Paragraph> <Paragraph position="2"> In addition to the provided data, a set of 6,056,014 sentences extracted from the Hansard corpus, the official record of Canada's parliamentary debates, was used in both French and English languages. This corpus was used to generate both language and translation models for use in decoding and rescoring.</Paragraph> <Paragraph position="3"> The development test data was split into two parts: the first part, which includes 1,000 sentences in each language with reference translations into English, served in the optimization of weights for both the decoding and rescoring models. In this study, the size of the n-best lists was set to 1,000. 
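The decode-then-rescore pipeline described above can be sketched in a few lines: the decoder emits an n-best list (1,000 hypotheses in our setting), each hypothesis carries a vector of log-domain feature scores, and a set of weights tuned on the development set selects the highest-scoring hypothesis. This is a minimal illustration only; the feature set, function names, and numbers below are hypothetical, not PORTAGE's actual implementation.

```python
# Minimal sketch of rescoring a decoder's n-best list with a
# log-linear model. Feature names and values are illustrative,
# not PORTAGE's actual feature set.

def rescore_nbest(nbest, weights):
    """Return the hypothesis whose weighted feature sum is highest.

    nbest   : list of (translation, features) pairs, where `features`
              holds log-domain model scores (e.g. LM, TM).
    weights : one weight per feature, tuned on held-out data.
    """
    return max(nbest, key=lambda hyp: sum(w * f for w, f in zip(weights, hyp[1])))

# Toy 3-best list with (language model, translation model) log scores.
nbest = [
    ("the house is blue", (-4.1, -2.3)),
    ("the home is blue",  (-3.9, -2.9)),
    ("house the blue is", (-7.5, -2.0)),
]
best, _ = rescore_nbest(nbest, weights=(1.0, 1.0))
```

Shifting weight between the features changes which hypothesis wins, which is exactly what tuning the weights on held-out development data exploits.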
The second part, which includes 1,000 sentences in each language with reference translations into English, was used in the evaluation of the performance of the translation models.</Paragraph> <Section position="1" start_page="130" end_page="130" type="sub_section"> <SectionTitle> 3.1 Experiments on the French-English Task </SectionTitle> <Paragraph position="0"> Our goal for this language pair was to conduct experiments on Portage for a comparative study exploiting and combining different resources and techniques: 1. Method E is based on the Europarl corpus as training data, 2. Method E-H is based on both Europarl and Hansard corpora as training data, 3. Method E-p is based on the Europarl corpus as training data and parsing numbers and dates in the preprocessing phase, 4. Method E-H-p is based on both Europarl and Hansard corpora as training data and parsing numbers and dates in the preprocessing phase.</Paragraph> <Paragraph position="1"> Results are shown in Table 1 for the French-English task. The first column of Table 1 indicates the method, the second column gives results for decoding with Canoe only, and the third column for decoding and rescoring with Canoe. Comparing the four methods, there was an improvement in terms of BLEU scores when using the two language models and two translation models generated from the Europarl and Hansard corpora; however, parsing numbers and dates had a negative impact on the translation models. The best BLEU score for our modest participation at the French-English task was 29.53.</Paragraph> <Paragraph position="2"> Table 1. BLEU scores for the French-English test sentences. A noteworthy feature of these results is that the improvement given by the out-of-domain Hansard corpus was very slight. Although we suspect that somewhat better performance could have been achieved by better weight optimization, this result clearly underscores the importance of matching training and test domains. 
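All comparisons in this section are in terms of BLEU, which combines clipped n-gram precisions with a brevity penalty. The sketch below computes a single-sentence score purely for illustration; an actual evaluation such as the WPT05 one aggregates n-gram counts over the whole test set and may differ in smoothing details.

```python
# Illustrative sentence-level BLEU: clipped n-gram precisions
# (n = 1..4) combined by geometric mean, times a brevity penalty.
import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU sketch; real evaluations aggregate
    counts over the entire test set before combining them."""
    cand, ref = candidate.split(), reference.split()
    log_prec_sum = 0.0
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(cand[i:i+n]) for i in range(len(cand) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i+n]) for i in range(len(ref) - n + 1))
        clipped = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
        total = max(1, sum(cand_ngrams.values()))
        if clipped == 0:
            return 0.0  # any zero n-gram precision gives BLEU = 0 here
        log_prec_sum += math.log(clipped / total)
    bp = min(1.0, math.exp(1 - len(ref) / len(cand)))  # brevity penalty
    return bp * math.exp(log_prec_sum / max_n)
```

A perfect match scores 1.0, and candidates shorter than the reference are penalized by the brevity penalty rather than rewarded for high precision.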
A related point is that our number and date translation rules actually caused a performance drop, because they were optimized for the typographical conventions prevalent in Hansard, which are quite different from those used in Europarl.</Paragraph> <Paragraph position="3"> Our best result ranked third in the shared WPT05 French-English task, with a difference of 0.74 in terms of BLEU score from the first-ranked participant, and a difference of 0.67 from the second-ranked participant.</Paragraph> </Section> <Section position="2" start_page="130" end_page="131" type="sub_section"> <SectionTitle> 3.2 Experiments on other Pairs of Languages </SectionTitle> <Paragraph position="0"> The WPT05 workshop provides a good opportunity to achieve our benchmarking goals with corpora that present challenging difficulties. German and Finnish are languages that make considerable use of compounding. Finnish, in addition, has a particularly complex morphology that is organized on principles quite different from any in English. This results in much longer word forms, each of which occurs very infrequently.</Paragraph> <Paragraph position="1"> Our original intent was to propose a number of possible statistical approaches to analyzing and splitting these word forms and improving our results.</Paragraph> <Paragraph position="2"> Since none of these yielded results as good as the baseline, we will continue this work until we understand what is really needed. We also care very much about translating between French and English in Canada and plan to spend extra effort on the difficulties that occur in this case. Translation between Spanish and English is not only becoming more important as a result of increased trade within North America, but also functions as a good counterpoint for French-English. 
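The statistical word-form splitting mentioned above can take many forms; one common frequency-based scheme (in the spirit of Koehn and Knight's compound-splitting work, not necessarily what we tried) scores each candidate split by the geometric mean of its parts' corpus frequencies. The function below and its toy frequency table are illustrative assumptions only.

```python
# Sketch of frequency-based compound splitting, in the spirit of
# Koehn & Knight (2003). NOT the authors' method; the function name
# and the toy frequency table are hypothetical.

def split_compound(word, freq, min_part=3):
    """Return the best split of `word` into two known parts, scored by
    the geometric mean of the parts' corpus frequencies; fall back to
    the unsplit word. Real splitters recurse for three or more parts
    and handle filler letters such as the German linking "s"."""
    best_parts, best_score = [word], freq.get(word, 0)
    for i in range(min_part, len(word) - min_part + 1):
        left, right = word[:i], word[i:]
        if left in freq and right in freq:
            score = (freq[left] * freq[right]) ** 0.5  # geometric mean
            if score > best_score:
                best_parts, best_score = [left, right], score
    return best_parts

# Toy German-like frequency table (hypothetical counts).
freq = {"aktion": 800, "plan": 1200, "aktionsplan": 10, "aktions": 5}
```

Because the rare compound itself scores lower than the geometric mean of its frequent parts, the splitter prefers the decomposition, which is exactly the behavior wanted for sparse Finnish and German word forms.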
</Paragraph> <Paragraph position="3"> Table 2. BLEU scores for the Finnish-English, German-English and Spanish-English test sentences. To establish our baseline, the only preprocessing we did was lowercasing (using the provided tokenization). Canoe was run without any special settings, although weights for distortion, word penalty, language model, and translation model were optimized using a grid search, as described above. Rescoring was also done, and usually resulted in at least an extra BLEU point.</Paragraph> <Paragraph position="4"> Our final results are shown in Table 2. Our ranks at the shared WPT05 Finnish-English, German-English and Spanish-English tasks were second, third and fourth respectively, with differences of 1.06 and 1.87 in terms of BLEU score from the first-ranked participants.</Paragraph> </Section> </Section> </Paper>