<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-1067">
  <Title>Sydney, July 2006. ©2006 Association for Computational Linguistics. Distortion Models For Statistical Machine Translation</Title>
  <Section position="8" start_page="533" end_page="534" type="concl">
    <SectionTitle>
6 Conclusion and Future Work
</SectionTitle>
    <Paragraph position="0"> We presented a new distortion model that can be integrated with existing phrase-based SMT decoders.</Paragraph>
    <Paragraph position="1"> The proposed model shows a statistically significant improvement over a state-of-the-art phrase-based SMT decoder. We also showed that n-gram language models are not sufficient to model word movement in translation. Our proposed distortion model addresses this weakness of the n-gram language model.</Paragraph>
    <Paragraph position="2"> Footnote 10: The MT05 BLEU score is from the official NIST evaluation. The MT04 BLEU score is only our second run on MT04.</Paragraph>
    <Paragraph position="3"> [...]lations. Output 1 is decoding without the distortion model and (s=4, w=8), which corresponds to a BLEU score of 0.4104. Output 2 is decoding with the distortion model and (s=3, w=8), which corresponds to a BLEU score of 0.4792. The sentences presented here are much shorter than the average in our test set. The average length of an Arabic sentence in the MT03 test set is 24.7.</Paragraph>
    <Paragraph position="4"> We also proposed a novel metric to measure word order similarity (or difference) between any pair of languages based on word alignments. Our metric shows that Chinese and English have a closer word order than Arabic and English.</Paragraph>
    <Paragraph position="5"> Our proposed distortion model relies solely on word alignments and is conditioned on the source words.</Paragraph>
    <Paragraph position="6"> Most word movement in translation is due to syntactic differences between the source and target languages. For example, Arabic is verb-initial for the most part, so when translating into English, one needs to move the verb after the subject, which is often a long compounded phrase. Therefore, we would like to incorporate syntactic or part-of-speech information into our distortion model.</Paragraph>
  </Section>
</Paper>