<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-1039">
  <Title>Phrasal Cohesion and Statistical Machine Translation</Title>
  <Section position="8" start_page="7" end_page="7" type="concl">
    <SectionTitle>
7 Conclusions
</SectionTitle>
    <Paragraph position="0"> We have examined the issue of phrasal cohesion between English and French and discovered that, while there is less cohesion than we might desire, there is still a large amount of regularity in the constructions where breakdowns occur. This reassures us that re-ordering words by phrasal movement is a reasonable strategy. Many of the crossings, whose number was initially daunting, were due to non-linguistic causes, such as rewording during translation or errors in syntactic analysis. Among the rest, a small number of syntactic constructions account for the majority of the crossings examined in our analysis. One practical result of this skewed distribution is that one could hope to discover the major problem areas for a new language pair by manually aligning a small number of sentences. This information could be used to filter a training corpus, removing sentences that would cause problems in training the translation model, or to identify areas to focus on when working to improve the model itself. We are interested in examining different language pairs as the opportunity arises.</Paragraph>
    <Paragraph position="1"> We have also examined the differences in cohesion between Treebank-style parse trees, trees with flattened verb phrases, and dependency structures.</Paragraph>
    <Paragraph position="2"> Our results indicate that the highest degree of cohesion is present in dependency structures. Therefore, in an SMT system that uses some type of phrasal movement during reordering, dependency structures should produce better results than raw parse trees. In the future, we plan to explore this hypothesis in an actual translation system.</Paragraph>
  </Section>
</Paper>