<?xml version="1.0" standalone="yes"?>
<Paper uid="P04-1023">
  <Title>Statistical Machine Translation with Word- and Sentence-Aligned Parallel Corpora</Title>
  <Section position="7" start_page="0" end_page="0" type="relat">
    <SectionTitle>
6 Related Work
</SectionTitle>
    <Paragraph position="0"> Och and Ney (2003) is the most extensive analysis to date of how many different factors contribute towards improved alignment error rates, but the inclusion of word-level alignments is not considered. Och and Ney also do not give any direct analysis of how improved word alignment accuracy contributes toward better translation quality, as we do here.</Paragraph>
    <Paragraph position="1"> Mihalcea and Pedersen (2003) described a shared task where the goal was to achieve the best alignment error rate (AER). A number of different methods were tried, but none of them used word-level alignments. Since the best-performing system used an unmodified version of Giza++, we would expect that our modified version would show enhanced performance. Naturally, this would need to be tested in future work.</Paragraph>
    <Paragraph position="2"> Melamed (1998) describes the process of manually creating a large set of word-level alignments of sentences in a parallel text.</Paragraph>
    <Paragraph position="3"> Nigam et al. (2000) described the use of a weight to balance the respective contributions of labeled and unlabeled data to a mixed likelihood function.</Paragraph>
    <Paragraph position="4"> Corduneanu (2002) provides a detailed discussion of the instability of maximum likelihood solutions estimated from a mixture of labeled and unlabeled data.</Paragraph>
  </Section>
</Paper>