<?xml version="1.0" standalone="yes"?> <Paper uid="P04-1023"> <Title>Statistical Machine Translation with Word- and Sentence-Aligned Parallel Corpora</Title> <Section position="7" start_page="0" end_page="0" type="relat"> <SectionTitle> 6 Related Work </SectionTitle> <Paragraph position="0"> Och and Ney (2003) is the most extensive analysis to date of how many different factors contribute towards improved alignment error rates, but the inclusion of word alignments is not considered. Och and Ney do not give any direct analysis of how improved word alignment accuracy contributes toward better translation quality, as we do here.</Paragraph> <Paragraph position="1"> Mihalcea and Pedersen (2003) described a shared task where the goal was to achieve the best AER. A number of different methods were tried, but none of them used word-level alignments. Since the best performing system used an unmodified version of Giza++, we would expect that our modified version would show enhanced performance. Naturally, this would need to be tested in future work.</Paragraph> <Paragraph position="2"> Melamed (1998) describes the process of manually creating a large set of word-level alignments of sentences in a parallel text.</Paragraph> <Paragraph position="3"> Nigam et al. (2000) described the use of a weight to balance the respective contributions of labeled and unlabeled data to a mixed likelihood function.</Paragraph> <Paragraph position="4"> Corduneanu (2002) provides a detailed discussion of the instability of maximum likelihood solutions estimated from a mixture of labeled and unlabeled data.</Paragraph> </Section> </Paper>