<?xml version="1.0" standalone="yes"?> <Paper uid="C04-1032"> <Title>Symmetric Word Alignments for Statistical Machine Translation</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Word-aligned bilingual corpora provide important knowledge for many natural language processing tasks, such as the extraction of bilingual word or phrase lexica (Melamed, 2000; Och and Ney, 2000). How well these tasks can be solved depends heavily on the quality of the word alignment (Och and Ney, 2000).</Paragraph> <Paragraph position="1"> Word alignment models were first introduced in statistical machine translation (Brown et al., 1993). An alignment describes a mapping from source sentence words to target sentence words.</Paragraph> <Paragraph position="2"> Using the IBM translation models IBM-1 to IBM-5 (Brown et al., 1993), as well as the Hidden-Markov alignment model (Vogel et al., 1996), we can produce alignments of good quality. However, all these models constrain the alignments so that a source word can be aligned to at most one target word. This constraint reduces the computational complexity of model training, but it makes it hard to align a phrase in the target language (English) such as 'the day after tomorrow' to a single word in the source language (German) such as 'übermorgen'. We will present a word alignment algorithm that avoids this constraint and produces symmetric word alignments.
This algorithm treats the alignment problem as the task of finding a minimum-cost edge cover in a bipartite graph.</Paragraph> <Paragraph position="3"> The parameters of the IBM models and the HMM, in particular the state occupation probabilities, will be used to determine the cost of aligning a specific source word to a specific target word.</Paragraph> <Paragraph position="4"> We will evaluate the proposed alignment methods on the German-English Verbmobil task and the French-English Canadian Hansards task. We will show statistically significant improvements over the state-of-the-art results of Och and Ney (2003).</Paragraph> </Section> </Paper>