<?xml version="1.0" standalone="yes"?> <Paper uid="N06-1013"> <Title>A Maximum Entropy Approach to Combining Word Alignments</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Word alignment--detection of corresponding words between two sentences that are translations of each other--is usually an intermediate step of statistical machine translation (MT) (Brown et al., 1993; Och and Ney, 2003; Koehn et al., 2003), but has also proven useful for other applications such as construction of bilingual lexicons, word-sense disambiguation, projection of resources, and cross-language information retrieval.</Paragraph> <Paragraph position="1"> Maximum entropy (ME) models have been used in bilingual sense disambiguation, word reordering, and sentence segmentation (Berger et al., 1996), parsing, POS tagging and PP attachment (Ratnaparkhi, 1998), machine translation (Och and Ney, 2002), and FrameNet classification (Fleischman et al., 2003). They have also been applied to the word alignment problem (Garcia-Varea et al., 2002; Ittycheriah and Roukos, 2005; Liu et al., 2005), but those approaches combine knowledge sources at the sentence level rather than at the word level.</Paragraph> <Paragraph position="2"> This paper describes an approach to combining evidence from alignments generated by existing systems to obtain an alignment that is closer to the true alignment than any of the individual alignments. The alignment-combination approach (called ACME) operates at the level of alignment links, rather than at the sentence level (as in previous ME approaches). 
ACME uses ME to decide whether to include or exclude a particular alignment link, based on feature functions extracted from the input alignments and on linguistic features of the words.</Paragraph> <Paragraph position="3"> Since alignment combination relies on evidence from existing alignments, we focus on alignment links that appear in at least one input alignment. An important challenge in this approach is selecting the appropriate links when two aligners make different alignment choices.</Paragraph> <Paragraph position="4"> We show that ACME yields a significant relative error reduction over the input alignment systems and over heuristic-based combinations on three different language pairs. Using a larger number of input alignments and partitioning the training data into disjoint subsets yield further error-rate reductions.</Paragraph> <Paragraph position="5"> The next section briefly reviews ME models.</Paragraph> <Paragraph position="6"> Section 3 presents a new ME approach to combining existing word alignment systems. Section 4 describes the evaluation data, input alignments, and evaluation metrics. Section 5 presents experiments on three language pairs, upper bounds on alignment error rate in alignment combination, and MT evaluation on English-Chinese and English-Arabic. Section 6 describes previous work on alignment combination and on ME models for word alignment.</Paragraph> </Section> </Paper>