<?xml version="1.0" standalone="yes"?>
<Paper uid="E06-1019">
<Title>A Comparison of Syntactically Motivated Word Alignment Spaces</Title>
<Section position="2" start_page="0" end_page="0" type="intro">
<SectionTitle> 1 Introduction </SectionTitle>
<Paragraph position="0"> Bilingual word alignment finds word-level correspondences between parallel sentences. The task originally emerged as an intermediate result of training the IBM translation models (Brown et al., 1993). These models use minimal linguistic intuitions; they essentially treat sentences as flat strings. They remain the dominant method for word alignment (Och and Ney, 2003). There have been several proposals to introduce syntax into word alignment. Some work within the framework of synchronous grammars (Wu, 1997; Melamed, 2003), while others create a generative story that includes a parse tree provided for one of the sentences (Yamada and Knight, 2001).</Paragraph>
<Paragraph position="1"> There are three primary reasons to add syntax to word alignment. First, one can incorporate syntactic features, such as grammar productions, into the models that guide the alignment search. Second, movement can be modeled more naturally; when a three-word noun phrase moves during translation, it can be modeled as one movement operation instead of three. Finally, one can restrict the type of movement that is considered, shrinking the number of alignments that are attempted. We investigate this last advantage of syntactic alignment. We fix an alignment scoring model that works equally well on flat strings and on parse trees, but we vary the space of alignments evaluated with that model.</Paragraph>
<Paragraph position="2"> These spaces become smaller as more linguistic guidance is added. We measure the benefits and detriments of these constrained searches.</Paragraph>
<Paragraph position="3"> Several of the spaces we investigate draw guidance from a dependency tree for one of the sentences. We will refer to the parsed language as English and the other as Foreign. Lin and Cherry (2003) have shown that adding a dependency-based cohesion constraint to an alignment search can improve alignment quality. Unfortunately, the usefulness of their beam search solution is limited: potential alignments are constructed explicitly, which prevents a perfect search of the alignment space and the use of algorithms like EM. However, the cohesion constraint is based on a tree, which should make it amenable to dynamic programming solutions. To enable such techniques, we bring the cohesion constraint inside the ITG framework (Wu, 1997).</Paragraph>
<Paragraph position="4"> Zhang and Gildea (2004) compared Yamada and Knight's (2001) tree-to-string alignment model to ITGs. They concluded that methods like ITGs, which create a tree during alignment, perform better than methods with a fixed tree established before alignment begins. However, the use of a fixed tree is not the only difference between (Yamada and Knight, 2001) and ITGs; the probability models are also very different. By using a fixed dependency tree inside an ITG, we can revisit the question of whether using a fixed tree is harmful, but in a controlled environment.</Paragraph>
</Section>
</Paper>
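As a concrete illustration of the dependency-based cohesion constraint referred to in the introduction, the following Python sketch checks whether a candidate word alignment is cohesive with respect to an English dependency tree: for each head word, the Foreign span projected by the head itself and the spans projected by each of its modifiers' subtrees must not overlap. This is a minimal reading of the constraint and not the authors' implementation; all identifiers (check_cohesion, heads, links, and the helper functions) are illustrative assumptions.

from collections import defaultdict

def dependency_children(heads):
    # heads[i] is the index of English word i's head, or -1 for the root.
    children = defaultdict(list)
    for i, h in enumerate(heads):
        if h >= 0:
            children[h].append(i)
    return children

def subtree(i, children):
    # All English positions in the subtree rooted at i (including i).
    nodes = {i}
    for c in children[i]:
        nodes |= subtree(c, children)
    return nodes

def projected_span(nodes, links):
    # Smallest contiguous Foreign span covering every link leaving `nodes`.
    # `links` is a set of (english_index, foreign_index) pairs.
    f = [j for (e, j) in links if e in nodes]
    return (min(f), max(f)) if f else None

def overlaps(a, b):
    return a is not None and b is not None and not (a[1] < b[0] or b[1] < a[0])

def check_cohesion(heads, links):
    # True iff the alignment is cohesive with respect to the dependency tree:
    # under every head, no head-modifier or modifier-modifier spans overlap.
    children = dependency_children(heads)
    for h in range(len(heads)):
        units = [{h}] + [subtree(m, children) for m in children[h]]
        spans = [projected_span(u, links) for u in units]
        for a in range(len(spans)):
            for b in range(a + 1, len(spans)):
                if overlaps(spans[a], spans[b]):
                    return False
    return True

# Toy usage with English "saw red cars": "saw" is the root, "cars" modifies
# "saw", and "red" modifies "cars", so heads = [-1, 2, 0].
heads = [-1, 2, 0]
print(check_cohesion(heads, {(0, 1), (1, 0), (2, 2)}))  # False: "saw" lands
                                                        # inside the image of
                                                        # the "red cars" subtree
print(check_cohesion(heads, {(0, 0), (1, 1), (2, 2)}))  # True: monotone alignment

Because the check only ever compares spans of a head against the spans of its modifiers' subtrees, it decomposes over the tree, which is the property that makes the constraint amenable to the dynamic programming (ITG-style) treatment the introduction motivates.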