Tree-to-String Alignment Template for Statistical Machine Translation

1 Introduction

Phrase-based translation models (Marcu and Wong, 2002; Koehn et al., 2003; Och and Ney, 2004) go beyond the original IBM translation models (Brown et al., 1993)¹ by modeling translations of phrases rather than individual words, and empirical evaluations suggest that they represent the state of the art in statistical machine translation.

¹ We use the notation from that paper: a source string $f_1^J = f_1, \ldots, f_j, \ldots, f_J$ is to be translated into a target string $e_1^I = e_1, \ldots, e_i, \ldots, e_I$. Here, $I$ is the length of the target string and $J$ is the length of the source string.

In phrase-based models, phrases are usually strings of adjacent words rather than syntactic constituents; such models excel at capturing local reordering and at translating substrings that occur often enough to be observed in the training data. However, a key limitation of phrase-based models is that they fail to model reordering at the phrase level robustly. Typically, phrase reordering is modeled in terms of offset positions at the word level (Koehn, 2004; Och and Ney, 2004), making little or no direct use of syntactic information.
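As a concrete illustration of this purely positional treatment of reordering, here is a minimal sketch in the style of the distance-based distortion cost of Koehn (2004); the function name, the value of the decay parameter alpha, and the example are our own illustrative assumptions, not details from the paper.

# Minimal sketch of distance-based distortion scoring as used in
# phrase-based decoders (in the style of Koehn, 2004). The function
# name and the alpha value are illustrative assumptions.

def distortion_cost(prev_phrase_end: int, next_phrase_start: int,
                    alpha: float = 0.5) -> float:
    """Penalize reordering purely by word-position offset:
    cost = alpha ** |start - end - 1|. Monotone translation
    (start == end + 1) incurs no penalty; syntax plays no role."""
    jump = abs(next_phrase_start - prev_phrase_end - 1)
    return alpha ** jump

# Example: after covering source words [0, 2], jumping to a phrase
# starting at word 6 skips three positions: |6 - 2 - 1| = 3.
print(distortion_cost(prev_phrase_end=2, next_phrase_start=6))  # 0.125

Because the cost depends only on word positions, reorderings of very different syntactic plausibility can receive identical penalties; this is the gap that the syntax-based models surveyed next aim to close.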
Recent research on statistical machine translation has led to the development of syntax-based models. Wu (1997) proposes Inversion Transduction Grammars, treating translation as a process of parallel parsing of the source and target language via a synchronized grammar. Alshawi et al. (2000) represent each production in a parallel dependency tree as a finite transducer. Melamed (2004) formalizes the machine translation problem as synchronous parsing based on multitext grammars. Graehl and Knight (2004) describe training and decoding algorithms for both generalized tree-to-tree and tree-to-string transducers. Chiang (2005) presents a hierarchical phrase-based model that uses hierarchical phrase pairs, which are formally productions of a synchronous context-free grammar. Ding and Palmer (2005) propose a syntax-based translation model based on a probabilistic synchronous dependency insertion grammar, a version of synchronous grammar defined on dependency trees. All these approaches, though different in formalism, make use of synchronous grammars or tree-based transduction rules to model both the source and target languages.

Another class of approaches makes use of syntactic information in the target language alone, treating translation as a parsing problem. Yamada and Knight (2001) use a parser in the target language to train probabilities on a set of operations that transform a target parse tree into a source string.

Paying more attention to source language analysis, Quirk et al. (2005) employ a source language dependency parser, a target language word segmentation component, and an unsupervised word alignment component to learn treelet translations from a parallel corpus.

In this paper, we propose a statistical translation model based on tree-to-string alignment templates (TATs), which describe the alignment between a source parse tree and a target string. A TAT is capable of generating both terminals and non-terminals and of performing reordering at both low and high levels. The model is linguistically syntax-based because TATs are extracted automatically from word-aligned, source-side parsed parallel texts.

To translate a source sentence, we first employ a parser to produce a source parse tree and then apply TATs to transform the tree into a target string. One advantage of our model is that TATs can be acquired automatically to capture linguistically motivated reordering at both low (word) and high (phrase, clause) levels. In addition, training the TAT-based model is computationally less expensive than training tree-to-tree models. Similar to Galley et al. (2004), the tree-to-string alignment templates discussed in this paper are in fact transformation rules. The major difference is that we model the syntax of the source language rather than that of the target language. As a result, the task of our decoder is to find the best target string, while Galley's is to seek the most likely target tree.
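To make the tree-to-string transformation just described concrete, below is a minimal, self-contained sketch. The tree encoding, rule format, lexicon, and example rule are all our own illustrative assumptions, not the paper's actual data structures or extracted templates.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Tree:
    label: str                       # category (internal node) or word (leaf)
    children: Optional[list] = None  # None marks a terminal leaf

# Toy word translations for leaves not rewritten by any TAT (assumed).
LEXICON = {"Zhongguo": "China", "fawu": "law"}

# Each TAT maps (root label, tuple of child labels) to a target template:
# integers are child indices (translated recursively), strings are literal
# target words. Index order expresses reordering; literals insert terminals.
TATS = {
    # Chinese "NP de NP" -> English "NP of NP", with the two NPs swapped.
    ("NP", ("NP", "de", "NP")): [2, "of", 0],
}

def apply_tats(node: Tree) -> list:
    """Transform a source parse tree into a target word list, top-down."""
    if node.children is None:                        # terminal leaf
        return [LEXICON.get(node.label, node.label)]
    key = (node.label, tuple(c.label for c in node.children))
    # Fall back to monotone (in-order) translation when no TAT matches.
    template = TATS.get(key, list(range(len(node.children))))
    out = []
    for item in template:
        out += apply_tats(node.children[item]) if isinstance(item, int) else [item]
    return out

# "Zhongguo de fawu" (China DE law): the TAT reorders the two NPs and
# generates the terminal "of", yielding "law of China".
tree = Tree("NP", [Tree("NP", [Tree("Zhongguo")]),
                   Tree("de"),
                   Tree("NP", [Tree("fawu")])])
print(" ".join(apply_tats(tree)))  # law of China

Note how a single template generates a target terminal ("of") while reordering entire subtrees, the two capabilities of TATs highlighted above.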