<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-1098">
<Title>Left-to-Right Target Generation for Hierarchical Phrase-based Translation</Title>
<Section position="3" start_page="777" end_page="779" type="intro">
<SectionTitle> 2 Translation Model </SectionTitle>
<Paragraph position="0"> A weighted synchronous-CFG is a rewrite system consisting of production rules whose right-hand side is paired (Aho and Ullman, 1969): $X \rightarrow \langle \gamma, \alpha, \sim \rangle$</Paragraph>
<Paragraph position="2"> where $X$ is a non-terminal, and $\gamma$ and $\alpha$ are strings of terminals and non-terminals. For notational simplicity, we assume that $\gamma$ and $\alpha$ correspond to the foreign language side and the English side, respectively. $\sim$ is a one-to-one correspondence between the non-terminals appearing in $\gamma$ and $\alpha$. Starting from an initial non-terminal, each rule rewrites the non-terminals in $\gamma$ and $\alpha$ that are associated by $\sim$.</Paragraph>
<Paragraph position="3"> Chiang (2005) proposed a hierarchical phrase-based translation model, a binary synchronous-CFG, which restricts the form of production rules as follows: * Only two types of non-terminals are allowed: S and X.</Paragraph>
<Paragraph position="4"> * Both of the strings $\gamma$ and $\alpha$ must contain at least one terminal item.</Paragraph>
<Paragraph position="5"> * Rules may have at most two non-terminals, but non-terminals cannot be adjacent on the foreign language side $\gamma$.</Paragraph>
<Paragraph position="6"> The production rules are induced from a bilingual corpus with the help of word alignments. To alleviate the data sparseness problem, glue rules are added that prefer combining hierarchical phrases in a serial manner: $S \rightarrow \langle S_{\boxed{1}} X_{\boxed{2}}, S_{\boxed{1}} X_{\boxed{2}} \rangle$ and $S \rightarrow \langle X_{\boxed{1}}, X_{\boxed{1}} \rangle$, where boxed indices indicate the non-terminal linkages represented in $\sim$.</Paragraph>
<Paragraph position="7"> Our model is based on the framework of Chiang (2005), but further restricts the form of production rules so that the aligned right-hand side $\alpha$ follows a GNF-like structure: $X \rightarrow \langle \gamma, \bar{b}\beta, \sim \rangle$</Paragraph>
<Paragraph position="9"> where $\bar{b}$ is a string of terminals, or a phrase, and $\beta$ is a (possibly empty) string of non-terminals. The foreign-language right-hand side $\gamma$ may still be an arbitrary string of terminals and non-terminals. The use of a phrase $\bar{b}$ as a prefix keeps the strength of the phrase-based framework. A contiguous English side coupled with a (possibly) discontiguous foreign language side preserves phrase-bounded local word reordering.</Paragraph>
<Paragraph position="10"> At the same time, the target-normalized framework still combines phrases hierarchically in a restricted manner.</Paragraph>
<Paragraph position="11"> The target-normalized form can be regarded as a type of rule in which certain non-terminals are always instantiated with phrase translation pairs.</Paragraph>
<Paragraph position="12"> Thus, we will be able to reduce the number of rules induced from a bilingual corpus, which in turn helps reduce the decoding complexity.</Paragraph>
<Paragraph position="13"> The contiguous phrase-prefixed form generates English in left-to-right order. Therefore, a decoder can easily hypothesize a derivation tree integrated with an n-gram language model, even one of higher order. Note that we do not imply that arbitrary synchronous-CFGs can be transformed into the target-normalized form.
The form simply restricts the grammar extracted from a bilingual corpus, as explained in the next section.</Paragraph>
<Section position="1" start_page="778" end_page="778" type="sub_section">
<SectionTitle> 2.1 Rule Extraction </SectionTitle>
<Paragraph position="0"> We present an algorithm to extract production rules from a bilingual corpus. The procedure is based on that of the hierarchical phrase-based translation model (Chiang, 2005).</Paragraph>
<Paragraph position="1"> First, a bilingual corpus is annotated with word alignments using the method of Koehn et al.</Paragraph>
<Paragraph position="2"> (2003). Many-to-many word alignments are induced by running a one-to-many word alignment model, such as GIZA++ (Och and Ney, 2003), in both directions and by combining the results based on a heuristic (Koehn et al., 2003).</Paragraph>
<Paragraph position="3"> Second, phrase translation pairs are extracted from the word-aligned corpus (Koehn et al., 2003). The method exhaustively extracts phrase pairs $(f_{j}^{j+m}, e_{i}^{i+n})$ from a sentence pair $(f_{1}^{J}, e_{1}^{I})$ that do not violate the word alignment constraints $a$: $\exists (i', j') \in a : j' \in [j, j+m],\ i' \in [i, i+n]$; $\nexists (i', j') \in a : j' \in [j, j+m],\ i' \notin [i, i+n]$; $\nexists (i', j') \in a : j' \notin [j, j+m],\ i' \in [i, i+n]$. Third, based on the extracted phrases, production rules are accumulated by computing the &quot;holes&quot; for contiguous phrases (Chiang, 2005): an English word.</Paragraph>
<Paragraph position="4"> * Adjacent non-terminals are not allowed on the foreign language side.</Paragraph>
</Section>
<Section position="2" start_page="778" end_page="779" type="sub_section">
<SectionTitle> 2.2 Phrase-based Rules </SectionTitle>
<Paragraph position="0"> The rule extraction procedure described in Section 2.1 is corpus-based and will therefore easily suffer from a data sparseness problem. The hierarchical phrase-based model avoided this problem by introducing the glue rules 5 and 6, which combine hierarchical phrases sequentially (Chiang, 2005).</Paragraph>
<Paragraph position="1"> We use a different method of generalizing production rules. When production rules without non-terminals are extracted in step 1 of Section 2.1,</Paragraph>
</Section>
</Section>
</Paper>