<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-2139">
  <Title>Deriving Transfer Rules from Dominance-Preserving Alignments</Title>
  <Section position="4" start_page="843" end_page="845" type="metho">
    <SectionTitle>
3 The Dominance-Preserving
Algorithm
</SectionTitle>
    <Paragraph position="0"> Let T and T' be the source and the target trees.</Paragraph>
    <Paragraph position="1"> We use a dynamic programming algorithm to compute, in a bottom-up fashion, the scores for matching each node in T against each node in T'.</Paragraph>
    <Paragraph position="2"> There are $O(n^2)$ such scores, where $n = \max(|T|, |T'|)$ is the number of nodes in the trees. Let d(v) be the degree of a node v. We denote the children of v by $v_i$, i = 1, ..., d(v), and the arc $(v, v_i)$ by $\vec{v}_i$.</Paragraph>
    <Paragraph position="3"> For all pairs of nodes v ∈ T and v' ∈ T', the algorithm computes the score function S(v, v').</Paragraph>
    <Paragraph position="4"> S(v, v') corresponds to the best match found between the subtrees rooted at v in T and at v' in T'. The values of S are stored in a |T| × |T'| matrix, also denoted by S. Initially, we fill the matrix S with undefined values and invoke the procedure SCOREdom, described below, to compute S(root(T), root(T')), the score for matching the root nodes of the trees. During the computation of the score for the roots, the procedure recursively finds the best-scoring matches for all the nodes in the trees. This yields the best alignment of the entire trees.</Paragraph>
    <Paragraph position="5"> Table 1(a) shows the values of S for the trees in Figure 1. Whenever we compute a score for internal nodes, we also record the best way of pairing up their children in Table 1(b). The alignment implicit in these children pairings is used in a later phase (Section 4) to recover the alignment for the entire trees.</Paragraph>
    <Paragraph position="6"> Procedure SCOREdom: For a pair of nodes (v, v'), recursively compute the score S(v, v'): Construct an intermediate child-scoring matrix M = M(v, v') for the children of v and v'; the dimensions of M are (d(v) + 1) × (d(v') + 1).</Paragraph>
    <Paragraph position="7"> That is, the number of rows in M is one more than the number of children of v, and the number of columns is one more than the number of children of v'. We label row d(v) + 1 and column d(v') + 1 with a "*". Fill the matrix M: 1. For all i, j, where 1 ≤ i ≤ d(v) and 1 ≤ j ≤ d(v'), compute the corresponding entry $M_{ij} = S(v_i, v'_j) + Lex_{arc}(\vec{v}_i, \vec{v}'_j)$. The function $Lex_{node}(v, v') \geq 0$ (used below) is the quality of translation, i.e. the measure of how closely the label (word) at source node v corresponds to the label at target node v' in the bilingual dictionary, and $Lex_{arc}(\vec{v}, \vec{v}') \geq 0$ is the corresponding measure for arc labels.</Paragraph>
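    <Paragraph position="8">  2. Fill the last column as follows: for all i, where 1 ≤ i ≤ d(v), $M_{i*} = S(v_i, v') - Pen(\vec{v}_i)$,</Paragraph>
    <Paragraph position="10"> where $Pen(\vec{v}_i)$ is the penalty for collapsing the edge $\vec{v}_i$, which depends on the value of the label of that edge.</Paragraph>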
    <Paragraph position="11">  3. Symmetrically, for all j such that 1 ≤ j ≤ d(v'), fill the last row with the entries $M_{*j} = S(v, v'_j) - Pen(\vec{v}'_j)$. 4. The entry $M_{**}$ is disfavored: $M_{**} = -\infty$.  For example, during the calculation of the scores S(D, D') and S(E, D') from Table 1, the corresponding matrices M(D, D') and M(E, D') are filled in as in Table 2. The proper values for the parameter functions used above, such as the penalty function Pen and the translation measures, are chosen empirically and constitute the tunable parameters of the procedure. Normally, we expect the values of $Lex_{node}$ to be much larger than the values of $Lex_{arc}$ and Pen. In the example we used the following settings:  1. $Lex_{node}$ = 100 for an exact translation, as for (A, A'), (B, B') and (C, C'), and 0 otherwise. 2. All values of $Lex_{arc}$ are set to zero. 3. All penalties Pen are set to 1.  Now, using the values in M, compute the score for matching v and v': $$S(v, v') = Lex_{node}(v, v') + \max_{P \in \mathcal{P}(v, v')} \sum_{(i,j) \in P} M_{ij} \qquad (1)$$</Paragraph>
    <Paragraph position="13"> Here P is a legitimate pairing of v and its children against v' and its children. A legitimate pairing P is a set of elements of the matrix M</Paragraph>
    <Paragraph position="14">  that conform to the following conditions: 1. Each row and each column of M may contribute at most one element to P, except that the row and the column labeled * may contribute more than one element to P. 2. If P contains an element $M_{ij}$ corresponding to the node pair (w, w'), and some child node u appears in the Children-Pairing for (w, w'), then the row or column of u may not contribute any elements to P.</Paragraph>
    <Paragraph position="15"> We use $\mathcal{P} = \mathcal{P}(v, v')$ to denote the set of all legitimate pairings. There are O(d!) such pairings, where d is the greater of the degrees of v and v'. The summation in (1) ranges over all the pairs (i, j) that appear in a legitimate pairing $P \in \mathcal{P}(v, v')$. We evaluate this summation for all O(d!) legitimate pairings in $\mathcal{P}$, and then select the pairing $P_{best}$ with the maximum score. $P_{best}$ is then stored in the Children-Pairing matrix entry for (v, v').</Paragraph>
    <Paragraph position="16"> Table 2 shows how scores are calculated. The best score for S(E, D') is 200, the sum of the scores for (B, B') and (C, C'). S(D, D') = 299 = S(A, A') + S(E, D') - 1, where 1 is the penalty for collapsing the edge from D to E.</Paragraph>
    <Paragraph position="17"> We can reduce the computation time of the max term in (1) if we do not consider all O(d!) pairings of the children of v and v'. Instead of exhaustively computing the maximal-scoring pairing $P_{best}$ in (1), we can build it in a greedy fashion: successively choose the d highest-scoring, mutually disjoint pairs from the $O(d^2)$ possible pairs of children of v and v'.</Paragraph>
    <Paragraph position="18">  1. Initialize the set of highest-scoring pairs: $P_{best} \leftarrow \emptyset$. 2. $P_{best} \leftarrow P_{best} \cup \{(i, j)\}$, where $M_{ij}$ is the next largest entry in the matrix that satisfies both conditions 1 and 2 of legitimate pairings.  3. Repeat the above step until no more pairs can be added to $P_{best}$, at most d times,</Paragraph>
    <Paragraph position="19"> where d = min(d(v), d(v')).</Paragraph>
    <Paragraph position="20"> 4. Compute the result: $S(v, v') = Lex_{node}(v, v') + \sum_{(i,j) \in P_{best}} M_{ij}$.  The greedy algorithm aligns trees with n nodes and maximal degree d in $O(n^2 d^2)$ time.</Paragraph>
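    <Paragraph> The greedy procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The tree shapes (D with children A and E, E with children B and C, and D' with children A', B', C') are inferred from the worked scores above rather than copied from Figure 1, and the names SRC, TGT, PAIRING, lex_node, lex_arc and score are hypothetical. The sketch applies the cap of d = min(d(v), d(v')) selections literally, checks condition 2 only against the direct Children-Pairing of each candidate, and never offers the disfavored star-star entry as a candidate.

```python
# Assumed example trees, inferred from the worked scores in the text.
SRC = {"D": ["A", "E"], "E": ["B", "C"], "A": [], "B": [], "C": []}
TGT = {"D'": ["A'", "B'", "C'"], "A'": [], "B'": [], "C'": []}
PEN = 1      # every penalty is 1 in the example settings
STAR = "*"   # label of the extra row and column of M


def lex_node(v, w):
    # 100 for the exact translations (A, A'), (B, B'), (C, C'); 0 otherwise.
    return 100 if v in ("A", "B", "C") and w == v + "'" else 0


def lex_arc(c, c2):
    # All arc-label scores are zero in the example settings.
    return 0


S = {}        # memoised scores S(v, v')
PAIRING = {}  # Children-Pairing: best pairing recorded for each node pair


def score(v, w):
    """Greedy version of SCOREdom for source node v and target node w."""
    if (v, w) in S:
        return S[(v, w)]
    # Each candidate: (value, row label, column label, node pair it stands for).
    cands = []
    for c in SRC[v]:
        for c2 in TGT[w]:
            cands.append((score(c, c2) + lex_arc(c, c2), c, c2, (c, c2)))
        cands.append((score(c, w) - PEN, c, STAR, (c, w)))      # last column
    for c2 in TGT[w]:
        cands.append((score(v, c2) - PEN, STAR, c2, (v, c2)))   # last row
    cands.sort(key=lambda e: e[0], reverse=True)

    limit = min(len(SRC[v]), len(TGT[w]))   # cap taken literally from the text
    used, best, total = set(), [], 0
    for value, row, col, pair in cands:
        if len(best) == limit:
            break
        # Condition 1: the row and column of this entry. Condition 2: every
        # node appearing in the Children-Pairing of the pair it stands for.
        needed = {row, col} | {n for p in PAIRING[pair] for n in p}
        needed.discard(STAR)                # the star row/column may repeat
        if not needed.isdisjoint(used):
            continue
        used |= needed
        best.append((row, col))
        total += value

    S[(v, w)] = lex_node(v, w) + total
    PAIRING[(v, w)] = best
    return S[(v, w)]


root_score = score("D", "D'")
```

Under these assumptions the call reproduces the two scores quoted in the text, S(E, D') = 200 and S(D, D') = 299. Scores for other node pairs depend on how the selection cap is read and need not match Table 1(a).</Paragraph>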
  </Section>
  <Section position="5" start_page="845" end_page="846" type="metho">
    <SectionTitle>
4 Acquiring Transfer Rules
</SectionTitle>
    <Paragraph position="0"> This section describes the procedure for deriving transfer rules from aligned parse trees.</Paragraph>
    <Paragraph position="1"> First, the best-scoring alignment is recovered from the Children-Pairing matrix (Table 1(b)). 4 Start by including the root node-pair in the alignment (here (D, D')). Then, for each pair (v, v') already in the alignment, repeat the following steps until no more pairs can be added to the alignment: (1) look up the Children-Pairing for (v, v'); (2) for each pair in the children-pairing, if it does not include either v or v', add the pair to the alignment (e.g. (A, A'), etc.).</Paragraph>
    <Paragraph position="2"> 4 When sentences in the bitext have multiple parses, we align structure-sharing forests of trees. If one pair of trees has the highest-scoring alignment, we acquire transfer rules from that alignment. When more than one pair of trees ties for the highest score, we acquire transfer rules from the set of pairs of aligned subtrees that are shared by all of these high-scoring alignments.</Paragraph>
    <Paragraph position="3"> In the running example, the final alignment (FA) is {(D, D'), (A, A'), (B, B'), (C, C')}.</Paragraph>
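    <Paragraph> The recovery step can be sketched as a worklist traversal. The Children-Pairing table below is hypothetical: Table 1(b) is not reproduced here, so the entries are chosen to be consistent with the scores and the final alignment quoted in the text, with a collapsed edge stored as a pair that repeats one of the parent nodes, for example (E, D'). The sketch also assumes that such a pair is followed to reach its own children-pairing without being added to the alignment, which the text does not spell out. The names CHILDREN_PAIRING and recover_alignment are illustrative.

```python
from collections import deque

# Hypothetical Children-Pairing table for the running example.
CHILDREN_PAIRING = {
    ("D", "D'"): [("A", "A'"), ("E", "D'")],
    ("E", "D'"): [("B", "B'"), ("C", "C'")],
    ("A", "A'"): [],
    ("B", "B'"): [],
    ("C", "C'"): [],
}


def recover_alignment(root_pair, children_pairing):
    """Collect aligned node pairs, starting from the root pair."""
    alignment = [root_pair]
    queue = deque([root_pair])
    while queue:
        v, w = queue.popleft()
        for pair in children_pairing.get((v, w), []):
            if v in pair or w in pair:
                # Collapsed edge: one side is still v or v'. Assumption:
                # follow the pair, but do not add it to the alignment.
                queue.append(pair)
            elif pair not in alignment:
                alignment.append(pair)
                queue.append(pair)
    return alignment


final_alignment = recover_alignment(("D", "D'"), CHILDREN_PAIRING)
```

With this table the traversal returns the four pairs of the final alignment FA; the unaligned node E is visited only to reach B and C.</Paragraph>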
    <Paragraph position="4"> Based on this alignment we can "chop up" the trees into fragments, or substructures (Matsumoto et al., 1993), where each substructure of a tree is a connected group of nodes in the tree, together with their joining arcs. In Figure 1, dashed arrows connect aligned pairs of source and target substructures. These correspondences become our transfer rules.</Paragraph>
    <Paragraph position="5"> For each pair of aligned nodes (v, v') in FA, there is a pair of substructures in Figure 1 such that v and v' are the roots of the source and target substructures. These substructures include all unaligned source and target nodes $v_u$ and $v'_u$ below v and v' that have no intervening aligned nodes y or y' dominating $v_u$ or $v'_u$.</Paragraph>
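    <Paragraph> The node membership of a substructure can be computed with a short traversal, sketched below. The source tree shape is again an assumption inferred from the worked example, and the function name substructure_nodes is hypothetical. The sketch only collects the nodes that belong to a fragment; it does not build the rule notation or mark aligned children as variables.

```python
# Assumed source tree of the running example (inferred, not from Figure 1).
SRC = {"D": ["A", "E"], "E": ["B", "C"], "A": [], "B": [], "C": []}
ALIGNED_SRC = {"D", "A", "B", "C"}   # source side of the final alignment FA


def substructure_nodes(root, children, aligned):
    """Return root plus all unaligned nodes below it that are not
    separated from root by an intervening aligned node."""
    nodes = [root]
    stack = list(children[root])
    while stack:
        node = stack.pop()
        if node in aligned:
            continue          # an aligned node starts its own substructure
        nodes.append(node)
        stack.extend(children[node])
    return nodes


fragment_d = substructure_nodes("D", SRC, ALIGNED_SRC)
fragment_a = substructure_nodes("A", SRC, ALIGNED_SRC)
```

Here the fragment rooted at D absorbs the unaligned node E, while A, B and C each form a single-node fragment.</Paragraph>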
    <Paragraph position="6"> The transfer rules derived from Figure 1 may be written as follows:  [...] containing a root lexical item and a set of arc-value pairs. An arc (role) $a_1$ with head (value) h is written as $a_1$ : h, where h is a fixed label (word), a substructure, or a variable. If the source substructure has n of the leaves labeled with variables $x_1, \ldots, x_n$, the target will have n of the leaves labeled with $Tr(x_1), \ldots, Tr(x_n)$, where Tr(x) is the lexical translation function.</Paragraph>
    <Paragraph position="7"> This general structure allows us to capture relations between multi-word expressions in the source and target languages.</Paragraph>
  </Section>
  <Section position="6" start_page="846" end_page="846" type="metho">
    <SectionTitle>
5 Translation
</SectionTitle>
    <Paragraph position="0"> The procedure described above for acquiring transfer rules from corpora is the basis for our translation system. A large set of transfer rules is collected from a training corpus. When new text is to be translated, it is first parsed. The source tree is then matched against the left-hand sides of the collected transfer rules.</Paragraph>
    <Paragraph position="1"> If a set of transfer rules whose left-hand sides match the parse tree is found, the corresponding target structure is generated from the right-hand sides of these transfer rules. Typically, several sets of transfer rules meet this criterion. They are ranked by their frequency in the training corpus. Once a target tree has been produced, it is converted to a word sequence by a target language generator. We have applied this approach to the translation of Microsoft Help files in English and Spanish. The sentences are moderately simple and quite parallel in structure, which has made the corpus suitable for our initial system development. To date, we have been using a training corpus of about 1,000 sentences and a test corpus of about 100 sentences.</Paragraph>
  </Section>
</Paper>