File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/99/p99-1012_metho.xml
Size: 15,843 bytes
Last Modified: 2025-10-06 14:15:27
<?xml version="1.0" standalone="yes"?> <Paper uid="P99-1012"> <Title>Preserving Semantic Dependencies in Synchronous Tree Adjoining Grammar*</Title> <Section position="5" start_page="89" end_page="91" type="metho"> <SectionTitle> 3 Obtaining Source Dependencies </SectionTitle> <Paragraph position="0"> If we assume that this attachment structure captures a sentence's semantic dependencies, then in order to preserve semantic dependencies in synchronous TAG translation, we will need to obtain this structure from a source derivation and then construct a target derivation with an isomorphic structure.</Paragraph> <Paragraph position="1"> The first algorithm we present obtains semantic dependencies for derivations by keeping track of an additional field in each chart item during parsing, corresponding to the predicate variable from Section 2. Other than the additional field, the algorithm remains essentially the same as the parsing algorithm described in (Schabes and Shieber, 1994), so it can be applied as a transducer during recognition, or as a post-process on a derivation forest (Vijay-Shanker and Weir, 1993). Once the desired dependencies are obtained, the forest may be filtered to select a single most-preferred tree using statistics or rule-based selectional restrictions on those dependencies. 4 For calculating dependencies, we define a function arg(γ, η) to return the argument position associated with a substitution site or foot node η in elementary tree γ. 
Let a dependency be defined as a labeled arc ⟨φ, l, ψ⟩, from predicate φ to predicate ψ with label l.</Paragraph> <Paragraph position="2"> * For each tree selected by φ, set the predicate variable of each anchor item to φ.</Paragraph> <Paragraph position="3"> * For each substitution of initial tree α_ψ with predicate variable ω into γ_φ at node address η, emit ⟨φ, arg(γ, η), ω⟩.</Paragraph> <Paragraph position="4"> * For each modifier adjunction of auxiliary tree β_φ into tree γ_ψ with predicate variable χ, emit ⟨φ, arg(β, FOOT), χ⟩ and set the predicate variable of the composed item to χ.</Paragraph> <Paragraph position="5"> * For each predicative adjunction of auxiliary tree β_φ with predicate variable ω into tree γ_ψ with predicate variable χ, emit ⟨φ, arg(β, FOOT), χ⟩ and set the predicate variable of the composed item to ω.</Paragraph> <Paragraph position="5"> * For all other productions, propagate the predicate variable up along the path from the main anchor to the root.</Paragraph> <Paragraph position="6"> Since the number of possible values for the additional predicate variable field is bounded by n, where n is the number of lexical items in the input sentence, and none of the productions combine more than one predicate variable, the complexity of the dependency transducing algorithm is O(n^7).</Paragraph> <Paragraph position="7"> This algorithm can be applied to the example derivation tree in Section 1,</Paragraph> <Paragraph position="9"> which resembles the stacked derivation tree for Candito and Kahane's example 5a, &quot;Paul claims Mary said Peter left.&quot; First, we adjoin β2:is-supposed-to at node VP of β1:be-able-to, which produces the dependency ⟨is-supposed-to, 0, be-able-to⟩. Then we adjoin β1:be-able-to at node VP of α:fly, which produces the dependency ⟨be-able-to, 0, fly⟩. 
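The emission rules above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the chart parser itself: the names Item, anchor, substitute and predicative_adjoin are hypothetical, a chart item is reduced to a predicate and a predicate variable, the foot node is assumed to fill argument position 0 as in the example, and modifier adjunction is left out (it would differ only in keeping the host's predicate variable).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical chart item, reduced to the two fields the rules use."""
    predicate: str  # predicate of the elementary tree the item belongs to
    pred_var: str   # predicate variable carried by the item


FOOT_ARG = 0  # assumption: the foot node fills argument position 0


def anchor(predicate):
    """Anchor rule: the predicate variable starts as the tree's own predicate."""
    return Item(predicate, predicate)


def substitute(arg_item, host, position, deps):
    """Substitution: the host takes the argument's predicate variable."""
    deps.append((host.predicate, position, arg_item.pred_var))
    return host


def predicative_adjoin(aux, host, deps):
    """Predicative adjunction: the auxiliary tree takes the host's predicate
    variable as its foot argument, and the composed item then carries the
    auxiliary tree's predicate variable."""
    deps.append((aux.predicate, FOOT_ARG, host.pred_var))
    return Item(host.predicate, aux.pred_var)


# Stacked derivation from the text: is-supposed-to adjoins into be-able-to,
# and the composed tree adjoins into fly.
deps = []
able = predicative_adjoin(anchor("is-supposed-to"), anchor("be-able-to"), deps)
fly = predicative_adjoin(able, anchor("fly"), deps)
print(deps)
```

Running the sketch prints the two dependencies derived in the text, ('is-supposed-to', 0, 'be-able-to') and ('be-able-to', 0, 'fly'). Because the composed item keeps the auxiliary tree's predicate variable, the same two functions also cover derivations in which both auxiliary trees adjoin into one initial tree. 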
The resulting dependencies are represented graphically in the dependency structure below:</Paragraph> <Paragraph position="11"> This example is relatively straightforward, simply reversing the direction of adjunction dependencies as described in (Candito and Kahane, 1998a), but this algorithm can transduce the correct isomorphic dependency structure for the Portuguese derivation as well, similar to the distributed derivation tree in Candito and Kahane's example 5b, &quot;Paul claims Mary seems to adore hot dogs,&quot; (Rambow et al., 1995), where there is no edge corresponding to the dependency between the raising and bridge verbs:</Paragraph> <Paragraph position="13"> We begin by adjoining β1:é-capaz-de at node VP of α:voar, which produces the dependency ⟨é-capaz-de, 0, voar⟩, just as before. Then we adjoin β2:é-pressuposto-que at node S of α:voar.</Paragraph> <Paragraph position="14"> This time, however, we must observe the predicate variable of the chart item for α:voar, which was updated in the previous adjunction and now references é-capaz-de instead of voar. Because the transduction rule for adjunction uses the predicate variable of the parent instead of just the predicate, the dependency produced by the adjunction of β2 is ⟨é-pressuposto-que, 0, é-capaz-de⟩, yielding the graph: As Candito and Kahane point out, this derivation tree does not match the dependency structure of the sentence as described in Meaning Text Theory (Mel'cuk, 1988), because there is no edge in the derivation corresponding to the dependency between surprise and have-to (the necessity of Paul's staying is what surprises Mary, not his staying in itself). Using the above algorithm, however, we can still produce the desired dependency structure:</Paragraph> <Paragraph position="16"> by adjoining β:have-to at node VP of α2:stay to produce a composed item with have-to as its predicate variable, as well as the dependency ⟨have-to, 0, stay⟩. 
When α2:stay substitutes at node S0 of α1:surprise, the resulting dependency also uses the predicate variable of the argument, yielding ⟨surprise, 0, have-to⟩.</Paragraph> <Paragraph position="18"> The derivation examples above only address the preservation of dependencies through adjunction. Let us now attempt to preserve both substitution and adjunction dependencies in transducing a sentence based on Candito and Kahane's example 5c, &quot;That Paul has to stay surprised Mary,&quot; in order to demonstrate how they interact. 5 We begin with the derivation tree:</Paragraph> <Paragraph position="20"> 5 We have replaced want to in the original example with have to in order to highlight the dependency structure and set aside any translation issues related to PRO control.</Paragraph> </Section> <Section position="6" start_page="91" end_page="94" type="metho"> <SectionTitle> 4 Obtaining Target Derivations </SectionTitle> <Paragraph position="0"> Once a source derivation is selected from the parse forest, the predicate-argument dependencies can be read off from the items in the forest that constitute the selected derivation. The resulting dependency graph can then be mapped to a forest of target derivations, where each predicate node in the source dependency graph is linked to a set of possible elementary trees in the target grammar, each of which is instantiated with substitution or adjunction edges leading to other linked sets in the forest. The elementary trees in the target forest are determined by the predicate pairs in the transfer lexicon, and by the elementary trees that can realize the translated targets. 
The substitution and adjunction edges in the target forest are determined by the argument links in the transfer lexicon, and by the substitution and adjunction configurations that can realize the translated targets' dependencies.</Paragraph> <Paragraph position="1"> Mapping dependencies into substitutions is relatively straightforward, but we have seen in Section 2 that different adjunction configurations (such as the raising and bridge verb adjunctions in sentences (1) and (2)) can correspond to the same dependency graph, so we should expect that some dependencies in our target graph may correspond to more than one adjunction configuration in the target derivation tree. Since a dependency may be realized by adjunctions at up to n different sites, an unconstrained algorithm would require exponential time to find a target derivation in the worst case. In order to reduce this complexity, we present a dynamic programming algorithm for constructing a target derivation forest in O(n^4) time, which relies on a restriction that the target derivations must preserve the relative scope ordering of the predicates in the source dependency graph.</Paragraph> <Paragraph position="2"> This restriction carries the linguistic implication that the scope ordering of adjuncts is part of the meaning of a sentence and should not be re-arranged in translation. Since we exploit a notion of locality similar to that of Isomorphic Synchronous TAG, we should not expect the generative power of our definition to exceed the generative power of TAG, as well.</Paragraph> <Paragraph position="3"> First, we define an ordering of predicates on the source dependency graph corresponding to a depth-first traversal of the graph, originating at the predicate variable of the root of the source derivation, and visiting arguments and modifiers in order from lowest to highest scope. 
In other words, arguments and modifiers will be ordered from the bottom up on the elementary tree structure of the parent, such that the foot node argument of an elementary tree has the lowest scope among the arguments, and the first adjunct on the main (trunk) anchor has the lowest scope among the modifiers.</Paragraph> <Paragraph position="4"> Arguments, which can safely be permuted in translation because their number is finitely bounded, are traversed entirely before the parent; and modifiers, which should not be permuted because they may be arbitrarily numerous, are traversed entirely after the parent.</Paragraph> <Paragraph position="5"> This enumeration will roughly correspond to the scoping order for the adjuncts in the source derivation, while preventing substituted trees from interrupting possible scoping configurations. We can now identify all the descendants of any elementary tree in a derivation because they will form a consecutive series in the enumeration described above. It therefore provides a convenient way to generate a target derivation forest that preserves the scoping information in the source, by 'parsing' the scope-ordered string of elementary trees, using indices on this enumeration instead of on a string yield.</Paragraph> <Paragraph position="6"> It is important to note that in defining this algorithm, we assume that all trees associated with a particular predicate will use the same argument structure as that predicate. 6 We also assume that the set of trees associated with a particular predicate may be filtered by transferring information such as mood and voice from source to target predicates.</Paragraph> <Paragraph position="7"> Apart from the different use of indices, the algorithm we describe is exactly the reverse of the transducer described in Section 3, taking a dependency graph D and producing a TAG derivation forest containing exactly the set of derivation trees for which those dependencies hold. 
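The enumeration described above can be illustrated with a short Python sketch. The representation is an assumption made for illustration only: each predicate maps to a pair of lists, its arguments and its modifiers, each already ordered from lowest to highest scope, and the function names scope_order and scope_ranges are hypothetical. The ranges computed here are the initial single-predicate ranges and ignore any adjustment needed for substituted arguments.

```python
def scope_order(graph, root):
    """Depth-first enumeration: arguments first, then the predicate itself,
    then its modifiers, each group from lowest to highest scope."""
    arguments, modifiers = graph.get(root, ([], []))
    order = []
    for arg in arguments:
        order.extend(scope_order(graph, arg))
    order.append(root)
    for mod in modifiers:
        order.extend(scope_order(graph, mod))
    return order


def scope_ranges(order):
    """Give each predicate the range between its index and the next one,
    closing the last range with the terminal symbol '$'."""
    ranges = {}
    for i, predicate in enumerate(order, start=1):
        end = i + 1 if i != len(order) else "$"
        ranges[predicate] = (i, end)
    return ranges


# Dependency graph of the running example: is-supposed-to takes be-able-to
# as its argument, which in turn takes fly.
graph = {
    "is-supposed-to": (["be-able-to"], []),
    "be-able-to": (["fly"], []),
    "fly": ([], []),
}
order = scope_order(graph, "is-supposed-to")
print(order)
print(scope_ranges(order))
```

Because every subtree is emitted as one contiguous block, the descendants of any predicate occupy a consecutive run of indices, which is the property that lets index ranges stand in for string positions. 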
Here, as in a parsing algorithm, we define forest items as tuples of ⟨γ_φ, η, ⊥, i, j, χ⟩ where α, β, and γ are elementary trees with node η, φ and ψ are predicates, χ and ω are predicate variables, and ⊤ and ⊥ are delimiters for opening and closing adjunction, but now let i, j, and k refer to the indices on the scoping enumeration described above, instead of on an input string.</Paragraph> <Paragraph position="8"> In order to reconcile scoping ranges for substitution, we must also define a function first(φ) to return the leftmost (lowest) edge of φ's range in the scope enumeration, and last(φ) to return the rightmost (highest) edge of φ's range in the scope enumeration.</Paragraph> <Paragraph position="9"> * For each tree γ mapped from predicate φ at scope i, introduce ⟨γ_φ, first(φ), i + 1, φ⟩.</Paragraph> <Paragraph position="11"> 6 Although this does not hold for certain relative clause elementary trees with wh-extractions as substitution sites (since the wh-site is an argument of the main verb of the clause instead of the foot node), Candito and Kahane (Candito and Kahane, 1998b) suggest an alternative analysis which can be extended to TAG by adjoining the relative clause into its wh-word as a predicative adjunct, and adjoining the wh-word into the parent noun phrase as a modifier, so the noun phrase is treated as an argument of the wh-word rather than of the relative clause.</Paragraph> <Paragraph position="12"> tion as in the transducer algorithm, propagating index ranges and predicate variables up along the path from the main anchor to the root.</Paragraph> <Paragraph position="13"> Since none of the productions combine more than three indices and one predicate variable, and since the indices and predicate variable may have no more than n distinct values, the algorithm runs in O(n^4) time. 
Note that one of the indices may be redundant with the predicate variable, so a more efficient implementation might be possible in O(n^3).</Paragraph> <Paragraph position="14"> We can demonstrate this algorithm by translating the English dependency graph from Section 1 into a derivation tree for Portuguese.</Paragraph> <Paragraph position="15"> First, we enumerate the predicates with their relative scoping positions:</Paragraph> <Paragraph position="17"> Then we construct a derivation forest based on the translated elementary trees α:voar, β1:é-capaz-de, and β2:é-pressuposto-que. Beginning at the bottom, we assign to these constituents the relative scoping ranges of 1-2, 2-3, and 3-$, respectively, where $ is a terminal symbol.</Paragraph> <Paragraph position="18"> There is also a dependency from is-supposed-to to be-able-to allowing us to adjoin β2:é-pressuposto-que to β1:é-capaz-de to make it cover the range from 2 to $, but there would be no S node to host its adjunction, so this possibility cannot be added to the forest. We can, however, adjoin β2:é-pressuposto-que to the instance of α:voar extending to β1:é-capaz-de that covers the range from 1 to 3, resulting in a complete analysis of the entire scope from 1 to $ (from α:voar to β2:pressuposto) rooted on voar: ⟨α_voar, 1, 2, ..⟩ ⟨β_capaz, 2, 3, ..⟩ ⟨β_press, 3, $, ..⟩ 
⟨α_voar, 1, 3, capaz⟩ ⟨α_voar, 1, $, press⟩ which matches the distributed derivation tree where both auxiliary trees adjoin to voar.</Paragraph> <Paragraph position="19"> [1-$] α:voar [2-3] β1:é-capaz-de(VP) [3-$] β2:é-pressup.-que(S) Let us compare this to a translation using the same dependency structure, but different words:</Paragraph> <Paragraph position="21"> Once again we select trees in the target language, and enumerate them with scoping ranges in a pre-order traversal, but this time the construction at scope position 3 must be translated as a raising verb (vai) instead of as a bridge construction (é-pressuposto-que): ⟨α_voar, 1, 2, ..⟩ ⟨β_capaz, 2, 3, ..⟩ ⟨β_vai, 3, $, ..⟩ ⟨α_voar, 1, 2, ..⟩ ⟨β_capaz, 2, 3, ..⟩ ⟨β_press, 3, $, ..⟩ Since there is a dependency from be-able-to to fly, we can adjoin β1:é-capaz-de to α:voar such that it covers the range of scopes from 1 to 3 (from voar to é-capaz-de), so we add this possibility to the forest.</Paragraph> <Paragraph position="22"> Although we can still adjoin β1:ser-capaz-de at the VP node of α:voar, we will have nowhere to adjoin β2:vai, since the VP node of α:voar is now occupied, and only one predicative tree may adjoin at any node. 7 ⟨α_voar, 1, 2, ..⟩ ⟨β_capaz, 2, 3, ..⟩ ⟨β_vai, 3, $, ..⟩ ⟨α_voar, 1, 3, capaz⟩ ⟨α_voar, 1, 2, ..⟩ ⟨β_capaz, 2, 3, ..⟩ ⟨β_press, 3, $, ..⟩ ⟨α_voar, 1, 3, capaz⟩ Fortunately, we can also realize the dependency between vai and ser-capaz-de by adjoining β2:vai at the VP.</Paragraph> <Paragraph position="23"> ⟨α_voar, 1, 2, ..⟩ ⟨β_capaz, 2, 3, ..⟩ ⟨β_vai, 3, $, ..⟩ ⟨β_capaz, 2, $, vai⟩ The new instance spanning from 2 to $ (from β1:capaz to β2:vai) can then be adjoined at the VP node of voar, to complete the derivation.</Paragraph> <Paragraph position="25"> This corresponds to the stacked derivation, with β2:vai adjoined to β1:ser-capaz-de and</Paragraph> </Section></Paper>