<?xml version="1.0" standalone="yes"?> <Paper uid="J94-1004"> <Title>An Alternative Conception of Tree-Adjoining Derivation</Title> <Section position="3" start_page="0" end_page="93" type="metho"> <SectionTitle> 2. The Standard Definition of Derivation </SectionTitle> <Paragraph position="0"> To exemplify the distinction between standard and extended derivations, we exhibit the TAG of Figure 1.1 This grammar derives some simple noun phrases such as &quot;roasted red pepper&quot; and &quot;baked red potato.&quot; The former, for instance, is associated with the derived tree in Figure 2(a). The tree can be viewed as being derived in two ways: 2 Dependent: The auxiliary tree,,flro is adjoined at the root node (address C) 3 of fire. The resultant tree is adjoined at the N node (address 1) of initial tree ape.</Paragraph> <Paragraph position="1"> This derivation is depicted as the derivation tree in Figure 3(a).</Paragraph> <Paragraph position="2"> Independent: The auxiliary trees flro and fire are adjoined at the N node (address 1) of the initial tree ape. This derivation is depicted as the derivation tree in Figure 3(b).</Paragraph> <Paragraph position="3"> 1 Here and elsewhere, we conventionally use the Greek letter c~ and its subscripted and primed variants for initial trees, fl and its variants for auxiliary trees, and ~, and its variants for elementary trees in general. The foot node of an auxiliary tree is marked with an asterisk ('*'). 2 We ignore here the possibility of another dependent derivation wherein adjunction occurs at the foot node of an auxiliary tree. Because this introduces yet another systematic ambiguity, it is typically disallowed by stipulation in the literature on linguistic analyses using TAGs. 3 The address of a node in a tree is taken to be its Gorn number, that sequence of integers specifying which branches to traverse in order starting from the root of the tree to reach the node. The address of the root of the tree is therefore the empty sequence, notated C/. See the appendix for a more complete discussion of notation.</Paragraph> <Paragraph position="4"> Derivation trees for the derived tree of Figure 2(a) according to the grammar of Figure 1.</Paragraph> <Paragraph position="5"> Computational Linguistics Volume 20, Number 1 In the independent derivation, two trees are separately adjoined at one and the same node in the initial tree. In the dependent derivation, on the other hand, one auxiliary tree is adjoined to the other, the latter only being adjoined to the initial tree. We will use this informal terminology uniformly in the sequel to distinguish the two general topologies of derivation trees.</Paragraph> <Paragraph position="6"> The standard definition of derivation, as codified by Vijay-Shanker, restricts derivations so that two adjunctions cannot occur at the same node in the same elementary tree. The dependent notion of derivation (Figure 3(a)) is therefore the only sanctioned derivation for the desired tree in Figure 2(a); the independent derivation (Figure 3(b)) is disallowed. Vijay-Shanker's definition is appropriate because for any independent derivation, there is a dependent derivation of the same derived tree. 
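To make the two topologies concrete, the following Python sketch renders both derivations of &quot;roasted red pepper.&quot; The tree names echo those of Figure 1, but the treatment of adjunction as string wrapping is a deliberate simplification for illustration, not the formal operation.

# A toy rendering of the Figure 1 derivations.  Adjunction is reduced to wrapping
# the material of an auxiliary tree around the yield below the node it adjoins at.
beta_re = lambda below: "red " + below        # beta_re: "red N*"
beta_ro = lambda below: "roasted " + below    # beta_ro: "roasted N*"
alpha_pe = "pepper"                           # initial tree; its N node sits at Gorn address 1

# Dependent derivation (Figure 3a): beta_ro adjoins at the root (the empty address)
# of beta_re, and the resulting derived auxiliary tree adjoins at address 1 of alpha_pe.
derived_aux = lambda below: beta_ro(beta_re(below))
print(derived_aux(alpha_pe))                  # roasted red pepper

# Independent derivation (Figure 3b): beta_re and beta_ro both adjoin at address 1 of
# alpha_pe; the derived tree depends on the order in which the two adjunctions apply.
print(beta_ro(beta_re(alpha_pe)))             # roasted red pepper
print(beta_re(beta_ro(alpha_pe)))             # red roasted pepper (the tree of Figure 2b)

Both derivations can yield the same derived string; the independent derivation also admits the opposite order of the two adjunctions, giving the alternative derived tree of Figure 2(b), but in either case a dependent derivation produces the same derived tree.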
This can be easily seen in that any adjunction of f12 at a node at which an adjunction of fll occurs could instead be replaced by an adjunction of f12 at the root of ill.</Paragraph> <Paragraph position="7"> The advantage of this standard definition of derivation is that a derivation tree in this normal form unambiguously specifies a derived tree. The independent derivation tree, on the other hand, is ambiguous as to the derived tree it specifies in that a notion of precedence of the adjunctions at the same node is unspecified, but crucial to the derived tree specified. This follows from the fact that the independent derivation tree is symmetric with respect to the roles of the two auxiliary trees (by inspection), whereas the derived tree is not. By symmetry, therefore, it must be the case that the same independent derivation tree specifies the alternative derived tree in Figure 2(b). 3. Motivation for an Extended Definition of Derivation In the absence of some further interpretation of the derivation tree nothing hinges on the choice of derivation definition, so that the standard definition disallowing independent derivations is as reasonable as any other. However, tree-adjoining grammars are almost universally extended with augmentations that make the issue apposite.</Paragraph> <Paragraph position="8"> We discuss three such variations here, all of which argue for the use of independent derivations under certain circumstances. 4</Paragraph> <Section position="1" start_page="93" end_page="93" type="sub_section"> <SectionTitle> 3.1 Adding Adjoining Constraints </SectionTitle> <Paragraph position="0"> Already in very early work on tree-adjoining grammars (Joshi, Levy, and Takahashi 1975) constraints were allowed to be specified as to whether a particular auxiliary tree may or may not be adjoined at a particular node in a particular tree. The idea is formulated in its modern variant as selective-adjoining constraints (Vijay-Shanker and Joshi 1985). As an application of this capability, we consider the traditional grammatical view that directional adjuncts can be used only with certain verbs. 5 This would account</Paragraph> </Section> </Section> <Section position="4" start_page="93" end_page="94" type="metho"> <SectionTitle> 4 The formulation of derivation for tree-adjoining grammars is also of significance for other grammatical </SectionTitle> <Paragraph position="0"> formalisms based on weaker forms of adjunction such as lexicalized context-free grammar (Schabes and Waters 1993a) and its stochastic extension (Schabes and Waters 1993b), though we do not discuss these arguments here.</Paragraph> <Paragraph position="1"> 5 For instance, Quirk, Greenbaum, Leech, and Svartvik (1985, page 517) remark that &quot;direction adjuncts of both goal and source can normally be used only with verbs of motion.&quot; Although the restriction is undoubtedly a semantic one, we will examine the modeling of it in a TAG deriving syntactic trees for two reasons. First, the problematic nature of independent derivation is more easily seen in this way. Second, much of the intuition behind TAG analyses is based on a tight relationship between syntactic and semantic structure. Thus, whatever scheme for semantics is to be used with TAGs will require appropriate derivations to model these data. For example, an analysis of this phenomenon by adjoining constraints on the semantic half of a synchronous TAG would be subject to the identical argument. 
See Section 3.3.</Paragraph> <Paragraph position="2"> for the felicity distinctions between the following sentences:
1. a. Brockway walked his Labrador towards the yacht club.
   b. # Brockway resembled his Labrador towards the yacht club.</Paragraph> <Paragraph position="6"> This could be modeled by disallowing through selective adjoining constraints the adjunction of the elementary tree corresponding to a towards adverbial at the VP node of the elementary tree corresponding to the verb resembles.6 However, the restriction applies even with intervening (and otherwise acceptable) adverbials.</Paragraph> <Paragraph position="7">
2. a. Brockway walked his Labrador yesterday.
   b. Brockway walked his Labrador yesterday towards the yacht club.
3. a. Brockway resembled his Labrador yesterday.
   b. # Brockway resembled his Labrador yesterday towards the yacht club.</Paragraph> <Paragraph position="13"> Under the standard definition of derivation, there is no direct adjunction in the latter sentence of the towards tree into the resembles tree. Rather, it is dependently adjoined at the root of the elementary tree that heads the adverbial yesterday, the latter directly adjoining into the main verb tree. To restrict both of the ill-formed sentences, then, a restriction must be placed not only on adjoining the goal adverbial in a resembles context, but also in the yesterday adverbial context. But this constraint is too strong, as it disallows sentence (2b) above as well.</Paragraph> <Paragraph position="14"> The problem is that the standard derivation does not correctly reflect the syntactic relation between the adverbial modifier and the phrase it modifies when there are multiple modifications in a single clause. In such a case, each of the adverbials independently modifies the verb, and this should be reflected in their independent adjunction at the same point. But this is specifically disallowed in a standard derivation. Another example along the same lines follows from the requirement that tense as manifested in a verb group be consistent with temporal adjuncts. For instance, consider the following examples:
4. a. Brockway walked his Labrador yesterday.
   b. # Brockway will walk his Labrador yesterday.
5. a. # Brockway walked his Labrador tomorrow.
   b. Brockway will walk his Labrador tomorrow.</Paragraph> <Paragraph position="18"> Again, the relationship is independent of other intervening adjuncts.
6. a. Brockway walked his Labrador towards the yacht club yesterday.
   b. # Brockway will walk his Labrador towards the yacht club yesterday.
7. a. # Brockway walked his Labrador towards the yacht club tomorrow.
   b. Brockway will walk his Labrador towards the yacht club tomorrow.</Paragraph> <Paragraph position="21"> It is important to note that these arguments apply specifically to auxiliary trees that correspond to a modification relationship.
Auxiliary trees are used in TAG typically</Paragraph> </Section> <Section position="5" start_page="94" end_page="98" type="metho"> <SectionTitle> 6 Whether the adjunction occurs at the VP node or the S node is immaterial to the argument. </SectionTitle> <Paragraph position="0"> Computational Linguistics Volume 20, Number 1 for predication relations as well, 7 as in the case of raising and sentential complement constructions, s Consider the following sentences. (The brackets mark the leaves of the pertinent trees to be combined by adjunction in the assumed analysis.) . a.</Paragraph> <Paragraph position="1"> b.</Paragraph> <Paragraph position="2"> 9. a.</Paragraph> <Paragraph position="3"> b.</Paragraph> <Paragraph position="4"> 10. a.</Paragraph> <Paragraph position="5"> b.</Paragraph> <Paragraph position="6"> 11. a.</Paragraph> <Paragraph position="7"> b.</Paragraph> <Paragraph position="8"> Brockway assumed that Harrison wanted to walk his Labrador. \[Brockway assumed that\] \[Harrison wanted\] \[to walk his Labrador\] Brockway wanted to try to walk his Labrador. \[Brockway wanted\] \[to try\] \[to walk his Labrador\] Harrison wanted Brockway tried to walk his Labrador. \[Harrison wanted\] \[Brockway tried\] \[to walk his Labrador\] Harrison wanted to assume that Brockway walked his Labrador. \[Harrison wanted\] \[to assume that\] \[Brockway walked his Labrador\] Assume (following, for instance, the analysis of Kroch and Joshi \[1985\]) that the trees associated with the various forms of the verbs try, want, and assume all take sentential complements, certain of which are tensed with overt subjects and others untensed with empty subjects. The auxiliary trees for these verbs specify by adjoining constraints which type of sentential complement they take: assume requires tensed complements, want and try untensed. Under this analysis the auxiliary trees must not be allowed to independently adjoin at the same node. For instance, if trees corresponding to &quot;Harrison wanted&quot; and &quot;Brockway tried&quot; (which both require untensed complements) were both adjoined at the root of the tree for &quot;to walk his Labrador,&quot; the selective adjoining constraints would be satisfied, yet the generated sentence (10a) is ungrammatical. Conversely, under independent adjunction, sentence (11a) would be deemed ungrammatical, although it is in fact grammatical. Thus, the case of predicative trees is entirely unlike that of modifier trees. Here, the standard notion of derivation is exactly what is needed as far as interpretation of adjoining constraints is concerned. An alternative would be to modify the way in which adjoining constraints are updated upon adjunction. If after adjoining a modifier tree at a node, the adjoining constraints of the original node, rather than those of the root and foot of the modifier tree, are manifest in the corresponding nodes in the derived tree, the adjoining constraints would propagate appropriately to handle the examples above. This alternative leads, however, to a formalism for which derivation trees are no longer context-free, with concomitant difficulties in designing parsing algorithms. Instead, the extended definition of derivation effectively allows use of a Kleene-* in the &quot;grammar&quot; of derivation trees.</Paragraph> <Paragraph position="9"> Adjoining constraints can also be implemented using feature structure equations (Vijay-Shanker and Joshi 1988). 
It is possible that judicious use of such techniques might prevent the particular problems noted here. Such an encoding of a solution requires consideration of constraints that pass among many trees just to limit the co-occurrence of a pair of trees. However, it more closely follows the spirit of TAGs to state such intuitively local limitations locally.</Paragraph> <Paragraph position="10"> 7 We use the term 'predication' in its logical sense, that is, for auxiliary trees that serve as logical predicates over the trees into which they adjoin, in contrast to the term's linguistic sub-sense in which the argument of the predicate is a linguistic subject.</Paragraph> <Paragraph position="11"> 8 The distinction between predicative and modifier trees has been proposed previously for purely linguistic reasons by Kroch (1989), who refers to them as complement and athematic trees, respectively. The arguments presented here can be seen as providing further evidence for differentiating the two kinds of auxiliary trees. A precursor to this idea can perhaps be seen in the distinction between repeatable and nonrepeatable adjunction in the formalism of string adjunct grammars, a precursor of TAGs (Joshi, Kosaraju, and Yamada 1972b, pages 253-254).</Paragraph> <Paragraph position="12"> Yves Schabes and Stuart M. Shieber Tree-Adjoining Derivation In summary, the interpretation of adjoining constraints in TAG is sensitive to the particular notion of derivation that is used. Therefore, it can be used as a litmus test for an appropriate definition of derivation. As such, it argues for a nonstandard independent notion of derivation for modifier auxiliary trees and a standard dependent notion for predicative trees.</Paragraph> <Section position="1" start_page="96" end_page="96" type="sub_section"> <SectionTitle> 3.2 Adding Statistical Parameters </SectionTitle> <Paragraph position="0"> In a similar vein, the statistical parameters of a stochastic lexicalized TAG (SLTAG) (Resnik 1992; Schabes 1992) specify the probability of adjunction of a given auxiliary tree at a specific node in another tree. This specification may again be interpreted with regard to differing derivations, obviously with differing impact on the resulting probabilities assigned to derivation trees. (In the extreme case, a constraint prohibiting adjoining corresponds to a zero probability in an SLTAG. The relation to the argument in the previous section follows thereby.) Consider a case in which linguistic modification of noun phrases by adjectives is modeled by adjunction of a modifying tree.</Paragraph> <Paragraph position="1"> Under the standard definition of derivation, multiple modifications of a single NP would lead to dependent adjunctions in which a first modifier adjoins at the root of a second. As an example, we consider again .the grammar given in Figure 1, which admits of derivations for the strings &quot;baked red potato&quot; and &quot;baked red pepper.&quot; Specifying adjunction probabilities on standard derivations, the distinction between the overall probabilities for these two strings depends solely on the adjunction probabilities of fire (the tree for red) into ~po and c~pC/ (those for potato and pepper, respectively), as the tree fib for the word baked is adjoined in both cases at the root of fire in both standard derivations. In the extended derivations, on the other hand, both modifying trees are adjoined independently into the noun trees. 
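As a rough numerical illustration, the sketch below scores &quot;baked red potato&quot; and &quot;baked red pepper&quot; under both regimes. The probability values and the bare product model are invented for exposition; they are not the SLTAG definition itself.

# Hypothetical adjunction probabilities; the numbers are made up for illustration.
p_standard = {                                  # standard (dependent) derivations
    ("beta_red",   "alpha_potato"): 0.20,
    ("beta_red",   "alpha_pepper"): 0.20,
    ("beta_baked", "beta_red"):     0.10,       # baked adjoins at the root of the red tree
}
p_extended = {                                  # extended (independent) derivations
    ("beta_red",   "alpha_potato"): 0.20,
    ("beta_red",   "alpha_pepper"): 0.20,
    ("beta_baked", "alpha_potato"): 0.15,       # baked is scored against the noun tree itself
    ("beta_baked", "alpha_pepper"): 0.01,
}

def score(probs, adjunctions):
    result = 1.0
    for aux, site in adjunctions:
        result *= probs[(aux, site)]
    return result

for noun in ("alpha_potato", "alpha_pepper"):
    standard = score(p_standard, [("beta_red", noun), ("beta_baked", "beta_red")])
    extended = score(p_extended, [("beta_red", noun), ("beta_baked", noun)])
    print(noun, standard, extended)
# Under the standard derivation the two strings receive identical scores here; under
# the extended derivation the baked/noun affinity differentiates them.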
Thus, the overall probabilities are determined as well by the probabilities of adjunction of the trees for baked into the nominal trees. It seems intuitively plausible that the most important relationships to characterize statistically are those between modifier and modified, rather than between two modifiers. 9 In the case at hand, the fact that one typically refers to the process of cooking potatoes as &quot;baking,&quot; whereas the appropriate term for the corresponding cooking process applied to peppers is &quot;roasting,&quot; would be more determining of the expected overall probabilities.</Paragraph> <Paragraph position="2"> Note again that the distinction between modifier and predicative trees is important.</Paragraph> <Paragraph position="3"> The standard definition of derivation is entirely appropriate for adjunction probabilities for predicative trees, but not for modifier trees.</Paragraph> </Section> <Section position="2" start_page="96" end_page="97" type="sub_section"> <SectionTitle> 3.3 Adding Semantics </SectionTitle> <Paragraph position="0"> Finally, the formation of synchronous TAGs has been proposed to allow use of TAGs in semantic interpretation, natural language generation, and machine translation. In previous work (Shieber and Schabes 1990), the definition of synchronous TAG derivation is given in a manner that requires multiple adjunctions at a single node. The need for such derivations follows from the fact that synchronous derivations are intended to model semantic relationships. In cases of multiple adjunction of modifier trees at 9 Intuition is an appropriate guide in the design of the SLTAG framework, as the idea is to set up a linguistically plausible infrastructure on top of which a lexically based statistical model can be built. In addition, suggestive (though certainly not conclusive) evidence along these lines can be gleaned from corpora analyses. For instance, in a simple experiment in which medium frequency triples of exactly the discussed form &quot;(adjective) (adjective) (noun)&quot; were examined, the mean mutual information between the first adjective and the noun was found to be larger than that between the two adjectives.</Paragraph> <Paragraph position="1"> The statistical assumptions behind this particular experiment do not allow very robust conclusions to be drawn, and more work is needed along these lines.</Paragraph> <Paragraph position="2"> Computational Linguistics Volume 20, Number 1 a single node, the appropriate semantic relationships comprise separate modifications rather than cascaded ones, and this is reflected in the definition of synchronous TAG derivation. 1deg Because of this, a parser for synchronous TAGs must recover, at least implicitly, the extended derivations of TAG-derived trees. Shieber (in press) provides a more complete discussion of the relationship between synchronous TAGs and the extended definition of derivation with special emphasis on the ramifications for formal expressivity.</Paragraph> <Paragraph position="3"> Note that the independence of the adjunction of modifiers in the syntax does not imply that semantically there is no precedence or scoping relation between them. As exemplified in Figure 5, the derived tree generated by multiple independent adjunctions at a single node still manifests nesting relationships among the adjoined trees. 
This fact may be used to advantage in the semantic half of a synchronous tree-adjoining grammar to specify the semantic distinction between, for example, the following two sentences:
12. a. Brockway ran over his polo mallet twice intentionally.
    b. Brockway ran over his polo mallet intentionally twice.</Paragraph> <Paragraph position="7"> We hope to address this issue in greater detail in future work on synchronous tree-adjoining grammars.</Paragraph> </Section> <Section position="3" start_page="97" end_page="98" type="sub_section"> <SectionTitle> 3.4 Desired Properties of Extended Derivations </SectionTitle> <Paragraph position="0"> We have presented several arguments that the standard notion of derivation does not allow for an appropriate specification of dependencies to be captured. An extended notion of derivation is needed that
1. differentiates predicative and modifier auxiliary trees;
2. requires dependent derivations for predicative trees;
3. allows independent derivations for modifier trees; and
4. unambiguously and nonredundantly specifies a derived tree.
Furthermore, following from considerations of the role of modifier trees in a grammar as essentially optional and freely applicable elements, we would like the following criterion to hold of extended derivations:
5. If a node can be modified at all, it can be modified any number of times, including zero times.</Paragraph> <Paragraph position="1"> Recall that a derivation tree (as traditionally conceived) is a tree with unordered arcs where each node is labeled by an elementary tree of a TAG and each arc is labeled by a tree address specifying a node in the parent tree. In a standard derivation tree no two sibling arcs can be labeled with the same address. In an extended derivation tree, however, the condition is relaxed: No two sibling arcs to predicative trees can be labeled with the same address. Thus, for any given address there can be at most one predicative tree and several modifier trees adjoined at that node. As we have seen, this relaxed definition violates the fourth desideratum above; for instance, the derivation tree in Figure 3(b) ambiguously specifies both derived trees in Figure 2. In the next section we provide a formal definition of extended derivations that satisfies all of the criteria above.</Paragraph> <Paragraph position="2"> 10 The importance of the distinction between predicative and modifier trees with respect to how derivations are defined was not appreciated in the earlier work; derivations were taken to be of the independent variety in all cases. In future work, we plan to remedy this flaw. 11 We are indebted to an anonymous reviewer of an earlier version of this paper for raising this issue crisply through examples similar to those given here.</Paragraph> </Section> </Section> <Section position="6" start_page="98" end_page="107" type="metho"> <SectionTitle> 4. Formal Definition of Extended Derivations </SectionTitle> <Paragraph position="0"> In this section we introduce a new framework for describing TAG derivation trees that allows for a natural expression of both standard and extended derivations, and makes available even more fine-grained restrictions on derivation trees. First, we define ordered derivation trees and show that they unambiguously but redundantly specify derivations.
We characterize the redundant trees as those related by a sibling swapping operation. Derivation trees proper are then taken to be the equivalence classes of ordered derivation trees in which the equivalence relation is generated by the sibling swapping. By limiting the underlying set of ordered derivation trees in various ways, Vijay-Shanker's definition of derivation tree, a precise form of the extended definition, and many other definitions of derivation can be characterized in this way.</Paragraph> <Section position="1" start_page="98" end_page="99" type="sub_section"> <SectionTitle> 4.1 Ordered Derivation Trees </SectionTitle> <Paragraph position="0"> Ordered derivation trees, like the traditional derivation trees, are trees with nodes labeled by elementary trees where each arc is labeled with an address in the tree for the parent node of the arc. However, the arcs are taken to be ordered with respect to each other.</Paragraph> <Paragraph position="1"> An ordered derivation tree is well-formed if for each of its arcs, linking parent node labeled γ to child node labeled γ′ and itself labeled with address t, the tree γ′ is an auxiliary tree that can be adjoined at the node t in the tree γ. (Alternatively, if substitution is allowed, γ′ may be an initial tree that can be substituted at the node t in γ. Later definitions ignore this possibility, but are easily generalized.) We define the function 𝒟 from ordered derivation trees to the derived trees they specify, according to the following recursive definition:</Paragraph> <Paragraph position="3"> 𝒟(D) = γ[𝒟(D1)/t1, ..., 𝒟(Dk)/tk] if D is a tree with root node labeled with the elementary tree γ and with k child subtrees D1, ..., Dk whose arcs are labeled with addresses t1, ..., tk.</Paragraph> <Paragraph position="4"> Here γ[A1/t1, ..., Ak/tk] specifies the simultaneous adjunction of trees A1 through Ak at t1 through tk, respectively, in γ. It is defined as the iterative adjunction of the Ai in order at their respective addresses, with appropriate updating of the tree addresses of any later adjunction to reflect the effect of earlier adjunctions that occur at addresses dominating the address of the later adjunction.</Paragraph> <Paragraph position="5"> 12 Historical precedent for independent derivation and the associated ordered derivation trees can be found in the derivation trees postulated for string adjunct grammars (Joshi, Kosaraju, and Yamada 1972a, 99-100). In this system, siblings in derivation trees are viewed as totally, not partially, ordered. The systematic ambiguity introduced thereby is eliminated by stipulating that the sibling order be consistent with an arbitrary ordering on adjunction sites.</Paragraph> </Section> <Section position="2" start_page="99" end_page="99" type="sub_section"> <SectionTitle> 4.2 Derivation Trees </SectionTitle> <Paragraph position="0"> It is easy to see that the derived tree specified by a given ordered derivation tree is unchanged if adjacent siblings whose arcs are labeled with different tree addresses are swapped. (This is not true of adjacent siblings whose arcs are labeled with the same address.) That is, if t ≠ t′ then γ[..., A/t, B/t′, ...] = γ[..., B/t′, A/t, ...]. A graphical &quot;proof&quot; of this intuitive fact is given in Figure 4. A formal proof, although tedious and unenlightening, is possible as well.
We provide it in an appendix, primarily because the definitional aspects of the TAG formulation may be of some interest.</Paragraph> <Paragraph position="1"> This fact about the swapping of adjacent siblings shows that ordered derivation trees possess an inherent redundancy. The order of adjacent sibling subtrees labeled with different tree addresses is immaterial. Consequently, we can define true derivation trees to be the equivalence classes of the base set of ordered derivation trees under the equivalence relation generated by the sibling subtree swapping operation above. This is a well-formed definition by virtue of the proposition argued informally above.</Paragraph> <Paragraph position="2"> This definition generalizes the traditional definition in not restricting the tree address labels in any way. It therefore satisfies criterion (3) of Section 3.4. Furthermore, by virtue of the explicit quotient with respect to sibling swapping, a derivation tree under this definition unambiguously and nonredundantly specifies a derived tree (criterion 4). It does not, however, differentiate predicative from modifier trees (criterion (1)), nor can it therefore mandate dependent derivations for predicative trees (criterion (2)).</Paragraph> <Paragraph position="3"> This general approach can, however, be specialized to correspond to several previous definitions of derivation tree. For instance, if we further restrict the base set of ordered derivation trees so that no two siblings are labeled with the same tree address, then the equivalence relation over these ordered derivation trees allows for full reordering of all siblings. Clearly, these equivalence classes are isomorphic to the unordered trees, and we have reconstructed Vijay-Shanker's standard definition of derivation tree.</Paragraph> <Paragraph position="4"> If we instead restrict ordered derivation trees so that no two siblings corresponding to predicative trees are labeled with the same tree address, then we have reconstructed a version of the extended definition argued for in this paper. Under this restriction, criteria (1) and (2) are satisfied, while maintaining (3) and (4).</Paragraph> <Paragraph position="5"> By careful selection of other constraints on the base set, other linguistic restrictions might be imposed on derivation trees, still using the same definition of derivation trees as equivalence classes over ordered derivation trees. In the next section, we show that the definition of the previous paragraph should be further restricted to disallow the reordering of predicative and modifier trees. We also describe other potential linguistic applications of the ability to finely control the notion of derivation through the use of ordered derivation trees.</Paragraph> </Section> <Section position="3" start_page="99" end_page="107" type="sub_section"> <SectionTitle> 4.3 Further Restrictions on Extended Derivations </SectionTitle> <Paragraph position="0"> The extended definition of derivation tree given in the previous section effectively specifies the output derived tree by adding a partial ordering on sibling arcs that correspond to modifier trees adjoined at the same address. All other arcs are effectively unordered (in the sense that all relative orderings of them exist in the equivalence class).</Paragraph> <Paragraph position="1"> Assume that in a given tree ~, at a particular address t, the k modifier trees #1,..., ~k are directly adjoined in that order. 
Associated with the subtrees rooted at the k elementary auxiliary trees in this derivation are k derived auxiliary trees (A1,...,Ak, respectively). The derived tree specified by this derivation tree, according to the definition of ~ given above, would have the derived tree A1 directly below A2 and so forth, with Ak at the top. Now suppose that in addition, a predicative tree 7r is also A graphical proof of the irrelevance of adjacent sibling swapping.</Paragraph> <Paragraph position="2"> These diagrams show the effect of performing two adjunctions (of auxiliary trees depicted, one as dark-shaded and one light-shaded), presumed to be specified by adjacent siblings in an ordered derivation tree. The adjunctions are to occur at two addresses (referred to in this caption as t and t', respectively). The two addresses must be such that either (a) they are distinct but neither dominates the other, (b) t dominates t' (or vice versa), or (c) they are identical. In case (a) the diagram shows that either order of adjunction yields the same derived tree. Adjunction at t and then t' corresponds to the upper arrows, adjunction at t' and then t the lower arrows. Similarly, in case (b), adjunction at t followed by adjunction at an appropriately updated t' yields the same result as adjunction first at t' and then at t. Clearly, adjunctions occurring before these two or after do not affect the interchangeability. Thus, if two adjacent siblings in a derivation tree specify adjunctions at distinct addresses t and t', the adjunctions can occur in either order. Diagram (c) demonstrates that this is not the case when t and t' are the same.</Paragraph> <Paragraph position="3"> Schematic extended derivation tree and associated derived tree.</Paragraph> <Paragraph position="4"> In a derived tree, the predicative tree adjoined at an address t is required to follow all modifier trees adjoined at the same address, as in (a). The derived tree therefore appears as depicted in (b) with the predicative tree outermost.</Paragraph> <Paragraph position="5"> adjoined at address t. It must be ordered with respect to the #i in the derivation tree, and its relative order determines where in the bottom-to-top order in the derived tree the tree A,~ associated with the subderivation rooted at 7r goes.</Paragraph> <Paragraph position="6"> The question that we raise here is whether all k + 1 possible placements of the tree ~r relative to the #i are linguistically reasonable. We might allow all k + 1 orderings (as in the definition of the previous section), or we might restrict them by requiring, say, that the predicative tree always be adjoined before, or perhaps after, any modifier trees at a given address. We emphasize that this is a linguistic question, in the sense that the definition of extended derivation is well formed whatever decision is made on this question.</Paragraph> <Paragraph position="7"> Henceforth, we will assume that predicative trees are always adjoined after any modifier trees at the same address, so that they appear above the modifier trees in the derived tree. We call this &quot;outermost predication&quot; because a predicative tree appears wrapped around the outside of the modifier trees adjoined at the same address. (See Figure 5.) 
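The consequences of this ordering choice for the derived tree can be made concrete with a small sketch; trees are reduced to bare labels and adjunction to wrapping, so this illustrates only the nesting, not the formal operation.

# Each adjunction at the address wraps the material already there, so trees adjoined
# later end up outermost in the derived tree.
def derived_nesting(adjunction_order):
    result = "gamma"                     # the original tree's material at the address
    for tree in adjunction_order:        # iterative adjunction, in derivation order
        result = tree + "(" + result + ")"
    return result

modifiers = ["mu1", "mu2", "mu3"]
print(derived_nesting(modifiers + ["pi"]))   # outermost predication: pi(mu3(mu2(mu1(gamma))))
print(derived_nesting(["pi"] + modifiers))   # innermost predication: mu3(mu2(mu1(pi(gamma))))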
If we were to mandate innermost predication, in which a predicative tree is always adjoined before the modifier trees at the same address, the predicative tree would appear within all of the modifier trees, innermost in the derived tree.</Paragraph> <Paragraph position="8"> Linguistically, the outermost method specifies that if both a predicative tree and a modifier tree are adjoined at a single node, then the predicative tree attaches higher than the modifier tree; in terms of the derived tree, it is as if the predicative tree were adjoined at the root of the modifier tree. This accords with the semantic intuition that in such a case (for English at least), the modifier is modifying the original tree, not the predicative one. (The alternate &quot;reading,&quot; in which the modifier modifies the predicative tree, is still obtainable under an outermost-predication standard by having the modifier auxiliary tree adjoin dependently at the root node of the predicative tree.) Yves Schabes and Stuart M. Shieber Tree-Adjoining Derivation In contrast, the innermost-predication method specifies that the modifier tree attaches higher, as if the modifier tree adjoined at the root of the predicative tree and was therefore modifying the predicative tree, contra semantic intuitions.</Paragraph> <Paragraph position="9"> For this reason, we specify that outermost predication is mandated. This is easily done by further limiting the base set of ordered derivation trees to those in which predicative trees are ordered after modifier tree siblings.</Paragraph> <Paragraph position="10"> (From a technical standpoint, by the way, the outermost-predication method has the advantage that it requires no changes to the parsing rules to be presented later, but only a single addition. The innermost-predication method induces some subtle interactions between the original parsing rules and the additional one, necessitating a much more complicated set of modifications to the original algorithm. In fact, the complexities in generating such an algorithm constituted the precipitating factor that led us to revise our original innermost-predication attempt at redefining tree-adjoining derivation. The linguistic argument, although commanding, became clear to us only later.) Another possibility, which we mention but do not pursue here, is to allow for language-particular precedence constraints to restrict the possible orderings of derivation-tree siblings, in a manner similar to the linear precedence constraints of ID/LP format (Gazdar, Klein, Pullum, and Sag 1985) but at the level of derivation trees.</Paragraph> <Paragraph position="11"> These might be interpreted as hard constraints or soft orderings depending on the application. This more fine-grained approach to the issue of ordering has several applications. Soft orderings might be used to account for ordering preferences among modifiers, such as the default ordering of English adjectives that accounts for the typical preference for &quot;a large red ball&quot; over &quot;? a red large ball&quot; and the typical ordering of temporal before spatial adverbial phrases in German.</Paragraph> <Paragraph position="12"> Similarly, hard constraints might allow for the handling of an apparent counter-example to the outermost-predication rule. 13 One natural analysis of the sentence 13. At what time did Brockway say Harrison arrived? 
would involve adjunction of a predicative tree for the phrase &quot;did Brockway say&quot; at the root of the tree for &quot;Harrison arrived.&quot; A Wh modifier tree &quot;at what time&quot; must be adjoined in as well. The example question is ambiguous, of course, as to whether it questions the time of the saying or of the arriving. In the former case, the modifier tree presumably adjoins at the root of the predicative tree for &quot;did Brockway say&quot; that it modifies. In the latter case, which is of primary interest here, it must adjoin at the root of the tree for &quot;Harrison arrived.&quot; Thus, both trees would be adjoined at the same address, and the outermost-predication rule would predict the derived sentence to be &quot;Did Brockway say at what time Harrison arrived.&quot; To get around this problem, we might specify hard ordering constraints for English that place all Wh modifier trees after all predicative trees, which in turn come after all non-Wh modifier trees. This would place the Wh modifier outermost as required.</Paragraph> <Paragraph position="13"> Although we find this extra flexibility to be an attractive aspect of this approach, we stay with the more stringent outermost-predication restriction in the material that follows.</Paragraph> <Paragraph position="14"> 13 Other solutions are possible that do not require extended derivations or linear precedence constraints. For instance, we might postulate an elementary tree for the verb arrived that includes a substitution node for a fronted adverbial Wh phrase.</Paragraph> <Paragraph position="15"> Computational Linguistics Volume 20, Number 1 5. Compilation of TAGs to Linear Indexed Grammars In this section we present a technique for compiling tree-adjoining grammars into linear indexed grammars such that the linear indexed grammar makes explicit the extended derivations of the TAG. This compilation plays two roles. First, it provides for a simple proof of the generative equivalence of TAGs under the standard and extended definitions of derivation, as described at the end of this section. Second, it can be used as the basis for a parsing algorithm that recovers the extended derivations for strings. The design of such an algorithm is the topic of Section 6.</Paragraph> <Paragraph position="16"> Linear indexed grammars (LIG) constitute a grammatical framework based, like context-free, context-sensitive, and unrestricted rewriting systems, on rewriting strings of nonterminal and terminal symbols. Unlike these systems, linear indexed grammars, like the indexed grammars from which they are restricted, allow stacks of marker symbols, called indices, to be associated with the nonterminal symbols being rewritten. The linear version of the formalism allows the full index information from the parent to be used to specify the index information for only one of the child constituents.</Paragraph> <Paragraph position="17"> Thus, a linear indexed production can be given schematically as:</Paragraph> <Paragraph position="19"> The Ni are nonterminals, the fli. strings of indices. The &quot;..&quot; notation stands for the remainder of the stack below the given string of indices. Note that only one element on the right-hand side, Ns, inherits the remainder of the stack from the parent. (This schematic rule is intended to be indicative, not definitive. We ignore issues such as the optionality of the inherited stack how terminal symbols fit in, and so forth. Vijay-Shanker and Weir \[1990\] present a complete discussion.) 
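To make the &quot;..&quot; convention concrete, here is a minimal self-contained encoding of such productions. The representation and the names in it are ours, chosen only for illustration; they are not the formalism's official definition.

# A linear indexed production: each right-hand-side occurrence lists its topmost
# indices explicitly, and at most one occurrence (marked inherits=True) is written
# with ".." and therefore receives the remainder of the parent's index stack.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class Occurrence:
    nonterminal: str
    indices: Tuple[str, ...]     # explicitly listed (topmost) indices
    inherits: bool = False       # True iff this occurrence carries ".."

@dataclass
class Production:
    lhs: Occurrence
    rhs: List[Occurrence]

def rewrite(stack, prod):
    """Given the full index stack of the parent, return the children's stacks."""
    n = len(prod.lhs.indices)
    assert n == 0 or tuple(stack[-n:]) == prod.lhs.indices
    rest = tuple(stack[:len(stack) - n])          # the portion matched by ".."
    return [(rest if occ.inherits else ()) + occ.indices for occ in prod.rhs]

# N0[..b0] -> N1[b1] N2[..b2] N3[b3]: only N2 inherits the rest of the stack.
p = Production(Occurrence("N0", ("b0",), inherits=True),
               [Occurrence("N1", ("b1",)),
                Occurrence("N2", ("b2",), inherits=True),
                Occurrence("N3", ("b3",))])
print(rewrite(("x", "y", "b0"), p))   # [('b1',), ('x', 'y', 'b2'), ('b3',)]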
Vijay-Shanker and Weir (1990) present a way of specifying any TAG as a linear indexed grammar. The LIG version makes explicit the standard notion of derivation being presumed. Also, the LIG version of a TAG grammar can be used for recognition and parsing. Because the LIG formalism is based on augmented rewriting, the parsing algorithms can be much simpler to understand and easier to modify, and no loss of generality is incurred. For these reasons, we use the technique in this work.</Paragraph> <Paragraph position="20"> The compilation process that manifests the standard definition of derivation can be most easily understood by viewing nodes in a TAG elementary tree as having both a top and bottom component, identically marked for nonterminal category, that dominate (but may not immediately dominate) each other. (See Figure 6.) The rewrite rules of the corresponding linear indexed grammar capture the immediate domination between a bottom node and its child top nodes directly, and capture the domination between top and bottom parts of the same node by optionally allowing rewriting from the top of a node to an appropriate auxiliary tree, and from the foot of the auxiliary tree back to the bottom of the node. The index stack keeps track of the nodes on which adjunction has occurred so that the recognition to the left and the right of the foot node will occur under identical assumption of derivation structure.</Paragraph> <Paragraph position="21"> The TAG grammar is encoded as a LIG with two nonterminal symbols t and b corresponding to the top and bottom components, respectively, of each node. The stack indices correspond to the individual nodes of the elementary trees of the TAG grammar. Thus, there are as many stack index symbols as there are nodes in the elementary trees of the grammar, and each such index (i.e., node) corresponds unambiguously to a single address in a single elementary tree. (In fact, the symbols can be thought of as pairs of an elementary tree identifier and an address within that tree, and our implementation encodes them in just that way.) The index at the top of the stack corresponds to the node being rewritten. Thus, a LIG nonterminal with stack t[η] corresponds to the top component of node η, and b[η1η2η3] corresponds to the bottom component of η3. The indices η1 and η2 capture the history of adjunctions that are pending completion of the tree in which η3 is a node. Figure 7 depicts the interpretation of a stack of indices.</Paragraph> <Paragraph position="22"> (Figure 7 caption: A stack of indices [η1η2η3] captures the adjunction history that led to the reaching of the node η3 in the parsing process. Parsing of an elementary tree α proceeded to node η1 in that tree, at which point adjunction of the tree containing η2 was pursued by the parser. When the node η2 was reached, the tree containing η3 was implicitly adjoined. Once this latter tree is completely parsed, the remainder of the tree containing η2 can be parsed from that point, and so on.)</Paragraph> <Paragraph position="23"> In summary, given a tree-adjoining grammar, the following LIG rules are generated:
Immediate domination dominating foot: For each auxiliary tree node 7 that dominates the foot node, with children 71,..., 7s,..., ~n, where 7\]s is the child that also dominates the foot node, include a production</Paragraph> <Paragraph position="25"/> <Paragraph position="27"> Start root ofadjunction: For each elementary tree node ~ on which the auxiliary tree fl with root node ~r can be adjoined, include the following production: t\[..,\] --* t\[..,,r\].</Paragraph> <Paragraph position="28"> Start foot ofadjunction: For each elementary tree node ~ on which the auxiliary tree fl with foot node ~//can be adjoined, include the following production: b\[..,,f\] ~ b\[..~/\].</Paragraph> <Paragraph position="29"> Start substitution: For each elementary tree node ~ marked for substitution on which the initial tree c~ with root node ?~r can be substituted, include the production t\[,\] --* t\[,r\].</Paragraph> <Paragraph position="30"> We will refer to productions generated by Rule i above as Type i productions. For example, Type 3 productions are of the form t\[..~/\] --* b\[..~\]. For further information concerning the compilation see Vijay-Shanker and Weir (1990). For present purposes, it is sufficient to note that the method directly embeds the standard notion of derivation in the rewriting process. To perform an adjunction, we move (by Rule 4) from the node adjoined at to the top of the root of the auxiliary tree. At the root, additional adjunctions might be performed. When returning from the foot of the auxiliary tree back to the node where adjunction occurred, rewriting continues at the bottom of the node (see Rule 5), not the top, so that no more adjunctions can be started at that node. Thus, the dependent nature of predicative adjunction is enforced because only a single adjunction can occur at any given node.</Paragraph> <Paragraph position="31"> In order to permit extended derivations, we must allow for multiple modifier tree adjunctions at a single node. There are two natural ways this might be accomplished, as depicted in Figure 8.</Paragraph> <Paragraph position="32"> 1. Modified start foot ofadjunction rule: Allow moving from the bottom of the foot of a modifier auxiliary tree to the top (rather than the bottom) of the node at which it adjoined (Figure 8b).</Paragraph> <Paragraph position="33"> 2. Modified start root of adjunction rule: Allow moving from the bottom (rather than the top) of a node to the top of the root of a modifier auxiliary tree (Figure 8c).</Paragraph> <Paragraph position="34"> As can be seen from the figures, both of these methods allow recursion at a node, unlike the original method depicted in Figure 8a. Thus multiple modifier trees are allowed to adjoin at a single node. Note that since predicative trees fall under the original rules, at most a single predicative tree can be adjoined at a node. The two Schematic structure of possible predicative and modifier adjunctions with top and bottom of each node separated.</Paragraph> <Paragraph position="35"> methods correspond exactly to the innermost- and outermost-predication methods discussed in Section 4.3. For the reasons described there, the latter is preferred. TM In summary, independent derivation structures can be allowed for modifier auxiliary trees by starting the adjunction process from the bottom, rather than the top of a node for those trees. 
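Anticipating the split of the start-root rule described next, the following sketch lists the productions generated at a single node under the outermost-predication variant just summarized. The function and its names are illustrative only; the production strings follow the paper's t[..]/b[..] notation.

# LIG productions governing adjunction at one node eta.  pred_trees and mod_trees are
# (root, foot) node pairs of the predicative and modifier auxiliary trees adjoinable there.
def productions_at_node(eta, pred_trees, mod_trees):
    rules = [f"t[..{eta}] -> b[..{eta}]"]                    # no (further) adjunction
    for root, foot in pred_trees:                            # predicative: enter from the top
        rules.append(f"t[..{eta}] -> t[..{eta}{root}]")      # start root of predicative adjunction
        rules.append(f"b[..{eta}{foot}] -> b[..{eta}]")      # start foot of adjunction
    for root, foot in mod_trees:                             # modifier: enter from the bottom
        rules.append(f"b[..{eta}] -> t[..{eta}{root}]")      # start root of modifier adjunction
        rules.append(f"b[..{eta}{foot}] -> b[..{eta}]")      # start foot of adjunction
    return rules

for rule in productions_at_node("eta", [("pi_r", "pi_f")], [("mu_r", "mu_f")]):
    print(rule)
# The modifier rules form a loop from b[..eta] back to b[..eta], so any number of
# modifier adjunctions (including zero) can occur there; the predicative rule uses the
# single move available from t[..eta], so at most one predicative tree adjoins, outermost.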
Thus, we split Type 4 LIG productions into two subtypes for predicative and modifier trees, respectively.</Paragraph> <Paragraph position="36"> 4a.</Paragraph> <Paragraph position="37"> 4b.</Paragraph> <Paragraph position="38"> Start root of predicative adjunction: For each elementary tree node 7/on which the predicative auxiliary tree fl with root node T\]F can be adjoined, include the following production: t\[..,\] ~ t\[..~p?~\].</Paragraph> <Paragraph position="39"> Start root of modifier adjunction: For each elementary tree node ~/on which the modifier auxiliary tree fl with root node ~/r can be adjoined, include the following production: b\[..~/\] ~ t\[.3l~lr \].</Paragraph> <Paragraph position="40"> Once this augmentation has been made, we no longer need to allow for adjunctions at the root nodes of modifier auxiliary trees, as repeated adjunction is now allowed for 14 The more general definition allowing predicative trees to occur anywhere within a sequence of modifier adjunctions would be achieved by adding both types of rules. Computational Linguistics Volume 20, Number 1 by the new rule 4b. Consequently, grammars should forbid adjunction of a modifier tree fll at the root of a modifier tree f12 except where fll is intended to modify /32 directly.</Paragraph> <Paragraph position="41"> This simple modification to the compilation process from TAG to LIG fully specifies the modified notion of derivation. Note that the extra criterion (5) noted in Section 3.4 is satisfied by this definition: modifier adjunctions are inherently repeatable and eliminable as the movement through the adjunction &quot;loop&quot; ends up at the same point that it begins. The recognition algorithms for TAG based on this compilation, however, must be adjusted to allow for the new rule types.</Paragraph> <Paragraph position="42"> This compilation makes possible a simple proof of the weak-generative equivalence of TAGs under the standard and extended derivations, is Call the set of languages generable by a TAG under the standard definition of derivation TALs and under the extended definition TALe. Clearly, TALs c TALe since the standard definition can be mimicked by making all auxiliary trees predicative. The compilation above provides the inclusion TALe C LIL, where LIL is the set of linear indexed languages. The final inclusion LIL C_ TALs has been shown indirectly by Vijay-Shanker (1987) using embedded push-down automata and modified head grammars as intermediaries. From these inclusions, we can conclude that TALs = TALe.</Paragraph> </Section> </Section> <Section position="7" start_page="107" end_page="116" type="metho"> <SectionTitle> 6. Recognition and Parsing </SectionTitle> <Paragraph position="0"> A recognition algorithm for TAGs can be constructed based on the above translation into corresponding LIGs as specified by Rules 1 through 6 in the previous section. The algorithm is not a full recognition algorithm for LIGs, but rather, is tuned for exactly the types of rules generated as output of this compilation process. In this section, we present the recognition algorithm and modify it to work with the extended derivation compilation.</Paragraph> <Paragraph position="1"> We will use the following notations in this and later sections. The symbol P will serve as a variable over the two LIG grammar nonterminals t and b. The substring of the string wl ... Wn being parsed between indices i and j will be notated as wi+t &quot;. wj, which we take to be the empty string when i is greater than or equal to j. 
We will use Γ, Δ, and Θ for sequences containing terminals and LIG nonterminals with their stack specifications. For instance, Γ might be t[η1] t[..η2] t[η3].</Paragraph> <Paragraph position="2"> The parsing algorithm can be seen as a tabular parsing method based on deduction of items, as in Earley deduction (Pereira and Warren 1983). We will so describe it, by presenting inference rules over items of the form</Paragraph> <Paragraph position="3"> (P[η] → Γ • Δ, i, j, k, l)</Paragraph> <Paragraph position="4"> Such items play the role of the items of Earley's algorithm. Unlike the items of Earley's algorithm, however, an item of this form does not embed a grammar rule proper; that is, P[η] → ΓΔ is not necessarily a rule of the grammar. Rather, it is what we will call a reduced rule; for reasons described below, the nonterminals in Γ and Δ as well as the nonterminal P[η] record only the top element of each stack of indices. We will use a distinguished notation for the unreduced form of the rule whose reduced form is P[η] → ΓΔ. For instance, the rule specified by the notation t[η1] → t[η2] might be the rule t[..η1] → t[..η1η2]. The reader can easily verify that the TAG to LIG compilation is such that there is a one-to-one correspondence between the generated rules and their reduced form. Consequently, this notation is well defined.</Paragraph> <Paragraph position="6"> 15 We are grateful to K. Vijay-Shanker for bringing this point to our attention.</Paragraph> <Paragraph position="7"> The dot in the items is analogous to that found in Earley and LR items as well. It serves as a marker for how far recognition has proceeded in identifying the subconstituents for this rule. The indices i, j, k, and l specify the portion of the string w1 ... wn covered by the recognition of the item. The substring between i and l (i.e., wi+1 ... wl) has been recognized, perhaps with a region between j and k where the foot of the tree below the node η has been recognized. (If the foot node is not dominated by Γ, we take the values of j and k to be the dummy value '-'.)</Paragraph> <Section position="1" start_page="108" end_page="110" type="sub_section"> <SectionTitle> 6.1 The Inference Rules </SectionTitle> <Paragraph position="0"> In this section, we specify several inference rules for parsing a LIG generated from a TAG, which we recall in this section. One explanatory comment is in order, however, before the rules are presented. The rules of a LIG associate with each constituent a nonterminal and a stack of indices. It seems natural for a parsing algorithm to maintain this association by building items that specify for each constituent the full information of nonterminal and index stack. However, this would necessitate storing an unbounded amount of information for each potential constituent, resulting in a parsing algorithm that is potentially quite inefficient when nondeterminism arises during the parsing process, and perhaps noneffective if the grammar is infinitely ambiguous. Instead, the parse items manipulated by the inference rules that we present do not keep all of this information for each constituent. Rather, the items keep only the single top stack element for each constituent (in addition to the nonterminal symbol). This drastically decreases the number of possible items and accounts for the polynomial character of the resultant algorithm.16 Side conditions make up for some of the loss of information, thereby maintaining correctness.
For instance, the Type 4 Completor rule specifies a relation between ~ and ~/f that takes the place of popping an element off of the stack associated with ~. However, the side conditions are strictly weaker than maintaining full stack information. Consequently, the algorithm, though correct, does not maintain the valid prefix property. See Schabes (1991) for further discussion and alternatives.</Paragraph> <Paragraph position="1"> Scanning and prediction work much as in Earley's original algorithm.</Paragraph> <Paragraph position="3"> Note that the only rules that need be considered are those where the parent is a bottom node, as terminal symbols occur on the right-hand side only of Type 1 or 2 productions. Otherwise, the rule is exactly as that for Earley's algorithm except that the extra foot indices (j and k) are carried along.</Paragraph> <Paragraph position="5"> This rule serves to form predictions for any type production in the grammar, as the variables P and P' range over the values t and b. In the 16 Vijay-Shanker and Weir (1990) first proposed the recording of only the top stack element in order to achieve efficient parsing. The algorithm they presented is a bottom-up general LIG parsing algorithm. Schabes (1991) sketches a proof of an O(n 6) bound for an Earley-style algorithm for TAG parsing that is more closely related to the algorithm proposed here.</Paragraph> <Paragraph position="6"> Computational Linguistics Volume 20, Number 1 predicted item, the foot is not dominated by the (empty) recognized input, so that the dummy value '-' is used for the foot indices. Note that the predicted item records the reduced form of an unreduced rule P'\[~/'\] --* (9 of the grammar.</Paragraph> <Paragraph position="7"> Completion of items (moving of the dot from left to right over a nonterminal) breaks up into several cases, depending on which production type is being completed. This is because the addition of the extra indices and the separate interpretations for top and bottom productions require differing index manipulations to be performed. We will list the various steps, organized by what type of production they participate in the completion of. .</Paragraph> <Paragraph position="8"> Productions that specify immediate domination (from Rules I and 2) are completed whenever the top of the child node is fully recognized.</Paragraph> <Paragraph position="10"> Here, t\[7/\] has been fully recognized as the substring between i and I. The item expecting t\[~\] can be completed. One of the two antecedent items might also dominate the foot node of the tree to which ~/and 71 belong, and would therefore have indices for the foot substring. The operations j U j' and k U k' are used to specify whichever of j or j' (and respectively for k or k') contain foot substring indices. The formal definition of U is as follows:</Paragraph> <Paragraph position="12"> This rule is used to complete a prediction that no (predicative) adjunction occurs at node ~/. Once the part of the string dominated by b\[~/\] has been found, as evidenced by the second antecedent item, the prediction of no adjunction can be completed.</Paragraph> <Paragraph position="14"> Yves Schabes and Stuart M. Shieber Tree-Adjoining Derivation Here, an adjunction has been predicted at 7, and the adjoined derived tree (between t\[~\] and b\[~\]) and the derived material that r\] itself dominates (below b\[r\]\]) have both been completed. Thus t\[~\] is completely recognized. 
<Paragraph position="16"> When an adjunction has been performed and recognition has proceeded up to the foot node ηf, it is necessary to recognize all of the material under the foot node. When that is done, the foot node prediction can be completed. Note that it must be possible to have adjoined the auxiliary tree at node η, as specified in the production in the side condition.</Paragraph>
<Paragraph position="18"> Completion of the material below the root node ηr of an initial tree allows for the completion of the node at which substitution occurred.</Paragraph>
<Paragraph position="19"> The recognition process for a string w_1 ··· w_n starts with some items that serve as axioms for these inference rules. For each rule t[ηs] → Γ, where ηs is the root node of an initial tree and is labeled with the start nonterminal, the item (t[ηs] → • Γ, 0, -, -, 0) is an axiom. If from these axioms an item of the form (t[ηs] → Γ •, 0, -, -, n) can be proved according to the rules of inference above, the string is accepted; otherwise it is rejected.</Paragraph>
<Paragraph position="20"> Alternatively, the axioms can be stated as if there were extra rules S → t[ηs] for each ηs a start-nonterminal-labeled root node of an initial tree. In this case, the axioms are items of the form (S → • t[ηs], 0, -, -, 0), and the string is accepted upon proving (S → t[ηs] •, 0, -, -, n). In this case, an extra prediction and completion rule is needed just for these rules, since the normal rules do not allow S on the left-hand side. This point is taken up further in Section 6.4.</Paragraph>
<Paragraph position="21"> Generation of items can be cached in the standard way for inference-based parsing algorithms (Shieber 1992); this leads to a tabular or chart-based parsing algorithm.</Paragraph>
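One way to picture the tabular control regime that this caching yields is a generic agenda-driven deduction loop. The sketch below is ours, not the paper's implementation; the item type, the encoding of rules as functions, and all names are placeholders. Each rule maps a newly added item and the current chart to the set of items it licenses.

```python
def deduce(axioms, rules, is_goal):
    """Generic agenda-based deduction over items.

    `axioms` is an iterable of initial items, `rules` a list of functions
    mapping (trigger_item, chart) to an iterable of consequent items, and
    `is_goal` a predicate on items.  Returns True if a goal item is derivable."""
    chart = set()
    agenda = list(axioms)
    while agenda:
        item = agenda.pop()
        if item in chart:                  # duplicate: already derived
            continue
        chart.add(item)
        for rule in rules:
            for consequence in rule(item, chart):
                if consequence not in chart:
                    agenda.append(consequence)
    return any(is_goal(item) for item in chart)
```

Caching items in the chart in this way is what gives the tabular behavior; the particular rule set then determines the complexity bound discussed in Section 6.5.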
</Section>
<Section position="2" start_page="110" end_page="110" type="sub_section">
<SectionTitle> 6.2 The Algorithm Invariant </SectionTitle>
<Paragraph position="0"> The algorithm maintains an invariant that holds of all items added to the chart. We will describe the invariant using some additional notational conventions. We write Γ̄ for the unreduced form of a reduced right-hand side Γ; thus P[η] → Γ̄ is the LIG production in the grammar whose reduced form is P[η] → Γ. The notation Γ̄[γ], where γ is a sequence of stack symbols (i.e., nodes), specifies the sequence Γ̄ with γ replacing the occurrence of '..' in the stack specifications. For example, if Γ̄ is the sequence t[η1] t[..η2] t[η3], then Γ̄[γ] = t[η1] t[γη2] t[η3]. A single LIG derivation step will be notated with ⇒ and its reflexive, transitive closure with ⇒*.</Paragraph>
<Paragraph position="1"> The invariant specifies that (P[η] → Γ • Δ, i, j, k, l) is in the chart only if 17
1. If node η dominates the foot node ηf of the tree to which it belongs, then there exists a string of stack symbols (i.e., nodes) γ such that
(a) P[η] → Γ̄Δ̄ is a LIG rule in the grammar, where Γ̄Δ̄ is the unreduced form of ΓΔ;
(b) Γ̄[γ] ⇒* w_{i+1} ··· w_j b[γηf] w_{k+1} ··· w_l;
(c) b[γηf] ⇒* w_{j+1} ··· w_k.
2. If node η does not dominate the foot node ηf of the tree to which it belongs, or there is no foot node in the tree, then
(a) P[η] → Γ̄Δ̄ is a LIG rule in the grammar, where Γ̄Δ̄ is the unreduced form of ΓΔ;
(b) Γ̄ ⇒* w_{i+1} ··· w_l;
(c) j and k are not bound.</Paragraph>
<Paragraph position="3"> 17 The invariant is not stated as a biconditional because this would require strengthening of the antecedent condition. The natural strengthening, following the standard for Earley's algorithm, would be to add a requirement that the item be consistent with the left context, as
(d) t[ηs] ⇒* w_1 ··· w_i P[γη] ···
but this is too strong. This condition implies that the algorithm possesses the valid prefix property, which it does not. The exact statement of the invariant condition that would allow for an exact specification of the item semantics is the topic of ongoing research. However, the current specification is sufficient for proving soundness of the algorithm.</Paragraph>
<Paragraph position="4"> According to this invariant, for a node ηs that is the root of an initial tree, the item (t[ηs] → Γ •, 0, -, -, n) is in the chart only if t[ηs] ⇒ Γ̄ ⇒* w_1 ··· w_n. Thus, soundness of the algorithm as a recognizer follows.</Paragraph>
</Section>
<Section position="3" start_page="110" end_page="112" type="sub_section">
<SectionTitle> 6.3 Modifications for Extended Derivations </SectionTitle>
<Paragraph position="0"> Extending the algorithm to allow for the new types of production (specifically, those derived by Rule 4b) requires adding a completion rule for Type 4b productions; the required rule is included in the final rule set of Figure 9.</Paragraph>
<Paragraph position="2"> In addition to being able to complete Type 4b items, we must also be able to complete other items using completed Type 4b items. This is an issue in particular for completor rules that might move their dot over a b[η] constituent, in particular the Type 3 and 5 Completors. However, these rules have been stated so that the antecedent item with right-hand side b[η] already matches Type 4b items. Furthermore, the general statement, including the index manipulation, is still appropriate in the context of Type 4b productions. Thus, no further changes to the recognition inference rules are needed for this purpose.</Paragraph>
<Paragraph position="4"> However, a bit of care must be taken in the interpretation of the Type 1/2 Completor. Type 4b items that require completion bear a superficial resemblance to Type 1 and 2 items, in that both have a constituent of the form t[_] after the dot. In Type 4b items, the constituent is t[ηr]; in Type 4a items, t[η1]. But it is crucial that the Type 1/2 Completor not be used to complete Type 4b items. A simple distinguishing characteristic is that in Type 1 and 2 items to be completed, the node η after the dot is never a root node (as it is immediately dominated by the left-hand side node), whereas in Type 4b items, the node ηr after the dot is always a root node (of a modifier tree). Simple side conditions can distinguish the cases.</Paragraph>
<Paragraph position="5"> Figure 9 contains the final versions of the inference rules for recognition of LIGs corresponding to extended TAG derivations.</Paragraph>
</Section>
<Section position="4" start_page="112" end_page="114" type="sub_section">
<SectionTitle> 6.4 Maintaining Derivation Structures </SectionTitle>
<Paragraph position="0"> One of the intended applications of extended derivation TAG parsing is the parsing of synchronous TAGs.
Especially important in this application is the ability to generate the derivation trees while parsing proceeds.</Paragraph>
<Paragraph position="1"> A synchronous TAG is composed of two base TAGs (which we will call the source TAG and the target TAG) whose elementary trees have been paired. A synchronous TAG whose source TAG is a grammar for a fragment of English and whose target TAG is a grammar for a logical form language may be used to generate logical forms for each sentence of English that the source grammar admits (Shieber and Schabes 1990). Similarly, with source and target swapped, the synchronized grammar may be used to generate English sentences corresponding to logical forms (Shieber and Schabes 1991). If the source and target grammars specify fragments of natural languages, an automatic translation system is specified (Abeillé, Schabes, and Joshi 1990).</Paragraph>
<Paragraph position="2"> Abstractly viewed, the processing of a synchronous grammar proceeds by parsing an input string according to the source grammar, thereby generating a derivation tree for the string; mapping the derivation tree into a derivation tree for the target grammar; and generating a derived tree (hence, a derived string) according to the target grammar.</Paragraph>
<Paragraph position="3"> One frequent worry about synchronous TAGs as used in their semantic interpretation mode is whether it is possible to perform incremental interpretation. The abstract view of processing just presented seems to require that a full derivation tree be developed before interpretation into the logical form language can proceed. Incremental interpretation, on the other hand, would allow partial interpretation results to guide the parsing process on-line, thereby decreasing the nondeterminism in the parsing process. Whether incremental interpretation is possible depends precisely on the extent to which the three abstract phases of synchronous TAG processing can in fact be interleaved. In previous work we left this issue open. In this section, we allay these worries by showing how the extended TAG parser just presented can build derivation trees incrementally as parsing proceeds. Once this has been demonstrated, it should be obvious that these derivation trees could be transferred to target derivation trees during the parsing process, with generation from them proceeding immediately. Thus, incremental interpretation is demonstrated to be possible in the synchronous TAG framework. In fact, the technique presented in this section has allowed for the first implementation of synchronous TAG processing, by Onnig Dombalagian. This implementation was directly based on the inference-based TAG parser mentioned in Section 6.5 and presented in full elsewhere (Schabes and Shieber 1992).</Paragraph>
<Paragraph position="4"> We associate with each item a set of operations that have been implicitly carried out by the parser in recognizing the substring covered by the item. An operation can be characterized by a derivation tree and a tree address at which the derivation tree is to be placed; it corresponds roughly to a branch of a derivation tree. Prediction items have the empty set of operations. Type 4 and 6 completion steps build new elements of the sets, as they correspond to actually carrying out adjunction and substitution operations, respectively. Other completion steps merely pool the operations from their constituent parts.</Paragraph>
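These operation sets can be pictured with a couple of small structures. The sketch below is ours (the names DerivTree, Op, and pool are illustrative, not the paper's), mirroring the constructors introduced in the next paragraph.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

@dataclass(frozen=True)
class DerivTree:
    """A derivation tree: an elementary tree together with the operations on it."""
    tree: str                                   # name of the elementary tree
    ops: FrozenSet["Op"] = frozenset()          # adjunctions/substitutions performed on it

@dataclass(frozen=True)
class Op:
    """One operation: a derivation tree to be placed at a given tree address."""
    address: Tuple[int, ...]                    # Gorn address in the host elementary tree
    child: "DerivTree"

def pool(*op_sets: FrozenSet[Op]) -> FrozenSet[Op]:
    """Completion steps other than Types 4 and 6 simply take the union of the
    operation sets of their antecedent items."""
    pooled: FrozenSet[Op] = frozenset()
    for ops in op_sets:
        pooled = pooled | ops
    return pooled

# A Type 4 (adjunction) completion would instead add a new operation, e.g.:
new_op = Op(address=(1,), child=DerivTree("beta"))   # "beta": some auxiliary tree
```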
<Paragraph position="7"> In describing the building of derivation trees, we will use normal set notation for the sets of derivation trees. We will assume that for each node η there are functions tree(η) and addr(η) that specify, respectively, the elementary tree that η occurs in and its address in that tree. Finally, we will use a constructor function for derivation trees, deriv(γ, S), where γ specifies an elementary tree and S specifies a set of operations on it. An operation is built with op(t, D), where t is a tree address and D is a derivation tree to be operated at that address.</Paragraph>
<Paragraph position="8"> Figure 10 lists the previously presented recognition rules augmented to build derivation structures as the final component of each item. The axioms for this inference system are items of the form (S → • t[ηs], 0, -, -, 0, {}), where we assume, as in Section 6.1, that there are extra rules S → t[ηs] for each ηs a start-nonterminal-labeled root node of an initial tree. We require an extra prediction rule and completion rule to handle this new type of rule. The predictor rule is the obvious analog of the existing Predictor; in fact, the existing predictor rule could easily have been generalized to handle this case.</Paragraph>
<Paragraph position="11"> The completor for these start rules is the obvious analog of a Type 6 Completor, except in the handling of the derivation. It delivers, instead of a set of derivation operations, a single derivation tree.</Paragraph>
<Paragraph position="13"> The string is accepted upon proving (S → t[ηs] •, 0, -, -, n, D), where D is the derivation developed during the parse.</Paragraph>
</Section>
<Section position="5" start_page="114" end_page="116" type="sub_section">
<SectionTitle> 6.5 Complexity Considerations </SectionTitle>
<Paragraph position="0"> The inference system of Section 6.3 essentially specifies a parsing algorithm with complexity O(n^6) in the length of the string. Adding explicit derivation structures to the items, as in the inference system of the previous section, eliminates the polynomial character of the algorithm, in that there may be an unbounded number of derivations corresponding to any given item of the original sort. Even for finitely ambiguous grammars, the number of derivations may be exponential. Nonetheless, this fact does not vitiate the usefulness of the second algorithm, which maintains derivations explicitly. The point of this augmentation is to allow for incremental interpretation (interleaved processing of a post-syntactic sort) so as to guide the parsing process in making choices on-line. By using the extra derivation information, the parser should be able to eliminate certain nondeterministic paths of computation; otherwise, there is no reason to do the interpretation incrementally. But this determinization of choice presumably decreases the complexity. Thus, the extra information is designed for use in cases where the full search space is not intended to be explored.</Paragraph>
<Paragraph position="1"> Of course, a polynomial shared-forest representation of the exponential number of derivations could have been maintained (by maintaining back pointers among the items in the standard fashion).
For performing incremental interpretation for the purpose of determinization of parsing, however, the non-shared representation is sufficient, and preferable on grounds of ease of implementation and expository convenience.</Paragraph>
<Paragraph position="2"> As a proof of concept, the parsing algorithm just described was implemented in Prolog on top of a simple, general-purpose, agenda-based inference engine. Encodings of explicit inference rules are essentially interpreted by the inference engine. The Prolog database is used as the chart; items not already subsumed by a previously generated item are asserted to the database as the parser runs. An agenda of potential new items is maintained; items are added to the agenda as inference rules are triggered by items added to the chart. Because the inference rules are stated explicitly, the relation between the abstract inference rules described in this paper and the implementation is extremely transparent. As a meta-interpreter, the prototype is not particularly efficient. (In particular, the implementation does not achieve the theoretical O(n^6) bound on complexity, because of a lack of appropriate indexing.) Code for the prototype implementation is available for distribution electronically from the authors.</Paragraph>
</Section>
</Section>
<Section position="8" start_page="116" end_page="116" type="metho">
<SectionTitle> 7. Conclusion </SectionTitle>
<Paragraph position="0"> The precise formulation of derivation for tree-adjoining grammars has important ramifications for a wide variety of uses of the formalism, from syntactic analysis to semantic interpretation and statistical language modeling. We have argued that the definition of tree-adjoining derivation must be reformulated in order to take greatest advantage of the decoupling of derivation tree and derived tree, by manifesting the proper linguistic dependencies in derivations. The particular proposal is both precisely characterizable, through a definition of TAG derivations as equivalence classes of ordered derivation trees, and computationally operational, by virtue of a compilation to linear indexed grammars together with an efficient algorithm for recognition and parsing according to the compiled grammar.</Paragraph>
</Section>
</Paper>