<?xml version="1.0" standalone="yes"?>
<Paper uid="E87-1034">
  <Title>DISCONTINUOUS CONSTITUENTS IN TREES, RULES, AND PARSING</Title>
  <Section position="3" start_page="204" end_page="206" type="intro">
    <SectionTitle>
(11) S
</SectionTitle>
    <Paragraph position="0"> [terminal string of tree (11): "which children did Ann get gifts from"] The technique of using phrases that miss some constituent cannot be used for at least some of the examples (3)-(8), such as (5) and (7). In both these sentences the discontinuous NP contains a full-fledged NP, which cannot sensibly be said to "miss" the relative clause or prepositional phrase that occurs later in the sentence.</Paragraph>
    <Paragraph position="1"> Whatever techniques may be invented to deal with such cases, it seems obvious that a grammar which recognizes and describes discontinuities in natural language sentences is a more suitable basis for semantic interpretation than one that squeezes constituent structures into a form in which they cannot be represented.</Paragraph>
    <Paragraph position="2"> It therefore seems worth investigating the viability of tree-like structures with discontinuities, like (9) and (11).</Paragraph>
    <Paragraph position="3"> 2. Trees with discontinuities
If we want to represent the situation that a phrase P has constituents A and C, while there is an intervening phrase B, we must allow the node corresponding to P to dominate the A and C nodes without dominating the B node, even though this node is located between the A and C nodes:
(12) [tree: P dominates A and C; B lies between A and C and is not dominated by P]
One consequence of allowing such discontinuities is that our structures get crossing branches, if we still want all nodes to be connected to the top node; (10) and (11) illustrate this. In what respects exactly do these structures differ from ordinary trees? McCawley (1982) has tried to answer this question, suggesting a formal definition for trees with discontinuities by amending the definition of a tree.</Paragraph>
    <Paragraph position="4"> A tree is often defined as a set of elements, called "nodes", on which two relations are defined, immediate dominance (D) and linear precedence (&lt;), which are required to have certain properties to the effect that a tree has exactly one root node, which dominates every other node (immediately or indirectly); that every node other than the root has exactly one "mother" node, etc. (see e.g. Wall, 1972).</Paragraph>
    <Paragraph position="5"> Given the relations of immediate dominance and linear precedence, dominance is defined as the reflexive and transitive closure D' of D, and adjacency as linear precedence without intervening nodes.</Paragraph>
    <Paragraph position="6"> A node in a tree is called terminal if it does not dominate any other node; the terminal nodes in a tree are totally ordered by the &lt; relation. For nonterminal nodes the precedence relation satisfies the requirement that x &lt; y if and only if every node dominated by x precedes every node dominated by y.</Paragraph>
    <Paragraph position="7"> Formally: (13) for any two nodes x and y in the node set of a tree, x &lt; y if and only if for all nodes u and v, if x dominates u and y dominates v, then u &lt; v.</Paragraph>
    <Paragraph position="8"> Part of the definition of a tree is also the stipulation that any two nodes either dominate or precede one another: (14) for any two nodes x and y in the node set of a tree, either x D' y, or y D' x, or x &lt; y, or y &lt; x.</Paragraph>
    <Paragraph position="9"> This stipulation has the effect of excluding discontinuities in a tree. For suppose a node x dominated nodes y and z without having a dominance relation with a node w, where y &lt; w &lt; z. By (14), either x &lt; w or w &lt; x. But x dominates a node to the right of w, so by (13) x does not precede w; and w is to the right of a node dominated by x, so w does not precede x either.</Paragraph>
    <Paragraph position="10"> McCawley's definition of trees with discontinuities comes down to dropping the condition that any two nodes should either dominate one another or have a left-right relation. Instead, he proposes the weaker condition that a node has no precedence relation to any node that it dominates: (15) for any two nodes x and y in the node set of a tree, if x D' y then neither x &lt; y nor y &lt; x.</Paragraph>
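    <Paragraph> The difference between clauses (14) and (15) can be checked mechanically on structure (12). The following is a minimal sketch, not part of the original proposal: the root node R, the dictionary encoding, and all function names are assumptions made for illustration, and clause (13) is read over terminals only (x precedes y when all terminals of x come before all terminals of y).

```python
from itertools import product

# Hypothetical encoding of structure (12); the root R is an added assumption
# so that every node is connected to a single top node.
DAUGHTERS = {"R": ["P", "B"], "P": ["A", "C"]}
TERMINAL_ORDER = ["A", "B", "C"]          # left-to-right order of the terminals
NODES = ["R", "P", "A", "B", "C"]


def dominated(x):
    """All nodes y with x D' y (reflexive and transitive closure of D)."""
    result = {x}
    for daughter in DAUGHTERS.get(x, []):
        result |= dominated(daughter)
    return result


def positions(x):
    """String positions of the terminals dominated by x."""
    return [TERMINAL_ORDER.index(t) for t in dominated(x) if t in TERMINAL_ORDER]


def precedes(x, y):
    """Clause (13), read over terminals: all of x's terminals come before all of y's."""
    return min(positions(y)) > max(positions(x))


def related(x, y):
    """True if x and y stand in a dominance or precedence relation."""
    return (y in dominated(x) or x in dominated(y)
            or precedes(x, y) or precedes(y, x))


def satisfies_14():
    """Clause (14): any two nodes dominate or precede one another."""
    return all(related(x, y) for x, y in product(NODES, NODES))


def satisfies_15():
    """Clause (15): no node precedes, or is preceded by, a node it dominates."""
    return not any(precedes(x, y) or precedes(y, x)
                   for x in NODES for y in dominated(x))


print(satisfies_14())   # False: P and B neither dominate nor precede one another
print(satisfies_15())   # True
```

Under these assumptions the discontinuous structure fails (14) exactly at the pair P and B, while the weaker condition (15) is met.</Paragraph>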
    <Paragraph position="11"> We shall call a node u that is situated between daughters of a node x without being dominated by x an internal context node of x.</Paragraph>
    <Paragraph position="12"> McCawley's definition of trees with discontinuities is inaccurate in several respects; however, his general idea is certainly correct: trees with discontinuities can be defined essentially by relaxing condition (14) in the definition of trees.</Paragraph>
    <Paragraph position="13"> However, this is only the beginning of what needs to be done. The next question is how discontinuous trees can be produced by phrase-structure rules. This question, which is not addressed by McCawley, is far from trivial and turns out to have interesting consequences for the notion of adjacency in discontinuous trees.</Paragraph>
    <Paragraph position="14"> 3. Adjacency in phrase-structure rules for discontinuous constituents
A phrase-structure rule rewrites a constituent into a sequence of pairwise adjacent constituents. This means that we need a notion of adjacency in discontinuous trees, for which the obvious definition, given the &lt; relation, would seem to be:
(16) two nodes x and y in the node set of a tree are adjacent if and only if x &lt; y and there is no z such that x &lt; z &lt; y.</Paragraph>
    <Paragraph position="15"> We shall write "x + y" to indicate that x and y are adjacent (or "neighbours"). A moment's reflection shows that this notion of adjacency unfortunately does not help us in formulating rules that could do anything with internal context constituents. The following example illustrates this. Suppose we want to generate the discontinuous tree structure:
(17) [tree: VP over "Wake your friend up", with V dominating nodes on either side of the NP]
To generate the top node, we need a rule combining the V and the NP, like:
(18) VP --&gt; V + NP
Since the V dominates nodes on either side of the NP, however, there is no left-right order between the NP and V nodes, let alone a neighbour relation. For the same reason there would be no left-right relation between overlapping discontinuous constituents, as in (19). These deficiencies can be remedied by replacing clause (14) in the definition of a tree by the more general clause (20).
(19) [tree: VP and NP over "Wake the man up who lives next door."]</Paragraph>
    <Paragraph position="16"> (20) A nonterminal node x in a tree is to the left of a node y in the tree if and only if x's leftmost daughter is left of y's leftmost daughter.</Paragraph>
    <Paragraph position="17"> (We refrain here from a formal definition of "leftmost daughter" node, which is intuitively obvious.) Note that (20) is indeed a generalization of the usual notion of precedence in trees, which could also be defined by (20). The recursion in (20) comes to an end since the terminal nodes are required to be totally ordered.</Paragraph>
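    <Paragraph> The effect of clause (20) on tree (17) can be illustrated with a small sketch. Everything in it is an assumption for illustration: the encoding of the tree, the node and word labels, and the decision to unfold the recursion in (20) down to a comparison of leftmost terminals. It contrasts the traditional precedence relation with the revised one and then applies definition (16) to the pair V, NP required by rule (18).

```python
# Hypothetical encoding of tree (17): VP over a discontinuous V ("Wake ... up")
# and an NP ("your friend"). Node names and word labels are assumptions.
DAUGHTERS = {"VP": ["V", "NP"], "V": ["Wake", "up"], "NP": ["your", "friend"]}
TERMINAL_ORDER = ["Wake", "your", "friend", "up"]
NODES = ["VP", "V", "NP"] + TERMINAL_ORDER


def terminals(x):
    """String positions of all terminals dominated by x (x itself if terminal)."""
    kids = DAUGHTERS.get(x)
    if not kids:
        return [TERMINAL_ORDER.index(x)]
    return [p for kid in kids for p in terminals(kid)]


def yield_precedes(x, y):
    """Traditional precedence: every terminal of x comes before every terminal of y."""
    return min(terminals(y)) > max(terminals(x))


def left_of(x, y):
    """Clause (20), unfolded down to the terminals: compare leftmost terminals."""
    return min(terminals(y)) > min(terminals(x))


def adjacent(x, y):
    """Definition (16) evaluated with the precedence relation of clause (20)."""
    return left_of(x, y) and not any(
        left_of(x, z) and left_of(z, y) for z in NODES)


print(yield_precedes("V", "NP"), yield_precedes("NP", "V"))  # False False
print(left_of("V", "NP"), adjacent("V", "NP"))               # True True
```

With the traditional relation V and NP are unordered, so rule (18) cannot apply; with the leftmost-daughter relation V is to the left of NP and the two are neighbours.</Paragraph>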
    <Paragraph position="18"> It should also be noted that (20) is not consistent with clause (14): by (20), we do get a precedence relation between a node and its daughter nodes (except the leftmost one) and internal context nodes. This is not quite unreasonable. In (21), for example, we do want that X &lt; Y, and</Paragraph>
    <Paragraph position="20"> since Y &lt; C, that X &lt; C, but not that X &lt; B. We therefore adapt clause (14) to the effect that a mother node only precedes internal context nodes and daughter nodes which have internal context nodes to their left. Formally: (22) For any nodes x and z in the node set N of a tree, if x D z and there are no nodes u,v in N such that x D u, not x D v, and u &lt; v &lt; z, then neither x &lt; z nor z &lt; x.</Paragraph>
    <Paragraph position="21"> With the modifications (16) and (22), we have a consistent definition of "discontinuous trees" which allows us to write phrase-structure rules containing discontinuous constituents as follows:
(23) X --&gt; A + B + [Y] + C
where the square brackets indicate that Y is not dominated by the X node, but is only internal context. The "+" symbol represents the notion of adjacency, defined as before but now on the basis of the revised precedence relation "&lt;":
(24) Two nodes x and y in a tree are adjacent if and only if x &lt; y and there is no node z in the tree such that x &lt; z &lt; y.</Paragraph>
    <Paragraph position="22"> Upon closer inspection, the neighbour relation defined in this way is unsatisfactory, however, as the following example illustrates.</Paragraph>
    <Paragraph position="23"> Suppose we want to generate the following (part of a) tree structure: However, this rule would be of no help here, since P, Q and E do not form a sequence of adjacency pairs, as Q and E are not adjacent according to our definition. Rather, the correct rule for generating (25) would be (27):
(27) S --&gt; P + Q + [C] + [D] + E
This is ugly, and even uglier rules are required in more complex trees with discontinuities at different levels.</Paragraph>
    <Paragraph position="24"> Moreover, there seems to be something fundamentally wrong, since the C and D nodes are on the one hand internal context for the S node, according to rule (27), while on the other hand they are also dominated by S. That is, these nodes are both "real" constituents of S and internal context of S.</Paragraph>
    <Paragraph position="25"> To remedy this, we introduce a new concept of adjacency sequence, which generalizes the traditional notion of a sequence of adjacency pairs. The definition goes as follows:
(28) A sequence (a, b, ..., n) is an (n-place) adjacency sequence if and only if:
(i) every pair (i, j) in the sequence is either an adjacency pair or is connected by a sequence of adjacency pairs of which all members are a constituent of some element in the subsequence (a, b, ..., i);
(ii) the elements in the sequence do not share any constituents.
For example, in the structure (25) the triple (P, Q, E) is an adjacency sequence since (P, Q) is an adjacency pair and Q and E are connected by the sequence of adjacency pairs Q-C-D-E, with C and D constituents of P and Q, respectively. Another example of an adjacency sequence in (25) is the triple (P, B, D). The triple (P, B, C), on the other hand, is not an adjacency sequence, since P and C share the constituent C.</Paragraph>
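    <Paragraph> Definition (28) can be sketched as a small checking procedure. Tree (25) itself does not survive in this text, so the sketch assumes a structure chosen only to be consistent with the examples above: S over P, Q and E, P over A and C, Q over B and D, with terminal string A B C D E. The function names, the reading of "pair (i, j)" as a pair of consecutive elements, and the comparison of leftmost terminals as the precedence relation are likewise assumptions, and the refinement in (22) is left out.

```python
# Hypothetical reconstruction of tree (25), chosen only to be consistent with the
# examples in the text: S over P, Q, E; P over A, C; Q over B, D; string A B C D E.
DAUGHTERS = {"S": ["P", "Q", "E"], "P": ["A", "C"], "Q": ["B", "D"]}
TERMINAL_ORDER = ["A", "B", "C", "D", "E"]
NODES = ["S", "P", "Q"] + TERMINAL_ORDER


def constituents(x):
    """All nodes properly dominated by x."""
    found = set()
    for kid in DAUGHTERS.get(x, []):
        found |= {kid} | constituents(kid)
    return found


def leftmost(x):
    """Position of the leftmost terminal of x (x itself if x is terminal)."""
    kids = DAUGHTERS.get(x)
    return min(leftmost(k) for k in kids) if kids else TERMINAL_ORDER.index(x)


def left_of(x, y):
    """Precedence by leftmost terminal (simplified reading of the revised relation)."""
    return leftmost(y) > leftmost(x)


def adjacent(x, y):
    """Adjacency pair in the sense of (24)."""
    return left_of(x, y) and not any(
        left_of(x, z) and left_of(z, y) for z in NODES)


def connected(x, y, allowed):
    """x reaches y via adjacency pairs whose intermediate members are in `allowed`."""
    return adjacent(x, y) or any(
        adjacent(x, z) and connected(z, y, allowed) for z in allowed)


def is_adjacency_sequence(seq):
    """Definition (28): clause (i) on consecutive pairs, clause (ii) on sharing."""
    parts = [constituents(x) | {x} for x in seq]
    for i in range(len(parts)):
        for j in range(i + 1, len(parts)):
            if not parts[i].isdisjoint(parts[j]):
                return False          # clause (ii) violated
    for i in range(len(seq) - 1):
        allowed = set().union(*(constituents(x) for x in seq[:i + 1]))
        if not connected(seq[i], seq[i + 1], allowed):
            return False              # clause (i) violated
    return True


print(is_adjacency_sequence(["P", "Q", "E"]))  # True
print(is_adjacency_sequence(["P", "B", "D"]))  # True
print(is_adjacency_sequence(["P", "B", "C"]))  # False: P and C share C
```

On the assumed tree the sketch reproduces the three judgements given in the text; a pair such as (P, E) is rejected because the gap between P and E cannot be bridged by constituents of P alone.</Paragraph>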
    <Paragraph position="26"> We use this notion of adjacency sequence by requiring that the sequence of constituents into which a phrase-structure rule rewrites a nonterminal forms an adjacency sequence in this sense. A phrase-structure grammar consisting of rules of this kind we call Discontinuous Phrase-Structure Grammar, or DPSG. It may be worth emphasizing that this notion of phrase-structure rule is a generalization of the usual notion, since an adjacency sequence as defined by (28) subsumes the usual notion of a sequence of adjacency pairs. We have also seen that trees with discontinuities are a generalization of the traditional tree concept. Therefore, phrase-structure rules of the familiar sort coincide with DPSG rules without discontinuous constituents, and they produce the familiar sort of trees without discontinuities. In other words, DPSG rules can simply be added to a classical PSG (including GPSG), with the result that the grammar generates trees with discontinuities for sentences with discontinuous constituents, while doing everything else as before.</Paragraph>
  </Section>
</Paper>