<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-1106">
  <Title>Pseudo-Projectivity: A Polynomially Parsable Non-Projective Dependency Grammar</Title>
  <Section position="3" start_page="647" end_page="648" type="metho">
    <SectionTitle>
3 Projective Dependency Grammars
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="647" end_page="648" type="sub_section">
      <SectionTitle>
Revisited
</SectionTitle>
      <Paragraph position="0"> We (informally) define a projective Dependency Grammar as a string-rewriting system 3 by giving a set of categories such as N, V and Adv, 4 a set of distinguished start categories (the root categories of well-formed trees), a mapping from strings to categories, and two types of rules: dependency rules which state hierarchical order (dominance) and LP rules which state linear order. The dependency rules are further sub-divided into subcategorization rules (or s-rules) and modification rules (or m-rules). Here are some sample s-rules:</Paragraph>
      <Paragraph position="2"> LP rules are represented as regular expressions (actually, only a limited form of regular expressions) associated with each category. We use the hash sign (#) to denote the position of the governor (head). For example: p1: Vtrans = (Adv) Nnom (Aux) Adv* # Nobj Adv* Vt... (5) 3 We follow (Gaifman, 1965) throughout this paper by modeling a dependency grammar with a string-rewriting system. However, we will identify a derivation with its representation as a tree, and we will sometimes refer to symbols introduced in a rewrite step as "dependent nodes". For a model of a DG based on tree-rewriting (in the spirit of Tree Adjoining Grammar (Joshi et al., 1975)), see (Nasr, 1995).</Paragraph>
      <Paragraph position="3"> 4 In this paper, we will allow finite feature structures on categories, which we will notate using subscripts; e.g., Vtrans. Since the feature structures are finite, this is simply a notational variant of a system defined only with simple category labels.</Paragraph>
      <Paragraph position="4">  We will call this system generative dependency grammar or GDG for short.</Paragraph>
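      <Paragraph> As an illustration, the following Python sketch shows one way such a grammar can be represented; since the paper's sample rules (2)-(4) are not spelled out in this text, the categories, rule bodies, and regular expression below are hypothetical stand-ins. Dependency rules map a governor category to the dependents it licenses, and each LP rule is a regular expression over dependent categories in which # marks the position of the head.

import re

# Hypothetical GDG fragment: one s-rule (subcategorization), one m-rule
# (modification) and one LP rule, in the paper's naming style but with
# illustrative content only.
S_RULES = {"Vtrans": ["Nnom", "Nobj"]}   # governor -> subcategorized dependents
M_RULES = {"Vtrans": ["Adv"]}            # governor -> possible modifiers
LP_RULES = {                             # regex over dependents; '#' marks the head
    "Vtrans": "(Adv)?(Nnom)(Aux)?(Adv)*#(Nobj)(Adv)*",
}

def lp_ok(governor, ordered_symbols):
    """True if the left-to-right sequence of dependent categories plus the
    head marker '#' satisfies the governor's LP rule."""
    return re.fullmatch(LP_RULES[governor], "".join(ordered_symbols)) is not None

print(lp_ok("Vtrans", ["Nnom", "#", "Nobj"]))   # True
print(lp_ok("Vtrans", ["Nobj", "#", "Nnom"]))   # False: this LP rule keeps Nobj after the head
      </Paragraph>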
      <Paragraph position="5"> Derivations in GDG are defined as follows.</Paragraph>
      <Paragraph position="6"> In a rewrite step, we choose a multiset of dependency rules (i.e., a set of instances of dependency rules) which contains exactly one s-rule and zero or more m-rules. The left-hand side nonterminal of each rule is the nonterminal we want to rewrite. Call this multiset the rewrite-multiset.</Paragraph>
      <Paragraph position="7"> In the rewriting operation, we introduce a multiset of new nonterminals and exactly one terminal symbol (the head). The rewriting operation then must meet the following three conditions: * There is a bijection between the set of dependents of the instances of rules in the rewrite-multiset and the set of newly introduced dependents.</Paragraph>
      <Paragraph position="8"> * The order of the newly introduced dependents is consistent with the LP rule associated with the governor.</Paragraph>
      <Paragraph position="9"> * The introduced terminal string (head) is mapped to the rewritten category.</Paragraph>
      <Paragraph position="10"> As an example, consider a grammar containing the three dependency rules d1 (rule 2), d2 (rule 3), and d3 (rule 4), as well as the LP rule p1 (rule 5). In addition, we have some lexical mappings (they are obvious from the example), and the start symbol is Vfinite:+. A sample derivation is shown in Figure 3, with the sentential form representation on top and the corresponding tree representation below.</Paragraph>
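      <Paragraph> Continuing the hypothetical fragment above, the following sketch checks the three conditions on a single rewrite step; it reuses S_RULES, M_RULES and lp_ok from the previous sketch, and LEXICON and the example words are invented for illustration.

from collections import Counter

# Hypothetical terminal-to-category mapping.
LEXICON = {"eats": "Vtrans", "Peter": "Nnom", "beans": "Nobj", "often": "Adv"}

def rewrite_ok(category, head_word, ordered_symbols):
    """ordered_symbols is the left-to-right sequence of newly introduced
    dependent categories, with '#' marking the position of the head terminal.
    The step is licensed iff (1) the dependents are exactly those of one
    s-rule instance plus zero or more m-rule instances (a bijection with the
    rewrite-multiset), (2) their order obeys the governor's LP rule, and
    (3) the head terminal is mapped to the rewritten category."""
    deps = Counter(s for s in ordered_symbols if s != "#")
    required = Counter(S_RULES[category])
    extras = deps - required
    cond1 = not (required - deps) and all(c in M_RULES.get(category, []) for c in extras)
    cond2 = lp_ok(category, ordered_symbols)
    cond3 = LEXICON.get(head_word) == category
    return cond1 and cond2 and cond3

print(rewrite_ok("Vtrans", "eats", ["Adv", "Nnom", "#", "Nobj"]))   # True
print(rewrite_ok("Vtrans", "eats", ["Nnom", "#"]))                  # False: Nobj is missing
      </Paragraph>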
      <Paragraph position="11"> Using this kind of representation, we can derive a bottom-up parser in the following straightforward manner. 5 Since syntactic and linear governors coincide, we can derive deterministic finite-state machines which capture both the dependency and the LP rules for a given governor category. We will refer to these FSMs as rule-FSMs, and if the governor is of category C, we will refer to a C-rule-FSM. In a rule-FSM, the transitions are labeled by categories, and the transition corresponding to the governor is labeled by its category and a special mark (such as #). This transition is called the "head transition".</Paragraph>
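      <Paragraph> For illustration, here is a hand-built rule-FSM for the hypothetical Vtrans LP rule used in the earlier sketch (a real implementation would compile it from the regular expression automatically). Transitions are keyed by (state, label), and the transitions labelled '#' are the head transitions.

# Hand-built rule-FSM for the hypothetical LP rule
# Vtrans = (Adv)? (Nnom) (Aux)? (Adv)* # (Nobj) (Adv)*
VTRANS_RULE_FSM = {
    "start": 0,
    "final": {5},
    "delta": {
        (0, "Adv"): 1, (0, "Nnom"): 2, (1, "Nnom"): 2,
        (2, "Aux"): 3, (2, "Adv"): 3, (2, "#"): 4,      # head transition
        (3, "Adv"): 3, (3, "#"): 4,                     # head transition
        (4, "Nobj"): 5, (5, "Adv"): 5,
    },
}

def run(fsm, labels):
    """Feed dependent categories and the head mark '#' through a rule-FSM;
    return the state reached, or None if some transition is missing."""
    q = fsm["start"]
    for a in labels:
        q = fsm["delta"].get((q, a))
        if q is None:
            return None
    return q

q = run(VTRANS_RULE_FSM, ["Nnom", "Adv", "#", "Nobj"])
print(q in VTRANS_RULE_FSM["final"])   # True: a complete Vtrans subtree was recognized
      </Paragraph>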
      <Paragraph position="12"> The entries in the parse matrix M are of the form (m, q), where m is a rule-FSM and q a state of it, except for the entries in squares M(i, i), 1 ≤ i ≤ n, which also contain category labels.</Paragraph>
      <Paragraph position="13"> Let w1...wn be the input word. We initialize the parse matrix as follows. Let C be a category of word wi. First, we add C to M(i, i). Then, we add to M(i, i) every pair (m, q) such that m is a rule-FSM with a transition labeled C from a start state and q the state reached after that transition. 6 Embedded in the usual three loops on i, j, k, we add an entry (m1, q) to M(i, j) if (m1, q1) is in M(k, j), (m2, q2) is in M(i, k+1), q2 is a final state of m2, m2 is a C-rule-FSM, and m1 transitions from q1 to q on C (a non-head transition). There is a special case for the head transitions in m1: if k = i - 1, C is in M(i, i), m1 is a C-rule-FSM, and there is a head transition from q1 to q in m1, then we add (m1, q) to M(i, j).</Paragraph>
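      <Paragraph> The following sketch implements such a recognizer over a toy grammar. It departs from the paper's exact M(i, j) layout and instead uses the convention that chart[i, j] covers words i..j (0-based, inclusive); an item ('tree', C) records a complete subtree of category C over the span, and ('partial', C, q) records a partial traversal of the C-rule-FSM. The rule-FSMs, lexicon and sentence are hypothetical (VTRANS_RULE_FSM is the one built in the previous sketch).

def recognize(words, rule_fsms, lexicon, start_cats):
    """Bottom-up recognizer for a projective GDG given as rule-FSMs (one per
    governor category, in the format of the previous sketch) and a lexicon
    mapping each word to a list of categories."""
    n = len(words)
    chart = {(i, j): set() for i in range(n) for j in range(i, n)}

    def add(i, j, item):
        if item in chart[i, j]:
            return
        chart[i, j].add(item)
        if item[0] == "partial":
            _, C, q = item
            if q in rule_fsms[C]["final"]:
                add(i, j, ("tree", C))          # the C-rule-FSM accepts: complete subtree
        else:
            _, C = item
            for D, f in rule_fsms.items():      # a complete C subtree can be the leftmost
                q = f["delta"].get((f["start"], C))     # dependent of a D-headed subtree
                if q is not None:
                    add(i, j, ("partial", D, q))

    # initialization: a word may be the head that starts its own LP rule
    for i, w in enumerate(words):
        for C in lexicon[w]:
            q = rule_fsms[C]["delta"].get((rule_fsms[C]["start"], "#"))
            if q is not None:
                add(i, i, ("partial", C, q))

    # the usual loops over spans of increasing length
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):
                for left in list(chart[i, k]):
                    if left[0] != "partial":
                        continue
                    _, C, q1 = left
                    f = rule_fsms[C]
                    # attach a complete right dependent spanning k+1..j ...
                    for right in chart[k + 1, j]:
                        if right[0] == "tree":
                            q2 = f["delta"].get((q1, right[1]))
                            if q2 is not None:
                                add(i, j, ("partial", C, q2))
                    # ... or take the head transition if the head word sits at j = k+1
                    if k + 1 == j and C in lexicon[words[j]]:
                        q2 = f["delta"].get((q1, "#"))
                        if q2 is not None:
                            add(i, j, ("partial", C, q2))
    return any(("tree", S) in chart[0, n - 1] for S in start_cats)

LEAF = {"start": 0, "final": {1}, "delta": {(0, "#"): 1}}   # categories without dependents
RULE_FSMS = {"Vtrans": VTRANS_RULE_FSM, "Nnom": LEAF, "Nobj": LEAF, "Adv": LEAF}
CATS = {"Peter": ["Nnom"], "often": ["Adv"], "eats": ["Vtrans"], "beans": ["Nobj"]}
print(recognize(["Peter", "often", "eats", "beans"], RULE_FSMS, CATS, ["Vtrans"]))  # True
      </Paragraph>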
      <Paragraph position="14"> The time complexity of the algorithm is O(n^3 G Qmax), where G is the number of rule-FSMs derived from the dependency and LP rules in the grammar and Qmax is the maximum number of states in any of the rule-FSMs.</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="648" end_page="650" type="metho">
    <SectionTitle>
4 A Formalization of
PP-Dependency Grammars
</SectionTitle>
    <Paragraph position="0"> Recall that in a pseudo-projective tree, we make a distinction between a syntactic governor and a linear governor. A node can be "lifted" along a lifting path from being a dependent of its syntactic governor to being a dependent of its linear governor, which must be an ancestor of the syntactic governor. In defining a formal rewriting system for pseudo-projective trees, we will not attempt to model the "lifting" as a transformational step in the derivation. Rather, we will directly derive the "lifted" version of the tree, in which a node is a dependent of its linear governor. Thus, the derived structure more closely resembles a unistratal dependency representation, like those used by (Hudson, 1990), than the multistratal representations of, for example, (Mel'čuk, 1988). However, from a formal point of view, the distinction is not significant.</Paragraph>
    <Paragraph position="1"> In order to capture pseudo-projectivity, we will interpret rules of the form (2) (for subcategorization of arguments by a head) and (4) (for selection of a head by an adjunct) as introducing syntactic dependents which may lift to a higher linear governor. An LP rule of the form (5) orders all linear dependents of the linear governor, no matter whose syntactic dependents they are.</Paragraph>
    <Paragraph position="2"> In addition, we need a third type of rule, namely a lifting rule, or l-rule (see Section 2.3).</Paragraph>
    <Paragraph position="4"> An l-rule resembles a normal dependency rule, but instead of introducing syntactic dependents of a category, it introduces a lifted dependent.</Paragraph>
    <Paragraph position="5"> Besides introducing a linear dependent LD, an l-rule must make sure that the syntactic governor of LD will be introduced at a later stage of the derivation, and must prevent it from introducing LD as its own syntactic dependent; otherwise, non-projective nodes would be introduced twice, once by their linear governor and once by their syntactic governor. This condition is represented in the rule by means of a constraint on the categories found along the lifting path.</Paragraph>
    <Paragraph position="6"> This condition, which we call the lifting condition, is represented as a regular expression over categories, describing the lifting path from the linear governor LG down to the syntactic governor SG. The regular expression representing the lifting condition is enriched with a dot separating, on its left, the part of the lifting path which has already been introduced during the rewriting from, on its right, the part which must still be introduced for the rewriting to be valid.</Paragraph>
    <Paragraph position="7"> The dot is an imperfect way of representing the current state of a finite-state automaton equivalent to the regular expression. Note further that the lifting condition ends with a repetition of LD, for reasons which will become clear when we discuss the rewriting process.</Paragraph>
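    <Paragraph> A lifting condition can be sketched as a dotted sequence of path elements, each marked as required, optional, or starred; the dot index records how much of the lifting path has already been introduced. The element format and the matches helper below (a crude stand-in for feature-structure subsumption) are assumptions of this sketch, not the paper's notation.

def matches(elem_cat, cat):
    """Crude stand-in for category/feature matching: "V" matches "Vtrans",
    "Vbridge:+" matches only itself, and so on."""
    return cat.startswith(elem_cat)

def advance(condition, dot, category):
    """condition is a list of (category, occurrence) pairs, occurrence being
    "1" (required), "?" (optional) or "*" (Kleene star); dot is an index into
    the list.  Return the possible new dot positions once the category of a
    newly introduced dependent is matched against the condition."""
    positions = set()
    for i in range(dot, len(condition)):
        elem_cat, occ = condition[i]
        if matches(elem_cat, category):
            # a starred element may be matched again later, so the dot may stay
            positions.add(i if occ == "*" else i + 1)
        if occ not in ("?", "*"):
            break        # a required element cannot be skipped
    return positions

print(advance([("A", "*"), ("B", "1")], 0, "B"))   # {2}: the starred element is skipped
    </Paragraph>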
    <Paragraph position="8"> A sentential form contains terminal strings and categories, each category being paired with a multiset of lifting conditions, called its lift multiset. The lift multiset associated with a category C contains 'transiting' lifting conditions: conditions introduced by ancestors of C whose lifting paths pass through C.</Paragraph>
    <Paragraph position="9"> Three cases must be distinguished when rewriting a category C and its lift multiset LM: * LM contains a single lifting condition whose dot is situated at its right end (LG ... SG C •). In such a case, C must be rewritten as the empty string. A dot at the right end of the lifting condition indicates that C has now been introduced by its syntactic governor although it had already been introduced by its linear governor earlier in the rewriting process.</Paragraph>
    <Paragraph position="10"> This is the reason why C has been added at the end of the lifting condition.</Paragraph>
    <Paragraph position="11"> * LM contains several lifting conditions, one of which has its dot at its right end. In such a case, the rewriting fails since, in accordance with the preceding case, C must be rewritten as the empty string, and therefore the other lifting conditions in LM cannot be satisfied. Furthermore, a single instance of a category cannot anchor more than one lifting condition.</Paragraph>
    <Paragraph position="12"> * LM contains several lifting conditions, none of which has its dot at its right end. In this case, a rewrite-multiset of dependency rules and lifting rules, all having C as their left-hand side, is selected. The result of the rewriting must then meet the following condition: the lift multiset of each newly introduced dependent D must be compatible with D, with the dot advanced appropriately. In addition, we require that, when we rewrite a category as a terminal, the lift multiset is empty. (This case analysis is sketched below.)</Paragraph>
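    <Paragraph> A minimal sketch of this case analysis, using the dotted (condition, dot) representation from the previous fragment; the actual choice of dependency rules and l-rules in the third case is left abstract.

def classify_rewrite(lift_multiset):
    """lift_multiset is a list of (condition, dot) pairs attached to the
    category C being rewritten."""
    finished = [lc for lc in lift_multiset if lc[1] == len(lc[0])]  # dot at the right end
    if len(lift_multiset) == 1 and len(finished) == 1:
        return "rewrite as the empty string"   # C was already introduced by its linear governor
    if finished:
        return "fail"                          # the remaining conditions cannot be satisfied
    return "rewrite normally"                  # choose dependency rules and l-rules for C and
                                               # pass each condition down with its dot advanced

cond = [("A", "*"), ("B", "1")]                   # some lifting condition
print(classify_rewrite([(cond, 2)]))              # rewrite as the empty string
print(classify_rewrite([(cond, 0), (cond, 2)]))   # fail
print(classify_rewrite([(cond, 0)]))              # rewrite normally
    </Paragraph>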
    <Paragraph position="13">  Let us consider an example. Suppose we have a grammar containing the dependency rules d1 (rule 2), d2 (rule 3), and d3 (rule 4); the LP rule p1 (rule 5); and the LP rule p2: p2: Vclause = (Ntop:+ | Nwh:+) (Adv) Nnom (Aux) Adv* # Adv* Vt... Furthermore, we have the following l-rule: l1: Vbridge:+ → Nobj,top:+ { V*bridge:+ V Nobj,top:+ } This rule says that an objective wh-noun with feature top:+ which depends on a verb with no further restrictions (the third V in the lifting path) can be lifted to any verb that dominates its immediate governor, as long as the lifting path contains only verbs with feature bridge:+, i.e., bridge verbs.</Paragraph>
    <Paragraph position="14">  A sample derivation is shown in Figure 4, with the sentential form representation on top and the corresponding tree representation below. We start our derivation with the start symbol Vclause and rewrite it using dependency rules d2 and d3, and the lifting rule l1, which introduces an objective NP argument. The lifting condition of l1 is passed to the V dependent, but the dot remains at the left of V*bridge:+ because of the Kleene star. When we rewrite the embedded V, we choose to rewrite again with Vclause, and the lifting condition is passed on to the next verb. This verb is a Vtrans, which requires an Nobj. The lifting condition is passed to Nobj and the dot is moved to the right end of the regular expression; therefore Nobj is rewritten as the empty string.</Paragraph>
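    <Paragraph> Using the advance function sketched earlier, the dot movement in this derivation can be traced roughly as follows; the category names are simplified approximations of the paper's feature-decorated categories.

# Lifting condition of l1 in the dotted representation sketched earlier
# (simplified category names): V*bridge:+  V  Nobj,top:+
l1_condition = [("Vbridge:+", "*"), ("V", "1"), ("Nobj,top:+", "1")]

print(advance(l1_condition, 0, "Vbridge:+"))   # {0, 2}: the dot may stay under the Kleene star
                                               # (the option taken in the derivation above)
print(advance(l1_condition, 0, "Vtrans"))      # {2}: this Vtrans is the syntactic governor
print(advance(l1_condition, 2, "Nobj,top:+"))  # {3}: dot at the right end, so Nobj is
                                               # rewritten as the empty string
    </Paragraph>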
  </Section>
  <Section position="5" start_page="650" end_page="651" type="metho">
    <SectionTitle>
5 A Polynomial Parser for PP-GDG
</SectionTitle>
    <Paragraph position="0"> In this section, we show that pseudo-projective dependency grammars as defined in Section 2.3 are polynomially parsable.</Paragraph>
    <Paragraph position="1"> We can extend the bottom-up parser for GDG to a parser for PP-GDG in the following manner. In PP-GDG, syntactic and linear governors do not necessarily coincide, and we must keep track separately of linear precedence and of lifting (i.e., "long distance" syntactic dependence). The entries in the parse matrix M are of the form (m, q, LM), where m is a rule-FSM, q a state of m, and LM is a multiset of lifting conditions as defined in Section 4. An entry (m, q, LM) in a square M(i, j) of the parse matrix means that the subword wi...wj of the input can be analyzed by m up to state q (i.e., it matches the beginning of an LP rule), but that nodes corresponding to the lifting rules in LM are being lifted from the subtrees spanning wi...wj. Put differently, in this bottom-up view LM represents the set of nodes which have a syntactic governor in the subtree spanning wi...wj and a lifting rule, but are still looking for a linear governor.</Paragraph>
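    <Paragraph> One possible encoding of these extended items, for illustration only: the lift multiset is stored as a hashable collection of (lifting-condition identifier, dot position, count) triples so that items can be kept in chart cells.

from typing import NamedTuple, FrozenSet, Tuple

class PPItem(NamedTuple):
    fsm: str                                   # the rule-FSM m (here named by governor category)
    state: int                                 # the state q reached in m
    lifts: FrozenSet[Tuple[str, int, int]]     # lift multiset LM: (condition id, dot, count)

# "the Vtrans-rule-FSM has reached state 3 over this span, and one node whose
# lifting condition is that of l1, with the dot at position 1, is still being
# lifted out of the span"
item = PPItem("Vtrans", 3, frozenset({("l1", 1, 1)}))
print(item)
    </Paragraph>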
    <Paragraph position="2"> Suppose we have an entry in the parse matrix M of the form (m, q, LM). As we traverse the C-rule-FSM m, we recognize one by one the linear dependents of a node of category C. Call this governor η. The action of adding a new entry to the parse matrix corresponds to adding a single new linear dependent to η. (While we are working on the C-rule-FSM m and are not yet in a final state, we have not yet recognized η itself.) Each new dependent η' brings with it a multiset of nodes being lifted from the subtree of which it is the root. Call this multiset LM'. The new entry will be (m, q', LM ∪ LM'), where q' is the state that m transitions to when η' is recognized as the next linear dependent.</Paragraph>
    <Paragraph position="3"> When we have reached a final state q of the rule-FSM m, we have recognized a complete subtree rooted in the new governor η. Some of the dependent nodes of η will be both syntactic and linear dependents of η, and the others will be linear dependents of η but lifted from a descendant of η. In addition, η may have syntactic dependents which are not realized as its own linear dependents and are lifted away. (No other options are possible.) Therefore, when we have reached the final state of a rule-FSM, we must connect up all nodes and lifting conditions before we can proceed to put an entry (m, q, LM) in the parse matrix. This involves the following steps: 1. For every lifting condition in LM, we ensure that it is compatible with the category of η. This is done by moving the dot leftwards in accordance with the category of η. (The dot is moved leftwards since we are doing bottom-up recognition.) The obvious special provisions deal with the Kleene star and optional elements.</Paragraph>
    <Paragraph position="4"> If the category matches a category with Kleene star in the lifting condition, we do not move the dot. If the category matches a category which is to the left of an optional category, or to the left of a category with Kleene star, then we can move the dot to the left of that category.</Paragraph>
    <Paragraph position="5"> If the dot cannot be placed in accordance with the category of η, then no new entry is made in the parse matrix for η.</Paragraph>
    <Paragraph position="6"> 2. We then choose a multiset of s-, m-, and l-rules whose left-hand side is the category of η. For every dependent of η introduced by an l-rule, the dependent must be compatible with an instance of a lifting condition in LM (whose dot must be at its beginning, or separated from the beginning by optional or starred categories only); the lifting condition is then removed from LM.</Paragraph>
    <Paragraph position="7"> 3. If, after the above repositioning of the dot and the linking up of all linear dependents to lifting conditions, there are still lifting conditions in LM such that the dot is at the beginning of the lifting condition, then no new entry is made in the parse matrix for η.</Paragraph>
    <Paragraph position="9"> 4. For every syntactic dependent of η, we determine whether it is a linear dependent of η which has not yet been identified as lifted. For each syntactic dependent which is not also a linear dependent, we check whether there is an applicable lifting rule. If not, no entry is made in the parse matrix for η. If there is, we add the lifting rule to LM. (This completion step is sketched below.)</Paragraph>
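    <Paragraph> The following sketch reconstructs this completion step in simplified form, reusing the dotted lifting-condition representation and the matches helper from the earlier fragments; the way the governor's dependents are passed in, and the single-choice dot movement in step 1, are simplifying assumptions (the paper's procedure may yield several alternative lift multisets).

def retreat(condition, dot, category):
    """Bottom-up counterpart of advance(): possible new (smaller) dot
    positions after matching `category` while moving the dot leftwards."""
    positions = set()
    for i in range(dot - 1, -1, -1):
        elem_cat, occ = condition[i]
        if matches(elem_cat, category):
            positions.add(i + 1 if occ == "*" else i)   # a starred element stays available
        if occ not in ("?", "*"):
            break
    return positions

def can_anchor(condition, dot):
    """The dot is at the beginning, or separated from it only by optional or
    starred elements (the requirement in step 2)."""
    return all(occ in ("?", "*") for _, occ in condition[:dot])

def complete_governor(cat, lifted_in, lifted_linear_deps, unrealized_syn_deps, l_rules):
    """Simplified completion step for a governor of category `cat`:
       lifted_in           -- (condition, dot) pairs collected from the subtrees
                              of the linear dependents (the multiset LM),
       lifted_linear_deps  -- categories of linear dependents of this governor
                              that are not also its syntactic dependents,
       unrealized_syn_deps -- categories of syntactic dependents of this governor
                              that are not realized as its linear dependents,
       l_rules             -- dependent category -> lifting condition of an
                              applicable l-rule (if any).
    Returns the outgoing lift multiset, or None if no entry should be made."""
    new_lm = []
    # step 1: move every transiting dot leftwards over this governor's category
    for condition, dot in lifted_in:
        dots = retreat(condition, dot, cat)
        if not dots:
            return None
        new_lm.append((condition, min(dots)))           # keep a single choice for simplicity
    # step 2: link lifted nodes to the linear dependents introduced here by l-rules
    for dep_cat in lifted_linear_deps:
        anchored = next((lc for lc in new_lm
                         if can_anchor(*lc) and matches(lc[0][-1][0], dep_cat)), None)
        if anchored is None:
            return None
        new_lm.remove(anchored)
    # step 3: a condition whose dot reached the beginning must have been linked here
    if any(dot == 0 for _, dot in new_lm):
        return None
    # step 4: syntactic dependents that are lifted away open new lifting conditions;
    # their last two elements (syntactic governor and lifted dependent) are matched here
    for dep_cat in unrealized_syn_deps:
        condition = l_rules.get(dep_cat)
        if condition is None or not matches(condition[-2][0], cat) \
                or not matches(condition[-1][0], dep_cat):
            return None
        new_lm.append((condition, len(condition) - 2))
    return new_lm

l1_condition = [("Vbridge:+", "*"), ("V", "1"), ("Nobj,top:+", "1")]
# at the embedded Vtrans: its Nobj is a syntactic dependent that is lifted away
print(complete_governor("Vtrans", [], [], ["Nobj,top:+"], {"Nobj,top:+": l1_condition}))
# -> [(condition, 1)]: the V and Nobj elements are matched; the bridge-verb prefix remains
# at the matrix bridge verb: the lifted Nobj is one of its linear dependents
print(complete_governor("Vbridge:+", [(l1_condition, 1)], ["Nobj,top:+"], [], {}))
# -> []: the lifting condition has been discharged
    </Paragraph>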
    <Paragraph position="10"> This procedure determines a new multiset LM, so we can add the entry (m, q, LM) to the parse matrix. (In fact, it may determine several possible new multisets, resulting in multiple new entries.) The parse is complete if there is an entry (m, qm, ∅) in square M(n, 1) of the parse matrix, where m is a C-rule-FSM for a start category and qm is a final state of m. If we keep backpointers at each step in the algorithm, we have a compact representation of the parse forest. The maximum number of entries in each square of the parse matrix is O(GQn^L), where G is the number of rule-FSMs corresponding to LP rules in the grammar, Q is the maximum number of states in any of the rule-FSMs, and L is the maximum number of states that the lifting rules can be in (i.e., the number of lifting conditions in the grammar multiplied by the maximum number of dot positions of any lifting condition). Note that the exponent is a grammar constant; moreover, this constant can be rather small since the lifting rules are not lexicalized - they are construction-specific, not lexeme-specific. The time complexity of the algorithm is therefore O(GQn^(3+2L)).</Paragraph>
  </Section>
</Paper>