File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/01/j01-2005_abstr.xml
Size: 7,490 bytes
Last Modified: 2025-10-06 13:41:59
<?xml version="1.0" standalone="yes"?> <Paper uid="J01-2005"> <Title>Squibs and Discussions Nonminimal Derivations in Unification-based Parsing</Title> <Section position="2" start_page="0" end_page="278" type="abstr"> <SectionTitle> 1. Introduction </SectionTitle> <Paragraph position="0"> Unification grammar is a term often used to describe a family of feature-based grammar formalisms, including GPSG (Gazdar et al. 1985), PATR-II (Shieber 1986), DCG (Pereira and Warren 1980), and HPSG (Pollard and Sag 1994). In an effort to formalize the common elements of unification-style grammars, Shieber (1992) developed a logic for describing them, and used this logic to define an abstract parsing algorithm. The algorithm uses the same set of operations as Earley's (1970) algorithm for context-free grammars, but modified for unification grammars.</Paragraph> <Paragraph position="1"> In this paper, we show that, under certain conditions, Shieber's algorithm produces unintended, spurious parses in addition to the intended ones. We call these spurious parses nonminimal derivations (or nonminimal parse trees), because they contain extra features which are not in the productions that license the parse. 1 We claim that such nonminimal derivations are invalid. The basis of our claim is that the unification operation as set union preserves minimality; thus any correct unification-based parsing algorithm should produce parses that contain all and only features from the licensing productions (i.e., minimal derivations or minimal parse trees). Nonminimal derivations are also undesirable in practice because, given a parse tree, we cannot tell whether a particular feature should be in the model or not unless we reconstruct the whole tree.</Paragraph> <Paragraph position="2"> Despite the nonminimal derivations, Shieber (1992) proved the correctness of his algorithm. 
As it turned out, his definition of parse tree, which his proof relied on, was not constraining enough to disallow nonminimal derivations. To solve this twofold problem, we propose an alternate definition of minimal parse tree for unification grammars, and present a modification to Shieber's algorithm which ensures minimality. It is important to note that the same spurious parses also occur in context-free parsing, specifically in Earley's algorithm. However, since the only information a constituent carries in context-free grammar is the grammar symbol, the spurious derivations produce exactly the same results as the normal ones. When the algorithm is extended to unification grammar, however, these spurious parses become a problem. 1 The notions of derivation and parse tree are different. However, in this paper we focus on parse trees as the final result of derivation; thus we say that a derivation is nonminimal when its result is a nonminimal parse, in contrast to a minimal derivation, which produces a minimal parse. Unfortunately, formal definitions of minimal and nonminimal derivations are outside the scope of this short paper; interested readers are encouraged to read Tomuro (1999).</Paragraph> <Paragraph position="3"> © 2001 Association for Computational Linguistics. Computational Linguistics, Volume 27, Number 2</Paragraph> <Paragraph position="5"> Figure 1 (fragment). p3 = ⟨&quot;sleeps&quot;, Φ3⟩, where Φ3 is the conjunction of ⟨head agr pers⟩ ≐ 3rd, ⟨head agr num⟩ ≐ sing, and ⟨head tense⟩ ≐ pres.</Paragraph> <Paragraph position="6"> 2. Unification Grammar and Parse Trees Shieber (1992) defines a unification grammar as a 3-tuple ⟨Σ, P, p0⟩, where Σ is the vocabulary of the grammar, P is the set of productions, and p0 ∈ P is the start production. Σ contains L, a set of labels (feature names); C, a set of constants (feature values); and W, a set of terminals. There are two kinds of productions in P: phrasal and lexical. 
A phrasal production is a 2-tuple ⟨a, Φ⟩, where a is the arity of the rule (the number of right-hand-side [RHS] constituents), and Φ is a logical formula. Typically, Φ is a conjunction of equations of the form p1 ≐ p2 or p1 ≐ c, where p1, p2 ∈ L* are paths, and c ∈ C. In an equation, any path which begins with an integer i (1 ≤ i ≤ a) represents the ith RHS constituent of the rule. 2 A lexical production is a 2-tuple ⟨w, Φ⟩, where w ∈ W and Φ is the same as above, except that there are no RHS constituents.</Paragraph> <Paragraph position="7"> Figure 1 shows some example phrasal and lexical productions (p0 corresponds to the context-free rule S → NP VP and is the start production). Then a model M relates to a formula Φ by a satisfaction relation ⊨ as usual (M ⊨ Φ), and when Φ is the formula in a production p = ⟨a, Φ⟩, p is said to license M.</Paragraph> <Paragraph position="8"> Based on the logic above, Shieber defines a parse tree and the language of a grammar expressed in his formalism. To define a valid parse tree, he first defines the set of possible parse trees Π = ∪_{i≥0} Πi for a given grammar G, where each Πi is defined as follows: Definition. A parse tree τ is a model that is a member of the infinite union of sets of bounded-depth parse trees Π = ∪_{i≥0} Πi, where each Πi is defined as: 2 Shieber (1992) also uses a path that begins with 0 for the left-hand-side (LHS) constituent of a rule. In this paper, we omit the 0 arcs and place the features of the LHS constituent directly at the root. 
This change does not affect the formalism for the purpose of this paper.</Paragraph> <Paragraph position="9"> Tomuro and Lytinen Nonminimal Derivations</Paragraph> <Paragraph position="11"> Π0 is the set of models τ for which there is a lexical production</Paragraph> <Paragraph position="13"> In the second condition, the extraction operator, denoted by /, retrieves the feature structure found at the end of a particular path; so for instance τ/⟨1⟩ retrieves the first subconstituent on the RHS of the production that licenses τ. In the definition above, Π0 contains all models that satisfy any lexical production in the grammar, while Πi contains all models that satisfy a phrasal production, and whose subconstituents are all in ∪_{j&lt;i} Πj.</Paragraph> <Paragraph position="14"> To specify what constitutes a valid parse for a particular sentence, the next step is to define the yield of a parse tree. It is defined recursively as follows: if τ is licensed by some lexical production p = ⟨w, Φ⟩, then the yield of τ is w; or if τ is licensed by some phrasal production ⟨a, Φ⟩ and α1, ..., αa are the yields of τ/⟨1⟩, ..., τ/⟨a⟩ respectively, then the yield of τ is α1 ... αa.</Paragraph> <Paragraph position="15"> Finally, Shieber defines a valid parse tree τ ∈ Π for sentence w1 ... wn as follows:</Paragraph> <Paragraph position="17"> 1. The yield of τ is w1 ... wn. 2. τ is licensed by the start production p0. Notice that this definition allows extra features in a parse tree, because a parse tree τ is defined by the satisfaction relation (τ ⊨ Φ), which allows the existence of features in the model that are not in the licensing production's formula. Given this definition, for any valid parse tree τ, we can construct another parse tree τ′ by simply adding an arbitrary (nonnumeric) feature to any node in τ. 
Such a parse tree τ′ is nonminimal because it carries features beyond the minimal set required by the licensing productions. We will return to the issue of minimal and nonminimal parse trees in Section 4.</Paragraph> </Section> </Paper>