<?xml version="1.0" standalone="yes"?> <Paper uid="J89-4001"> <Title>A PARSING ALGORITHM FOR UNIFICATION GRAMMAR</Title> <Section position="2" start_page="0" end_page="0" type="abstr"> <SectionTitle> 1 INTRODUCTION </SectionTitle> <Paragraph position="0"> Unrestricted unification grammars have the formal power of a Turing machine. Thus there is no algorithm that finds all parses of a given sentence in any unification grammar and always halts. Some unification grammar systems just live with this problem. Any general parsing method for definite clause grammar will enter an infinite loop in some cases, and it is the task of the grammar writer to avoid this. Generalized phrase structure grammar avoids the problem because it has only the formal power of context-free grammar (Gazdar et al.</Paragraph> <Paragraph position="1"> 1985), but according to Shieber (1985a) this is not adequate for describing human language.</Paragraph> <Paragraph position="2"> Lexical functional grammar employs a better solution. A lexical functional grammar must include a finitely ambiguous context-free grammar, which we will call the context-free backbone (Barton 1987). A parser for lexical functional grammar first builds the finite set of context-free parses of the input and then eliminates those that don't meet the other requirements of the grammar. This method guarantees that the parser will halt.</Paragraph> <Paragraph position="3"> This solution may be adequate for lexical functional grammars, but for other unification grammars finding a finitely ambiguous context-free backbone is a problem.</Paragraph> <Paragraph position="4"> In a definite clause grammar, an obvious way to build a context-free backbone is to keep only the topmost function letters in each rule. 
Thus the rule s --> np(P,N) vp(P,N) becomes s --> np vp. (In this example we use the notation of Pereira and Warren 1980, except that we do not put square brackets around terminals, because this conflicts with standard notation for context-free grammars.) Suppose we use a simple X-bar theory. Let major-category(Type, Barlevel) denote a phrase in a major category. A noun phrase may consist of a single noun, for instance, John. This suggests a rule like this: major-category(n,2) --> major-category(n,1). In the context-free backbone this becomes major-category --> major-category, so the context-free backbone is infinitely ambiguous.</Paragraph> <Paragraph position="5"> One could devise more elaborate examples, but this one suffices to make the point: not every natural unification grammar has an obvious context-free backbone. Therefore it is useful to have a parser that does not require us to find a context-free backbone, but works directly on a unification grammar (Shieber 1985b).</Paragraph> <Paragraph position="6"> We propose to guarantee that the parsing problem is solvable by restricting ourselves to depth-bounded grammars. A unification grammar is depth-bounded if for every L > 0 there is a D > 0 such that every parse tree for a sentential form of L symbols has depth less than D. In other words, the depth of a tree is bounded by the length of the string it derives. A context-free grammar is depth-bounded if and only if every string of symbols is finitely ambiguous. We will generalize the notion of finite ambiguity to unification grammars and show that for unification grammars, depth-boundedness is a stronger property than finite ambiguity.</Paragraph> <Paragraph position="7"> Copyright 1989 by the Association for Computational Linguistics. Permission to copy without fee all or part of this material is granted provided that the copies are not made for direct commercial advantage and the CL reference and this copyright notice are included on the first page. 
To copy otherwise, or to republish, requires a fee and/or specific permission. 0362-613X/89/010219-232503.00. Computational Linguistics, Volume 15, Number 4, December 1989. Andrew Haas, A Parsing Algorithm for Unification Grammar. Depth-bounded unification grammars have more formal power than context-free grammars. As an example we give a depth-bounded grammar for the language xx, which is not context-free. Suppose the terminal symbols are a through z. We introduce function letters a' through z' to represent the terminals. The rules of the grammar are as follows, with e denoting the empty string.</Paragraph> <Paragraph position="9"> The reasoning behind the grammar should be clear: x(cons(a',cons(b',nil))) derives ab, and the first rule guarantees that every sentence has the form xx. The grammar is depth-bounded because the depth of a tree is a linear function of the length of the string it derives. A similar grammar can derive the crossed serial dependencies of Swiss German, which according to Shieber (1985a) no context-free grammar can derive. It is clear where the extra formal power comes from: a context-free grammar has a finite set of nonterminals, but a unification grammar can build arbitrarily large nonterminal symbols.</Paragraph> <Paragraph position="10"> It remains to show that there is a parsing algorithm for depth-bounded unification grammars. We have developed such an algorithm, based on the context-free parser of Graham et al. 1980, which is a table-driven parser. If we generalize the table-building algorithm to a unification grammar in an obvious way, we get an algorithm that is guaranteed to halt for all depth-bounded grammars (not for all unification grammars). Given that the tables can be built, it is easy to show that the parser halts on every input. This is not a special property of our parser: a straightforward bottom-up parser will also halt on all depth-bounded grammars, because it builds partial parse trees in order of their depth. 
Our contribution is to show that a simple algorithm will verify depth-boundedness when in fact it holds. If the grammar is not depth-bounded, the table-building algorithm will enter an infinite loop, and it is up to the grammar writer to fix this. In practice we have not found this troublesome, but it is still an unpleasant property of our method. Section 7 will describe a possible solution for this problem.</Paragraph> <Paragraph position="11"> Sections 2 and 3 of this paper define the basic concepts of our formalism. Section 4 proves the soundness and completeness of our simplest parser, which is purely bottom-up and excludes rules with empty right-hand sides. Section 5 admits rules with empty right sides, and section 6 adds top-down filtering. Sec-</Paragraph> </Section></Paper>