<?xml version="1.0" standalone="yes"?> <Paper uid="P98-1101"> <Title>Finite-state Approximation of Constraint-based Grammars using Left-corner Grammar Transforms</Title> <Section position="3" start_page="619" end_page="619" type="metho"> <SectionTitle> 1.4 Finite-state approximations </SectionTitle> <Paragraph position="0"> We obtain a finite-state approximation to a top-down parser by restricting attention to only a finite number of possible stack states. The system implemented here imposes a stack depth restriction, i.e., the transition function is modified so that there are no transitions to any stack state whose size is larger than some user-specified limit. [Footnote 1: With the optimized left-corner transforms described below we obtain acceptable approximations with a stack size limit of 5 or less. In many useful cases, including the example grammar provided by Pereira and Wright (1991), this stack bound is never reached and the system reports that the FSA it returns is exact.] This restriction ensures that there is only a finite number of possible stack states, and hence that the top-down parser is a finite-state machine. The resulting finite-state machine accepts a subset of the language generated by the original grammar.</Paragraph> <Paragraph position="1"> The situation becomes more complicated when we move to 'unification-based' grammars, since there may be an unbounded number of different categories appearing in the accessible stack states. In the system implemented here we used restriction (Shieber, 1985) on the stack states to restrict attention to a finite number of distinct stack states for any given stack depth. Since the restriction operation maps a stack state to a more general one, it produces a finite-state approximation which accepts a superset of the language generated by the original unification grammar. Combined with the stack-depth bound, which on its own yields a subset, this means that for general constraint-based grammars the language accepted by our finite-state approximation is not guaranteed to be either a superset or a subset of the language generated by the input grammar.</Paragraph> </Section> <Section position="4" start_page="619" end_page="620" type="metho"> <SectionTitle> 2 The left-corner transform </SectionTitle> <Paragraph position="0"> While conceptually simple, the top-down parsing algorithm presented in the last section suffers from a number of drawbacks for a finite-state approximation. For example, the number of distinct accessible stack states is unbounded if the grammar is left-recursive, yet left-linear grammars always generate regular languages. This section presents the standard left-corner grammar transformation (Rosenkrantz and Lewis II, 1970; Aho and Ullman, 1972); these references should be consulted for proofs of correctness.</Paragraph> <Paragraph position="1"> This transform serves as the basis for the further transforms described in the next section; these transforms have the property that the output grammar induces a finite number of distinct accessible stack states if their input is a left-recursive left-linear grammar.</Paragraph> <Paragraph position="2"> Given an input grammar G with nonterminals N and terminals T, these transforms ℒCi produce grammars with an enlarged set of nonterminals N′ = N ∪ (N × (N ∪ T)). The new "pair" categories in N × (N ∪ T) are written A-X, where A is a non-terminal of G and X is either a terminal or non-terminal of G. It turns out that if A ⇒* Xγ in G then A-X ⇒* γ in ℒC1(G); i.e., a non-terminal A-X in the transformed grammar derives the difference between A and X in the original grammar, and the notation is meant to be suggestive of this.</Paragraph>
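As a small illustration of this property, using a toy grammar that is not taken from the paper: if G contains the productions S → NP VP and NP → Det N, then S ⇒ NP VP ⇒ Det N VP in G, so in ℒC1(G) the pair category S-Det derives the remaining material N VP, and likewise S-NP derives VP.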
<Paragraph position="3"> The left-corner transform of a CFG G = (N, T, P, S) is a grammar ℒC1(G) = (N′, T, P1, S), where P1 contains all productions of the forms (1.a-1.c). This paper assumes that N ∩ T = ∅, as is standard. To save space we assume that P does not contain any epsilon productions (but it is straightforward to deal with them).</Paragraph> <Paragraph position="4"> A → a A-a : A ∈ N, a ∈ T. (1.a) A-X → β A-B : A ∈ N, B → Xβ ∈ P. (1.b) A-A → ε : A ∈ N. (1.c) Informally, the productions (1.a) start the left-corner recognition of A by recognizing a terminal a as a possible left-corner of A. The actual left-corner recognition is performed by the productions (1.b), which extend the left-corner from X to its parent B by recognizing β; these productions are used repeatedly to construct increasingly larger left-corners. Finally, the productions (1.c) terminate the recognition of A when this left-corner construction process has constructed an A.</Paragraph> <Paragraph position="5"> The left-corner transform preserves the number of parses of a string, so it defines an isomorphism from analysis trees (i.e., parse trees with respect to G) to parse trees with respect to ℒC1(G). If t is a parse tree with respect to G then (abusing notation) ℒC1(t) is the corresponding parse tree with respect to ℒC1(G). Figure 1 shows the effect of this mapping on a simple tree. The transformed tree is considerably more complex: it has double the number of nodes of the original tree. In a top-down parse of the tree ℒC1(t) in Figure 1 the maximum stack depth is 3, which occurs at the recognition of the terminals ran and fast.</Paragraph> <Section position="1" start_page="620" end_page="620" type="sub_section"> <SectionTitle> 2.1 Filtering useless categories </SectionTitle> <Paragraph position="0"> In general the grammar ℒC1(G) produced by the transform contains a large number of useless non-terminals, i.e., non-terminals which can never appear in any complete derivation, even if the grammar G is fully pruned (i.e., contains no useless productions).</Paragraph> <Paragraph position="1"> While ℒC1(G) can be pruned using standard algorithms, given the observation above about the relationship between the pair non-terminals in ℒC1(G) and non-terminals in G, it is clear that certain productions can be discarded immediately as useless. Define the left-corner relation ≺ ⊆ (N ∪ T) × N as follows: X ≺ A iff ∃β. A → Xβ ∈ P. Let ≺* be the reflexive and transitive closure of ≺. It is easy to show that a category A-X is useless in ℒC1(G) (i.e., derives no sequence of terminals) unless X ≺* A. Thus we can restrict the productions in (1.a-1.c), without affecting the language (strongly) generated, to those that only contain pair categories A-X where X ≺* A.</Paragraph> </Section> <Section position="2" start_page="620" end_page="620" type="sub_section"> <SectionTitle> 2.2 Unification grammars </SectionTitle> <Paragraph position="0"> One of the main advantages of left-corner parsing algorithms over LR(k)-based parsing algorithms is that they extend straightforwardly to complex feature-based "unification" grammars. The transformation ℒC1 itself can be encoded in several lines of Prolog (Matsumoto et al., 1983; Pereira and Shieber, 1987).
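The paper's encoding is in Prolog; purely as an illustration, the sketch below constructs the productions of ℒC1(G) in Python, together with the left-corner reachability filter of Section 2.1. The grammar representation (productions as pairs of a left-hand side and a right-hand-side list over strings, with the terminal vocabulary supplied separately) is an assumption of this sketch rather than anything taken from the paper.

```python
# Illustrative sketch (not the paper's Prolog encoding) of the basic
# left-corner transform of schemata (1.a-1.c), with the filter of Section 2.1:
# pair categories A-X are only introduced when X <* A holds.

def left_corner_star(productions, nonterminals):
    """Reflexive, transitive closure of the left-corner relation:
    X < A  iff  A -> X beta is a production, for some beta."""
    reach = {(a, a) for a in nonterminals}
    reach |= {(rhs[0], lhs) for lhs, rhs in productions if rhs}
    changed = True
    while changed:                     # naive closure; fine for small grammars
        changed = False
        for x, a in list(reach):
            for a2, b in list(reach):
                if a == a2 and (x, b) not in reach:
                    reach.add((x, b))
                    changed = True
    return reach


def lc1_transform(productions, nonterminals, terminals):
    """Return the productions of LC1(G); pair categories are ('A', 'X') tuples."""
    reach = left_corner_star(productions, nonterminals)
    out = []
    for a in nonterminals:
        # (1.a)  A -> t (A-t), restricted to terminals t that are left corners of A
        for t in terminals:
            if (t, a) in reach:
                out.append((a, [t, (a, t)]))
        # (1.b)  (A-X) -> beta (A-B)  for each production B -> X beta of G
        for b, rhs in productions:
            if rhs and (rhs[0], a) in reach and (b, a) in reach:
                out.append(((a, rhs[0]), rhs[1:] + [(a, b)]))
        # (1.c)  (A-A) -> epsilon
        out.append(((a, a), []))
    return out


# Toy usage example (grammar invented for illustration):
G = [("S", ["NP", "VP"]), ("NP", ["Det", "N"]),
     ("Det", ["the"]), ("N", ["dog"]), ("VP", ["barks"])]
print(lc1_transform(G, {"S", "NP", "VP", "Det", "N"}, {"the", "dog", "barks"}))
```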
This ease of encoding contrasts with LR(k) methods. In LR(k) parsing a single LR state may correspond to several items or dotted rules, so it is not clear how the feature "unification" constraints should be associated with transitions from LR state to LR state (see Nakazawa (1995) for one proposal). In contrast, extending the techniques described here to complex feature-based "unification" grammars is straightforward.</Paragraph> <Paragraph position="1"> The main complication is the filter on useless non-terminals and productions just discussed. Generalizing the left-corner closure filter on pair categories to complex feature "unification" grammars in an efficient way is complicated, and is the primary difficulty in using left-corner methods with complex feature-based grammars. van Noord (1997) provides a detailed discussion of methods for using such a "left-corner filter" in unification-grammar parsing, and the methods he discusses are used in the implementation described below.</Paragraph> </Section> </Section> <Section position="5" start_page="620" end_page="622" type="metho"> <SectionTitle> 3 Extended left-corner transforms </SectionTitle> <Paragraph position="0"> This section presents some simple extensions to the basic left-corner transform presented above. The 'tail-recursion' optimization permits bounded-stack parsing of both left-linear and right-linear constructions.</Paragraph> <Paragraph position="1"> Further manipulation of this transform puts it into a form in which we can identify precisely the tree configurations in the original grammar which cause the stack size of a left-corner parser to increase. These observations motivate the special binarization methods described in the next section, which minimize stack depth in grammars that contain productions of length no greater than two.</Paragraph> <Section position="1" start_page="621" end_page="621" type="sub_section"> <SectionTitle> 3.1 A tail-recursion optimization </SectionTitle> <Paragraph position="0"> If G is a left-linear grammar, a top-down parser using ℒC1(G) can recognize any string generated by G with a constant-bounded stack size. However, the corresponding operation with right-linear grammars requires a stack of size proportional to the length of the string, since the stack fills with pair categories A-A for each non-left-corner nonterminal in the analysis tree.</Paragraph> <Paragraph position="1"> The 'tail recursion' or 'composition' optimization (Abney and Johnson, 1991; Resnik, 1992) permits right-branching structures to be parsed with bounded stack depth. It is the result of epsilon removal applied to the output of ℒC1, and can be described in terms of resolution or partial evaluation of the transformed grammar with respect to productions (1.c). In effect, the schema (1.b) is split into two cases, depending on whether or not the rightmost nonterminal A-B is expanded by the epsilon rules produced by schema (1.c). This expansion yields a grammar ℒC2(G) = (N′, T, P2, S), where P2 contains all productions of the form (2.a-2.c). (In these schemata A, B ∈ N; a ∈ T; X ∈ N ∪ T; and β ∈ (N ∪ T)*.)</Paragraph> <Paragraph position="2"> A → a A-a (2.a) A-X → β A-B, where B → Xβ ∈ P (2.b) A-X → β, where A → Xβ ∈ P (2.c)</Paragraph> <Paragraph position="3"> Figure 1 shows the effect of the transform ℒC2 on the example tree. The maximum stack depth required for this tree is 2.
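To connect this back to the stack-depth restriction of Section 1.4, the sketch below (again an illustration in Python, not the paper's implementation) enumerates the top-down parser's stack states for a given grammar up to a user-specified depth bound and records the resulting finite-state transitions. It uses the same grammar representation as the ℒC1 sketch above and keeps the traditional treatment in which terminals are pushed onto the stack and then popped.

```python
# Sketch of the finite-state approximation of Section 1.4: stack states of the
# top-down parser are enumerated up to a fixed depth bound, giving a finite
# automaton whose transitions either consume a terminal (popping it) or expand
# the topmost category by a production (an epsilon move).

from collections import deque


def fsa_approximation(productions, terminals, start, max_depth=5):
    """Return (states, transitions, initial, final) with stacks as states.
    Transitions are (stack, label, next_stack); label None marks epsilon moves."""
    by_lhs = {}
    for lhs, rhs in productions:
        by_lhs.setdefault(lhs, []).append(rhs)

    initial, final = (start,), ()
    states, transitions = {initial, final}, set()
    agenda = deque([initial])
    while agenda:
        stack = agenda.popleft()
        if not stack:
            continue
        top, rest = stack[0], stack[1:]
        if top in terminals:                      # shift: consume the terminal
            targets = [(top, rest)]
        else:                                     # expand top by each production
            targets = [(None, tuple(rhs) + rest) for rhs in by_lhs.get(top, [])]
        for label, nxt in targets:
            if len(nxt) > max_depth:              # the user-specified stack bound
                continue
            transitions.add((stack, label, nxt))
            if nxt not in states:
                states.add(nxt)
                agenda.append(nxt)
    return states, transitions, initial, final
```

If no expansion is ever blocked by the depth bound, the automaton is simply the full configuration graph of the top-down parser and accepts exactly the language of the input grammar, which corresponds to the situation in which the system reports that the FSA it returns is exact (footnote 1).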
When this 'tail recursion' optimization is applied, pair categories in the transformed grammar encode proper left-corner relationships between nodes in the analysis tree. This lets us strengthen the 'useless category' filter described above as follows. Let ≺+ be the transitive closure of the left-corner relation ≺ defined above. It is easy to show that a category A-X is useless in ℒC2(G) (i.e., derives no sequence of terminals) unless X ≺+ A.</Paragraph> <Paragraph position="4"> Thus we can restrict the productions in (2.a-2.b), without affecting the language (strongly) generated, to just those that only contain pair categories A-X where X ≺+ A.</Paragraph> </Section> <Section position="2" start_page="621" end_page="622" type="sub_section"> <SectionTitle> 3.2 The special case of binary productions </SectionTitle> <Paragraph position="0"> We can get a better idea of the properties of transformation ℒC2 if we investigate the special case where the productions of G are unary or binary. In this situation, the transformation ℒC2(G) can be written more explicitly as ℒC3(G) = (N′, T, P3, S), where P3 contains all productions of the forms (3.a-3.e); the schemata (3.b-3.c) and (3.d-3.e) deal with unary and binary productions respectively in the original grammar. (In these schemata A, B ∈ N; a ∈ T; and X, Y ∈ N ∪ T.) A → a A-a (3.a) A-X → A-B, where B → X ∈ P (3.b) A-X → ε, where A → X ∈ P (3.c) A-X → Y A-B, where B → X Y ∈ P (3.d) A-X → Y, where A → X Y ∈ P (3.e) Now, note that nonterminals from N only appear in the right hand sides of productions of type (3.d) and (3.e). Moreover, any such nonterminals must be immediately expanded by a production of type (3.a). Thus these non-terminals are eliminable by resolving them with (3.a); the only remaining nonterminal is the start symbol S.</Paragraph> <Paragraph position="1"> This expansion yields a new transform ℒC4, where ℒC4(G) = ({S} ∪ (N × (N ∪ T)), T, P4, S). P4, defined in (4.a-4.g), still contains productions of type (3.a), but these only expand the start symbol, as all occurrences of nonterminals in N have been resolved away. (In these schemata a ∈ T; A, B, C, D ∈ N; and X ∈ N ∪ T.)</Paragraph> <Paragraph position="2"> In the production schemata defining ℒC4, (4.a-4.c) are copied directly from (3.a-3.c) respectively. The schemata (4.d-4.e) are obtained by instantiating Y in (3.d-3.e) to a terminal a ∈ T, while the other two schemata (4.f-4.g) are obtained by instantiating Y in (3.d-3.e) with the right hand sides of (3.a). Figure 1 shows the result of applying the transformation ℒC4 to the example analysis tree t.</Paragraph> <Paragraph position="3"> The transform also simplifies the specification of finite-state machine approximations. Because all terminals are introduced as the left-most symbols in their productions, there is no need for terminal symbols to appear on the parser's stack, saving an epsilon transition associated with a stack push and an immediately following stack pop with respect to the standard left-corner algorithm. Productions (4.a) and (4.d-4.g) can be understood as transitions over a terminal a that replace the top stack element with a sequence of other elements, while the other productions can be interpreted as epsilon transitions that manipulate the stack contents accordingly.</Paragraph> <Paragraph position="4"> Note that the right hand sides of all of these productions except for schema (4.f) are right-linear. Thus instances of this schema are the only productions that can increase the stack size in a top-down parse with ℒC4(G), and the stack depth required to parse an analysis tree is the maximum number of "zig-zag" patterns in the path in the analysis tree from any terminal node to the root.
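The following sketch spells out, in the same illustrative Python setting as above, one way of reading the construction just described: ℒC2 is specialized to unary and binary productions and the bare non-terminals on right-hand sides are resolved away against schema (3.a), so that every remaining production either introduces a terminal or manipulates pair categories. The schema labels in the comments follow the derivation in the text and are this sketch's own reading of it, not a quotation of the paper's schemata; the left-corner filter of Sections 2.1 and 3.1 is omitted for brevity.

```python
# Illustrative sketch of the specialized transform LC4 for a CFG whose
# productions are all unary or binary (no epsilon productions), following the
# resolution steps described in the text.

def lc4_transform(productions, nonterminals, terminals, start):
    """Return the productions of LC4(G); pair categories are (A, X) tuples."""
    out = []
    # (4.a)  S -> a (S-a): only the start symbol keeps a type-(3.a) production
    for a in terminals:
        out.append((start, [a, (start, a)]))
    for A in nonterminals:
        for B, rhs in productions:
            if len(rhs) == 1:                       # unary production B -> X
                X = rhs[0]
                out.append(((A, X), [(A, B)]))      # (4.b)
                if B == A:
                    out.append(((A, X), []))        # (4.c)  A-X -> epsilon
            elif len(rhs) == 2:                     # binary production B -> X Y
                X, Y = rhs
                if Y in terminals:
                    out.append(((A, X), [Y, (A, B)]))            # (4.d)
                    if B == A:
                        out.append(((A, X), [Y]))                # (4.e)
                else:                               # Y is a nonterminal: splice
                    for a in terminals:             # in a type-(3.a) expansion
                        out.append(((A, X), [a, (Y, a), (A, B)]))  # (4.f)
                        if B == A:
                            out.append(((A, X), [a, (Y, a)]))      # (4.g)
    return out
```

Note that only the rule marked (4.f) places two pair categories on its right-hand side, which is what ties it to the stack-increasing "zig-zag" configuration discussed next.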
Figure 2 sketches the configuration of nodes in the analysis trees in which instances of schema (4.f) would be used in a parse using ℒC4(G). This highly distinctive "zig-zag" or "lightning bolt" pattern does not occur at all in the example tree t in Figure 1, so the maximum required stack depth is 2. (Recall that in a traditional top-down parser terminals are pushed onto the stack and popped later, so the initialization productions (4.a) cause two symbols to be pushed onto the stack.) It follows that this finite-state approximation is exact for left-linear and right-linear CFGs. Indeed, analysis trees that consist simply of a left-branching subtree followed by a right-branching subtree, such as the example tree t, are transformed into strictly right-branching trees by ℒC4.</Paragraph> </Section> </Section> <Section position="6" start_page="622" end_page="622" type="metho"> <SectionTitle> 4 Implementation </SectionTitle> <Paragraph position="0"> This section provides further details of the finite-state approximator implemented in this research.</Paragraph> <Paragraph position="1"> The approximator is written in Sicstus Prolog. It takes a user-specified Definite Clause Grammar G (without Prolog annotations) as input, which it binarizes and to which it then applies the transform ℒC4.</Paragraph> <Paragraph position="2"> The implementation annotates each transition with the production it corresponds to (represented as a pair of an ℒC4 schema number and a production number from G), so the finite-state approximation actually defines a transducer which transduces a lexical input to a sequence of productions that specify a parse of that input with respect to ℒC4(G).</Paragraph> <Paragraph position="3"> A subsequent program inverts the tree transform ℒC4, returning a corresponding parse tree with respect to G. This parse tree can be checked by performing complete unifications with respect to the original grammar productions if so desired. Thus the finite-state approximation provides an efficient way of determining whether an analysis of a given input string with respect to a unification grammar G exists, and if so, it can be used to suggest such analyses.</Paragraph> </Section> </Paper>
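As a closing illustration of how such an approximation can be used as a quick filter for "does some analysis exist?" (cf. the last paragraph of Section 4), the sketch below runs the automaton produced by the fsa_approximation sketch above over an input string. The transducer annotations and the inversion of the tree transform performed by the actual Prolog system are not modelled here; the grammar, symbol names, and composition of the earlier sketches in the usage comment are illustrative assumptions.

```python
# Simulate the finite automaton returned by fsa_approximation: transitions are
# (state, label, next_state) triples, with label None marking an epsilon move,
# so recognition takes an epsilon closure before and after consuming each word.

def eps_closure(states, transitions):
    closure = set(states)
    agenda = list(states)
    while agenda:
        s = agenda.pop()
        for src, label, dst in transitions:
            if src == s and label is None and dst not in closure:
                closure.add(dst)
                agenda.append(dst)
    return closure


def recognize(words, transitions, initial, final):
    current = eps_closure({initial}, transitions)
    for w in words:
        current = eps_closure(
            {dst for src, label, dst in transitions
             if src in current and label == w},
            transitions)
        if not current:
            return False
    return final in current


# Hypothetical end-to-end usage, composing the earlier sketches:
#   lc4 = lc4_transform(G, {"S", "NP", "VP", "Det", "N"},
#                       {"the", "dog", "barks"}, "S")
#   states, transitions, q0, qf = fsa_approximation(
#       lc4, {"the", "dog", "barks"}, "S", max_depth=5)
#   recognize("the dog barks".split(), transitions, q0, qf)   # -> True
```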