<?xml version="1.0" standalone="yes"?>
<Paper uid="J95-3005">
  <Title>Squibs and Discussions Memoization in Top-Down Parsing</Title>
  <Section position="2" start_page="0" end_page="409" type="metho">
    <SectionTitle>
2. Formalizing Context-Free Grammars
</SectionTitle>
    <Paragraph position="0"> It is fairly straightforward to implement a top-down parser in a functional programming language. The key insight is that a nonterminal category A in a grammar defines a function fA that maps a string position 1 in the input string 7 to a set of string positions fA(l) such that r C fA(1) iff A can derive the substring of &amp;quot;7 spanning string positions I to r (see e.g., Leermakers \[1993\] for discussion).</Paragraph>
    <Paragraph position="1"> For example, suppose V, gP, and S are already bound to fv, fwP and fs, and the grammar contains the following productions with VP on the left hand side.</Paragraph>
    <Paragraph position="2">  (1) VP --+ V NP VP --+ V S Then the following Scheme definition binds vp to fvP.</Paragraph>
    <Paragraph position="3"> (2) (define (VP p) (union (reduce union '() (map NP (V p)))  (reduce union '() (map S (V p)))))) If sets are represented by unordered lists, union can be given the following definition. The function reduce is defined such that an expression of the form (reduce  haves as a depth-first, top-down recognizer in which nondeterminism is simulated by backtracking. For example, in (2) the sequence V NP is first investigated as a potential analysis of VP, and then the sequence V S is investigated.</Paragraph>
    <Paragraph position="4"> Rather than defining the functions f by hand as in (2), higher-order functions can be introduced to automate this task. It is convenient to use suffixes of the input string to represent the string positions of the input string (as in DCGs). The expression (terminal x) evaluates to a function that maps a string position I to the singleton set { r } iff the terminal x spans from I to r, and the empty set otherwise.  Mark Johnson Memoization in Top-Down Parsing (5) (define (terminal X)</Paragraph>
    <Paragraph position="6"> The expression (seq fA fB) evaluates to a function that maps a string position 1 to the set of string positions {ri} such that there exists an m 6 fA(1), and ri 6 fB(rrl). Informally, the resulting function recognizes substrings that are the concatenation of a substring recognized by fA and a substring recognized by f~.</Paragraph>
    <Paragraph position="8"> (reduce union '() (map B (A p))))) The expression (alt fA fB) evaluates to a function that maps a string position 1 to fa(l) U fB(1). Informally, the resulting function recognizes the union of the substrings recognized by fA and fB.</Paragraph>
    <Paragraph position="10"> While terminal, seq, and alt suffice to define (epsilon-free) context-free grammars, we can easily define other useful higher-order functions. For example, epsilon recognizes the empty string (i.e., it maps every string position 1 into the singleton set {1}), (opt fA) recognizes an optional constituent, and (k* f,O recognizes zero or more occurrences of the substrings recognized by fA.</Paragraph>
    <Paragraph position="11">  These higher-order functions can be used to provide simpler definitions, such as (2a) or (2b), for the function VP defined in (2) above.</Paragraph>
    <Paragraph position="13"> This method of defining the functions corresponding to categories is quite appealing.</Paragraph>
    <Paragraph position="14"> Unfortunately, Scheme is deficient in that it does not allow mutually recursive functional definitions of the kind in (2a) or (2b). For example, suppose S is defined as in  (11) and VP is defined as in (2a).</Paragraph>
    <Paragraph position="15"> (11) (define S (seq NP VP))  Computational Linguistics Volume 21, Number 3 Further, suppose (11) precedes (2a) textually in the program. Then the variable VP in (11) will be incorrectly interpreted as unbound. Changing the order of the definitions will not help, as then the variable S will be unbound. ~ A work-around is to add a vacuous lambda abstraction and application as in (11a), in effect delaying the evaluation of function definition.</Paragraph>
    <Paragraph position="16">  (11a) (define S (lambda args (apply (seq NP VP) args))) With a macro definition such as (12) (named to remind us of this deficiency in the current Scheme specification and perhaps encourage the language designers to do better in the future), the definition of functions such as (11a) can be written as (11b). (12) (define-syntax vacuous</Paragraph>
    <Paragraph position="18"> (lambda args (apply fn args))))) (11b) (define S (vacuous (seq NP VP))) Figure 1 contains a fragment defined in this way. After these definitions have been loaded, an expression such as the one in (13) can be evaluated. It returns a list of the input string's suffixes that correspond to the right string position of an S. (13) &gt; (s '(Kim knows every student likes Sandy)) ((likes sandy) ()) In example (13), the list resulting from the evaluation contains two suffixes, corresponding to the fact that both Kim knows every student and Kim knows every student likes Sandy can be analysed as Ss.</Paragraph>
    <Paragraph position="19"> Finally, the recognize predicate can be defined as follows. The expression (recognize words) is true iff words is a list of words that can be analysed as an S, i.e., if the empty string is a one of right string positions of an S whose left string position is the whole string to be recognized.</Paragraph>
    <Paragraph position="20">  (14) (define (recognize words) (member '() (S words))) 3. Memoization and Left Recursion  As noted above, the Scheme functions defined in this way behave as top-down, back-tracking recognizers. It is well known that such parsing methods suffer from two major problems.</Paragraph>
    <Paragraph position="21"> 1 This problem can arise even if syntactic constructions specifically designed to express mutual recursion are used, such as letrec. Although these variables are closed over, their values are not applied when the defining expressions are evaluated, so such definitions should not be problematic for an applicative-order evaluator. Apparently Scheme requires that mutually recursive functional expressions syntactically contain a lambda expression. Note that this is not a question of reduction strategy (e.g., normal-order versus applicative-order), but an issue about the syntactic scope of variables.  A CFG &amp;agmentdefined using the highe~orderconstructors.</Paragraph>
    <Paragraph position="22"> First, a top-down parser using a left-recursive grammar typically fails to terminate on some inputs. This is true for recognizers defined in the manner just described; left-recursive grammars yield programs that contain ill-founded recursive definitions. 2 Second, backtracking parsers typically involve a significant amount of redundant computation, and parsing time is exponential in the length of the input string in the worst case. Again, this is also true for the recognizers just described. Memoization is a standard technique for avoiding redundant computation, and as Norvig (1991) noted, it can be applied to top-down recognizers to convert exponentialtime recognizers into polynomial-time recognizers.</Paragraph>
    <Paragraph position="23"> A general way of doing this is by defining a higher-order procedure memo that takes a function as an argument and returns a memoized version of it. 3 This procedure is  essentially the same as the memoize predicate that is extensively discussed in Abelson and Sussman (1985).</Paragraph>
    <Paragraph position="24"> (15) (define (memo fn)  To memoize the recognizer, the original definitions of the functions should be replaced with their memoized counterparts; e.g., (llb) should be replaced with (11c). Clearly these definitions could be further simplified with suitable macro definitions or other 'syntactic sugar.' 2 Specifically, if A is a Scheme variable bound to the function corresponding to a left-recursive category, then for any string position p the expression (A p) reduces to another expression containing (A p). Thus the (applicative-order) reduction of such expressions does not terminate. 3 For simplicity, the memo procedure presented in (15) stores the memo table as an association list, in general resulting in a less than optimal implementation. As Norvig notes, more specialized data structures, such as hash tables, can improve performance. In the parsing context here, optimal performance would probably be obtained by encoding string positions with integers, allowing memo table lookup to be a single array reference.</Paragraph>
    <Paragraph position="25">  Computational Linguistics Volume 21, Number 3 (11c) (define S (memo (vacuous (seq NP VP)))) As an aside, it is interesting to note that memoization can be applied selectively in this approach. For example, because of the overhead of table lookup in complex feature-based grammars, it might be more efficient not to memoize all categories, but rather restrict memoization to particular categories such as NP and S.</Paragraph>
    <Paragraph position="26"> Now we turn to the problem of left recursion. In a logic programming setting, memoization (specifically, the use of Earley deduction) avoids the nontermination problems associated with left recursion, even when used with the DCG axiomatization of a left-recursive grammar. But as Norvig mentions in passing, with parsers defined in the manner just described, the memoized versions of programs derived from left-recursive grammars fail to terminate.</Paragraph>
    <Paragraph position="27"> It is easy to see why. A memo-ed procedure constructs an entry in a memo table only after the result of applying the unmemoized function to its arguments has been computed. Thus in cases of left recursion, memoization does nothing to prevent the ill-founded recursion that leads to nontermination.</Paragraph>
    <Paragraph position="28"> In fact it is not clear how memoization could help in these cases, given that we require that memo behaves semantically as the identity function; i.e., that (memo f) and f are the same function. Of course, we could try to weaken this identity requirement (e.g., by only requiring that (fx) and ((memo f) x) are identical when the reduction of the former terminates), but it is not clear how to do this systematically. Procedurally speaking, it seems as if memoization is applying 'too late' in the left-recursive cases; reasoning by analogy with Earley deduction, we need to construct an entry in the memo table when such a function is called; not when the result of its evaluation is known. Of course, in the left recursive cases this seems to lead to an inconsistency, since these are cases where the value of an expression is required to compute that very value.</Paragraph>
    <Paragraph position="29"> Readers familiar with Abelson and Sussman (1985) will know that in many cases it is possible to circumvent such apparent circularity by using asynchronous 'lazy streams' in place of the list representations (of string positions) used above. The continuation-passing style encoding of CFGs discussed in the next section can be seen as a more functionally oriented instantiation of this kind of approach.</Paragraph>
  </Section>
  <Section position="3" start_page="409" end_page="409" type="metho">
    <SectionTitle>
4. Formalizing Relations in Continuation-Passing Style
</SectionTitle>
    <Paragraph position="0"> The apparent circularity in the definition of the functions corresponding to left-recursive categories suggests that it may be worthwhile reformulating the recognition problem in such a way that the string position results are produced incrementally, rather than in one fell swoop, as in the formalization just described. The key insight is that each nonterminal category A in a grammar defines a relation rA such that rA(l, r) iff A can derive the substring of the input string spanning string positions I to r. 4 Informally speaking, the r can be enumerated one at a time, so the fact that the calculation of rA(l, r) requires the result rA(l, r') need not lead to a vicious circularity.</Paragraph>
    <Paragraph position="1"> One way to implement this in a functional programming language is to use a 'Continuation-Passing Style' (CPS) of programming, s It turns out that a memoized</Paragraph>
  </Section>
  <Section position="4" start_page="409" end_page="411" type="metho">
    <SectionTitle>
4 The relation rA and the function fA mentioned above satisfy: for all l and r, rA(l, r) iff r ∈ fA(l). 5 Several readers of this paper, including a reviewer, suggested that this can be formulated more
</SectionTitle>
    <Paragraph position="0"> succinctly using Scheme's call/cc continuation-constructing primitive. After this paper was accepted for publication, Jeff Sisskind devised an implementation based on call/cc which does not require continuations to be explicitly passed as arguments to functions.</Paragraph>
    <Paragraph position="1">  Mark Johnson Memoization in Top-Down Parsing top-down parser written in continuation-passing style will in fact terminate, even in the face of left recursion. Additionally, the treatment of memoization in a CPS is instructive because it shows the types of table lookup operations needed in chart parsing.</Paragraph>
    <Paragraph position="2"> Informally, in a CPS program an additional argument, call it c, is added to all functions and procedures. When these functions and procedures are called c is always bound to a procedure (called the continuation); the idea is that a result value v is 'returned' by evaluating (c v). For example, the standard definition of the function square in (16) would be rewritten in CPS as in (17). (18) shows how this definition could be used to compute and display (using the Scheme builtin display) the square of the number 3.</Paragraph>
    <Paragraph position="3">  (16) (define (square x) (* x x)) (17) (define (square cont x) (cont (* x x))) (18) &gt; (square display 3)  Thus whereas result values in a non-CPS program flow 'upwards' in the procedure call tree, in a CPS program result values flow 'downwards' in the procedure call tree. 6,7 The CPS style of programming can be used to formalize relations in a pure functional language as procedures that can be thought of as 'returning' multiply valued results any number of times.</Paragraph>
    <Paragraph position="4"> These features of CPS can be used to encode CFGs as follows. Each category A is associated with a function gA that represents the relation rA, i.e., (gA C I) reduces (in an applicative-order reduction) in such a fashion that at some stage in the reduction the expression (c r) is reduced iff A can derive the substring spanning string positions I to r of the input string. (The value of (gA c I) is immaterial and therefore unspecified, but see footnote 8 below). That is, if (gA C I) is evaluated with l bound to the left string position of category A, then (c r) will be evaluated zero or more times with r bound to each of A's right string positions r corresponding to I.</Paragraph>
    <Paragraph position="5"> For example, a CPS function recognizing the terminal item 'will' (arguably a future auxiliary in a class of its own) could be written as in (19).</Paragraph>
    <Paragraph position="6"> (19) (define (future-aux continuation pos) (if (and (pair? pos) (eq? (car pos) (continuation (cdr pos)))) 'will)) For a more complicated example, consider the two rules defining VP in the fragment above, repeated here as (20). These could be formalized as the CPS function defined in (21).</Paragraph>
    <Paragraph position="7">  Computational Linguistics Volume 21, Number 3 In this example V, NP, and S are assumed to have CPS definitions. Informally, the expression (lambda (poe1) (NP continuation posl)) is a continuation that specifies what to do if a V is found, viz., pass the V's right string position posl to the NP recognizer as its left-hand string position, and instruct the NP recognizer in turn to pass its right string positions to continuation.</Paragraph>
    <Paragraph position="8"> The recognition process begins by passing the function corresponding to the root category the string to be recognized, and a continuation (to be evaluated after successful recognition) that records the successful analysis. 8  Thus rather than constructing a set of all the right string positions (as in the previous encoding), this encoding exploits the ability of the CPS approach to 'return' a value zero, one or more times (corresponding to the number of right string positions). And although it is not demonstrated in this paper, the ability of a CPS procedure to 'return' more than one value at a time can be used to pass other information besides right string position, such as additional syntactic features or semantic values.</Paragraph>
    <Paragraph position="9"> Again, higher-order functions can be used to simplify the definitions of the CPS functions corresponding to categories. The CPS versions of the terminal, se% and alt functions are given as (23), (25), and (24) respectively.</Paragraph>
    <Paragraph position="10"> (23) (define (terminal word) (lambda (continuation poe) (if (and (pair? poe) (eq? (car poe) word)) (continuation (cdr poe))))) 8 Thus this formaliza~on makes use of mutability to return final results, and so cannot be expressed in a purely func~onal language. Howeve~ it is possible to construct a similiar formalization in the purely functional subset of Scheme by passing around an additional 'result' argument (here the last argument). The examples above would be rewritten as the following under this approach.  If these three functions definitions replace the earlier definitions given in (5), (6), and (7), the fragment in Figure I defines a CPS recognizer. Note that just as in the first CFG encoding, the resulting program behaves as a top-down recognizer. Thus in general these progams fail to terminate when faced with a left-recursive grammar for essentially the same reason: the procedures that correspond to left-recursive categories involve ill-founded recursion.</Paragraph>
  </Section>
  <Section position="5" start_page="411" end_page="415" type="metho">
    <SectionTitle>
5. Memoization in Continuation-Passing Style
</SectionTitle>
    <Paragraph position="0"> The memo procedure defined in (15) is not appropriate for CPS programs because it associates the arguments of the functional expression with the value that the expression reduces to, but in a CPS program the 'results' produced by an expression are the values it passes on to the continuation, rather than the value that the expression reduces to. That is, a memoization procedure for a CPS procedure should associate argument values with the set of values that the unmemoized procedure passes to its continuation. Because an unmemoized CPS procedure can produce multiple result values, its memoized version must store not only these results, but also the continuations passed to it by its callers, which must receive any additional results produced by the original unmemoized procedure.</Paragraph>
    <Paragraph position="1"> The cps-memo procedure in (26) achieves this by associating a table entry with each set of argument values that has two components; a list of caller continuations and a list of result values. The caller continuation entries are constructed when the memoized procedure is called, and the result values are entered and propagated back to callers each time the unmemoized procedure 'returns' a new value. 9  Specifically, when the memoized procedure is called, continuation is bound to the continuation passed by the caller that should receive 'return' values, and args is bound to a list of arguments that index the entry in the memo table and are passed to the unmemoized procedure cps-fn if evaluation is needed. The memo table table initially associates every set of arguments with empty caller continuation and empty result value sets. The local variable entry is bound to the table entry that corresponds to args; the set of caller continuations stored in entry is null iff the memoized function has not been called with this particular set of arguments before.</Paragraph>
    <Paragraph position="2"> The cond clause determines if the memoized function has been called with args before by checking if the continuations component of the table entry is nonempty.</Paragraph>
    <Paragraph position="3"> In either case, the caller continuation needs to be stored in the continuations component of the table entry, so that it can receive any additional results produced by the unmemoized procedure.</Paragraph>
    <Paragraph position="4"> If the memoized procedure has not been called with args before, it is necessary to call the unmemoized procedure cps-fn to produce the result values for args. The continuation passed to cps-fn checks to see if each result of this evaluation is subsumed by some other result already produced for this entry; if it is not, it is pushed onto the results component of this entry, and finally passed to each caller continuation associated with this entry.</Paragraph>
    <Paragraph position="5"> If the memoized procedure has been called with args before, the results associated with this table entry can be reused. After storing the caller continuation in the table entry, each result already accumulated in the table entry is passed to the caller continuation.</Paragraph>
    <Paragraph position="6"> Efficient implementations of the table and entry manipulation procedures would be specialized for the particular types of arguments and results used by the unmemoized procedures. Here we give a simple and general, but less than optimal, implementation using association lists. 1deg 10 This formalization makes use of 'impure' features of Scheme, specifically destructive assignment to add an element to the table list (which is why this list contains the dummy element &amp;quot;head*). Arguably,  Mark Johnson Memoization in Top-Down Parsing A table is a headed association list (27), which is extended as needed by table-ref (28). In this fragment there are no partially specified arguments or results (such as would be involved if the fragment used feature structures), so the subsumption relation is in fact equality.</Paragraph>
    <Paragraph position="7">  (member result (entry-results entry))) As claimed above, the memoized version of the CPS top-down parser does terminate, even if the grammar is left-recursive. Informally, memoized CPS top-down parsers terminate in the face of left-recursion because they ensure that no unmemoized procedure is ever called twice with the same arguments. For example, we can replace the definition of NP in the fragment with the left-recursive one given in (35) without compromising termination, as shown in (36) (where the input string is meant to approximate Kim's professor knows every student).</Paragraph>
    <Paragraph position="8">  Computational Linguistics Volume 21, Number 3 Memoized CPS top-down recognizers do in fact correspond fairly closely to chart parsers. Informally, the memo table for the procedure corresponding to a category A will have an entry for an argument string position 1 just in case a predictive chart parser predicts a category A at position l, and that entry will contain string position r as a result just in case the corresponding chart contains a complete edge spanning from l to r. Moreover, the evaluation of the procedure PA corresponding to a category A at string position l corresponds to predicting A at position l, and the evaluation of the caller continuations corresponds to the completion steps in chart parsing. The CPS memoization described here caches such evaluations in the same way that the chart caches predictions, and the termination in the face of left recursive follows from the fact that no procedure PA is ever called with the same arguments twice. Thus given a CPS formalization of the parsing problem and an appropriate memoization technique, it is in fact the case that &amp;quot;the maintenance of well-formed substring tables or charts can be seen as a special case of a more general technique: memoization&amp;quot; (Norvig 1991), even if the grammar contains left recursion.</Paragraph>
  </Section>
class="xml-element"></Paper>