<?xml version="1.0" standalone="yes"?>
<Paper uid="J95-3005">
  <Title>Squibs and Discussions: Memoization in Top-Down Parsing</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
1. Introduction
</SectionTitle>
    <Paragraph position="0"> In a paper published in this journal, Norvig (1991) pointed out that memoization of a top-down recognizer program produces a program that behaves similarly to a chart parser. This is not surprising to anyone familiar with logic-programming approaches to natural language processing (NLP). For example, the Earley deduction proof procedure is essentially a memoizing version of the top-down selected literal deletion (SLD) proof procedure employed by Prolog. Pereira and Warren (1983) showed that the steps of the Earley deduction proof procedure, when proving the well-formedness of a string S from the standard 'top-down' definite clause grammar (DCG) axiomatization of a context-free grammar (CFG) G, correspond directly to those of Earley's algorithm recognizing S using G.</Paragraph>
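    <Paragraph> The observation that memoizing a top-down recognizer yields chart-like behavior can be illustrated with a minimal sketch. It is written in Python rather than the Scheme used in this paper, and the toy grammar, the lexicon, and the names make_recognizer, ends, and recognizes are all assumptions introduced for illustration; it is not Norvig's code. The recognizer returns the set of right string positions for a category starting at a given left position, and the cache keyed on (category, left) plays a role loosely comparable to the entries of a chart. The grammar is deliberately free of left recursion, since this set-returning formulation would not terminate otherwise.

```python
from functools import lru_cache

# Hypothetical toy grammar and lexicon (not from the paper); no rule is left-recursive.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N")],
    "VP": [("V", "NP"), ("V",)],
}
LEXICON = {"the": "Det", "dog": "N", "cat": "N", "saw": "V"}


def make_recognizer(words):
    """Return a memoized function mapping (category, left) to a set of right positions."""
    words = tuple(words)

    @lru_cache(maxsize=None)  # one cached answer per (category, left) pair
    def ends(category, left):
        if category not in GRAMMAR:
            # Lexical category: consume one word if its tag matches.
            if left != len(words) and LEXICON.get(words[left]) == category:
                return frozenset({left + 1})
            return frozenset()
        results = set()
        for rhs in GRAMMAR[category]:
            # Thread the set of reachable positions through the right-hand side.
            positions = {left}
            for child in rhs:
                positions = {r for p in positions for r in ends(child, p)}
            results |= positions
        return frozenset(results)

    return ends


def recognizes(words, start="S"):
    """True if the whole word sequence can be analyzed as the start category."""
    words = list(words)
    return len(words) in make_recognizer(words)(start, 0)
```

    For example, recognizes("the dog saw the cat".split()) is true, and the underlying call ends("S", 0) yields the positions 3 and 5, because both "the dog saw" and the full string are sentences under this toy grammar.</Paragraph>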
    <Paragraph position="1"> Yet, as Norvig notes in passing, the parsers that result from his approach in general fail to terminate on left-recursive grammars, even with memoization. The goal of this paper is to discover why this is the case and to present a functional formalization of memoized top-down parsing for which this is not so. Specifically, I show how to formulate top-down parsers in a 'continuation-passing style,' which incrementally enumerates the right string positions of a category rather than returning a set of such positions as a single value. This permits a type of memoization that, to my knowledge, has not previously been described in the context of functional programming. This kind of memoization is akin to that used in logic programming, and it yields terminating parsers even in the face of left recursion.</Paragraph>
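    <Paragraph> The continuation-passing idea can be sketched roughly as follows. This is Python rather than the paper's Scheme, and every name in it (memoize_cps, make_expression_parser, end_positions, recognizes) as well as the toy left-recursive grammar E -> E "+" "n" | "n" is an assumption made for illustration, not the code developed in the paper. Each parser takes a left position and a continuation, and calls the continuation once per right position it finds. The memo table stores, for each left position, both the right positions found so far and the continuations waiting for them, so a left-recursive call merely registers a continuation instead of re-entering the rule.

```python
def memoize_cps(parser):
    """Memoize a CPS parser: per left position, keep results and waiting continuations."""
    table = {}

    def memoized(left, cont):
        entry = table.get(left)
        if entry is None:
            # First call here: record the caller, then run the real parser once.
            results, conts = table[left] = ([], [cont])

            def on_result(right):
                # Pass each new right position to every continuation seen so far.
                if right not in results:
                    results.append(right)
                    for waiting in list(conts):
                        waiting(right)

            parser(left, on_result)
        else:
            # Later call (including a left-recursive one): register and replay.
            results, conts = entry
            conts.append(cont)
            for right in list(results):
                cont(right)

    return memoized


def make_expression_parser(tokens):
    """Build a CPS recognizer for the toy grammar E -> E '+' 'n' | 'n'."""
    def terminal(token):
        def parse(left, cont):
            if left != len(tokens) and tokens[left] == token:
                cont(left + 1)
        return parse

    def seq(first, second):
        return lambda left, cont: first(left, lambda mid: second(mid, cont))

    def alt(first, second):
        def parse(left, cont):
            first(left, cont)
            second(left, cont)
        return parse

    def expr(left, cont):
        # Late-bound reference so the rule can mention itself.
        memo_expr(left, cont)

    memo_expr = memoize_cps(
        alt(seq(expr, seq(terminal("+"), terminal("n"))), terminal("n")))
    return memo_expr


def end_positions(tokens):
    """Right positions of E starting at position 0, collected via a continuation."""
    tokens = list(tokens)
    found = []
    make_expression_parser(tokens)(0, found.append)
    return sorted(found)


def recognizes(tokens):
    tokens = list(tokens)
    return len(tokens) in end_positions(tokens)
```

    In this sketch the memo table is created afresh for each input, because string positions are plain integers; calling recognizes(["n", "+", "n"]) terminates and succeeds even though the first alternative of E begins with E itself.</Paragraph>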
    <Paragraph position="2"> In this paper, algorithms are expressed in the Scheme programming language (Rees and Clinger 1991). Scheme was chosen because it is a popular, widely known language that many readers find easy to understand. Scheme's 'first-class' treatment of functions simplifies the functional abstraction used in this paper, but the basic approach can be implemented in more conventional languages as well. Admittedly, elegance is a matter of taste, but personally I find the functional specification of CFGs described here as simple and elegant as the more widely known logical (DCG) formalization. I hope that the presentation of working code will encourage readers to experiment with the ideas described here and in more substantial works such as Leermakers (1993). In fact, my own observations suggest that with minor modifications (such as the use of integers rather than lists to indicate string positions, and of vectors indexed by string positions rather than lists in the memoization routines) an extremely efficient chart parser can be obtained from the code presented here.</Paragraph>
    <Paragraph position="3"> Ideas related to the ones discussed here have been presented on numerous occasions. Almost 20 years ago Shiel (1976) noticed the relationship between chart parsing and top-down parsing. Leermakers (1993) presents a more abstract discussion of the functional treatment of parsing, and avoids the left-recursion problem for memoized functional parsers by using a 'recursive ascent' or PLR parsing strategy instead of a top-down strategy. At a more abstract level than that of this paper, Shieber, Schabes, and Pereira (1994) show that a variety of well-known parsing algorithms can be viewed as computing the closure of a set of basic parsing operations on a representation of the input string.  * Cognitive Science Department, Brown University, Box 1978, Providence, RI 02912. © 1995 Association for Computational Linguistics. Computational Linguistics, Volume 21, Number 3.</Paragraph>
  </Section>
</Paper>