File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/94/c94-2199_intro.xml
Size: 4,622 bytes
Last Modified: 2025-10-06 14:05:40
<?xml version="1.0" standalone="yes"?> <Paper uid="C94-2199"> <Title>A Simple Transformation for Offline-Parsable Grammars and its Termination Properties</Title> <Section position="4" start_page="1227" end_page="1227" type="intro"> <SectionTitle> 3 Left-recursion elimination </SectionTitle> <Paragraph position="0"> The transformation can be logically divided into two steps: (1) an encoding of DCG into a &quot;generic&quot; form DCG', and (2) a simple replacement of a certain group of left-recursive rules in DCG' by a certain equivalent non left-recursive group of rules, yielding a top-down interpretable DCG''. An example of the transformation DCG → DCG' → DCG'' is given in fig. 2.</Paragraph> <Paragraph position="1"> The encoding is performed by the following algorithm: input: an offline-parsable DCG without empty rules.</Paragraph> <Paragraph position="2"> output: an equivalent &quot;encoding&quot; DCG'.</Paragraph> <Paragraph position="3"> algorithm: initialize LIST to a list of the rules of DCG.</Paragraph> <Paragraph position="4"> initialize DCG' to the list of rules (literally): g(X) --&gt; g(Y), d(Y, X).</Paragraph> <Paragraph position="5"> g(X) --&gt; t(X).</Paragraph> <Paragraph position="6"> while there exists a rule R of the form A(T1, ..., Tk) --&gt; B(S1, ..., Sl) α in LIST do: remove R from LIST.</Paragraph> <Paragraph position="7"> add to DCG' a rule R': d(B(S1, ..., Sl), A(T1, ..., Tk)) --&gt; α', where α' is obtained by replacing any C(V1, ..., Vm) in α by g(C(V1, ..., Vm)), or is set to [ ] in the case where α is empty. while there exists a rule R of the form A(T1, ..., Tk) --&gt; [terminal] α in LIST do: remove R from LIST.</Paragraph> <Paragraph position="8"> add to DCG' a rule R': t(A(T1, ..., Tk)) --&gt; [terminal] α', where α' is obtained by replacing any C(V1, ..., Vm) in α by g(C(V1, ..., Vm)), or is set to [ ] in the case where α is empty. The procedure is very simple. 
It involves the creation of a generic nonterminal g(X), of arity one, which performs a task equivalent to the original nonterminals s(X1, ..., Xn), vp(X1, ..., Xm), .... The goal g(s(X1, ..., Xn)), for instance, plays the same role for parsing a sentence as did the goal s(X1, ..., Xn) in the original grammar.</Paragraph> <Paragraph position="9"> Two further generic nonterminals are introduced: t(X) accounts for rules whose right-hand side begins with a terminal, while d(Y, X) accounts for rules whose right-hand side begins with a nonterminal. The rationale behind the encoding is best understood from the following examples, where ⇒ represents rule rewriting:</Paragraph> <Paragraph position="11"> The second example illustrates the role played by d(Y, X) in the encoding. This nonterminal has the following interpretation: X is an &quot;immediate&quot; extension of Y using the given rule. In other words, Y corresponds to an &quot;immediate left-corner&quot; of X.</Paragraph> <Paragraph position="12"> The left-recursion elimination is now performed by the following &quot;algorithm&quot;: input: a DCG' encoded as above.</Paragraph> <Paragraph position="13"> output: an equivalent non left-recursive DCG''.</Paragraph> <Paragraph position="14"> algorithm: initialize DCG'' to DCG'.</Paragraph> <Paragraph position="15"> in DCG'', replace literally the rules:</Paragraph> <Paragraph position="17"> In this transformation, the new nonterminal d_tc plays the role of a kind of transitive closure of d. 
It can be seen that, relative to DCG'', for any string w and for any ground term z, the fact that g(z) rewrites into w (or, equivalently, that there exists a ground term x such that t(x) d_tc(x, z) rewrites into w) is equivalent to the existence of a sequence of ground terms x = x1, ..., xk = z and a sequence of strings w1, ..., wk such that t(x1) rewrites into w1, d(x1, x2) rewrites into w2, ..., d(xk-1, xk) rewrites into wk, and such that w is the string concatenation w = w1 ... wk. From our previous remark on the meaning of d(Y, X), this can be interpreted as saying that &quot;constituent x is a left-corner of constituent z&quot;, relative to string w. The grammar DCG'' can now be compiled in the standard way, via the adjunction of two &quot;differential list&quot; arguments, into a Prolog program which can be executed directly. If we started from an offline-parsable grammar DCG, this program will enumerate all solutions to the parsing problem and terminate after a finite number of steps.</Paragraph> </Section> </Paper>