<?xml version="1.0" standalone="yes"?>
<Paper uid="C94-2199">
  <Title>A Simple Transformation for Offline-Parsable Grammars and its Termination Properties</Title>
  <Section position="2" start_page="0" end_page="1226" type="abstr">
    <SectionTitle>
1 Motivation
</SectionTitle>
    <Paragraph position="0"> Definite clause grammars (DCGs) are one of the simplest and most widely used unification grammar formalisms. They represent a direct augmentation of context-free grammars (CFGs) through the use of (term) unification (a fact that tends to be masked by their usual presentation based on the programming language Prolog). It is obviously important to ask whether certain usual methods and algorithms pertaining to CFGs can be adapted to DCGs, and this general question informs much of the work concerning DCGs, as well as more complex unification grammar formalisms (to cite only a few areas: Earley parsing, LR parsing, left-corner parsing, Greibach Normal Form).</Paragraph>
    <Paragraph position="1"> One essential complication when trying to generalize CFG methods to the DCG domain lies in the fact that, whereas the parsing problem for CFGs is decidable, the corresponding problem for DCGs is in general undecidable. This can be shown easily as a consequence of the noteworthy fact that any definite clause program can be viewed as a definite clause grammar &quot;on the empty string&quot;, that is, as a DCG where no terminals other than [ ] are allowed on the right-hand sides of rules. The Turing-completeness of definite clause programs therefore implies the undecidability of the parsing problem for this subclass of DCGs, and a fortiori for DCGs in general.¹ In order to guarantee good computational properties for DCGs, it is then necessary to impose certain restrictions on their form such as offline-parsability (OP), a nomenclature introduced by Pereira and Warren [11], who define an OP DCG as a grammar whose context-free skeleton CFG is not infinitely ambiguous, and show that OP DCGs lead to a decidable parsing problem.²</Paragraph>
    <Paragraph position="2"> Our aim in this paper is to propose a simple transformation for an arbitrary OP DCG putting it into a form which leads to the completeness of the direct top-down interpretation by the standard Prolog interpreter: parsing is guaranteed to enumerate all solutions to the parsing problem and terminate. The existence of such a transformation is known: in [1, 2], we have recently introduced a &quot;Generalized Greibach Normal Form&quot; (GGNF) for DCGs, which leads to termination of top-down interpretation in the OP case. However, the available presentation of the GGNF transformation is rather complex (it involves an algebraic study of the fixpoints of certain equational systems representing grammars). Our aim here is to present a related, but much simpler, transformation, which from a theoretical viewpoint performs somewhat less than the GGNF transformation (it involves some encoding of the initial DCG, which the GGNF does not, and it only handles offline-parsable grammars, while the GGNF is defined for arbitrary DCGs),³ but in practice is extremely easy to implement and displays a comparable behavior when parsing with an OP grammar.</Paragraph>
    <Paragraph position="3"> The transformation consists of two steps: (1) empty-production elimination and (2) left-recursion elimination.</Paragraph>
    <Paragraph position="4"> The empty-production elimination algorithm is inspired by the usual procedure for context-free grammars. But there are some notable differences, due to the fact that removal of empty productions is in general impossible for non-OP DCGs. The empty-production elimination algorithm is guaranteed to terminate only in the OP case.⁴ It produces a DCG declaratively equivalent to the original grammar.</Paragraph>
    <Paragraph position="5"> * Thanks to Pierre Isabelle and François Perrault for their comments, and to CITI (Montreal) for its support during the preparation of this paper.
¹ DCGs on the empty string might be dismissed as extreme, but they are in fact at the core of the offline-parsability concept. See note 3.
² The concept of offline-parsability (under a different name) goes back to [8], where it is shown to be linguistically relevant.
³ The GGNF factorizes an arbitrary DCG into two components: a &quot;unit sub-DCG on the empty string&quot;, and another part consisting of rules whose right-hand side starts with a terminal. The decidability of the DCG depends exclusively on certain simple textual properties of the unit sub-DCG. This sub-DCG can be eliminated from the GGNF if and only if the DCG is offline-parsable.</Paragraph>
    <Paragraph position="6"> The left-recursion elimination algorithm is adapted from a transformation proposed in [4] in the context of a certain formalism (&quot;Lexical Grammars&quot;) which we presented as a possible basis for building reversible grammars.⁵ The key observation (in slightly different terms) was that, in a DCG, if a nonterminal a is defined literally by the two rules (the first of which is left-recursive): a(X) → a(Y), d(Y, X).</Paragraph>
    <Paragraph position="7"> a(X) → t(X).</Paragraph>
    <Paragraph position="8"> then the replacement of these two rules by the three rules (where d_tc is a new nonterminal symbol, which represents a kind of &quot;transitive closure&quot; of d):</Paragraph>
    <Paragraph position="10"> preserves the declarative semantics of the grammar.⁶ We remarked in [4] that this transformation &quot;is closely related to left-corner parsing&quot;, but did not give details. In a recent paper [7], Mark Johnson introduces &quot;a left-corner program transformation for natural language parsing&quot;, which has some similarity to the above transformation, but which is applied to definite clause programs, rather than to DCGs. He proves that this transformation respects declarative equivalence, and also shows, using a model-theoretic approach, the close connection of his transformation with left-corner parsing [12, 9, 10].⁷ It must be noted that the left-recursion elimination procedure can be applied to any DCG, whether OP or not. Even in the case where the grammar is OP, however, it will not lead to a terminating parsing algorithm unless empty productions have previously been eliminated from the grammar, a problem which is shared by the usual left-corner parser-interpreter.</Paragraph>
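    <Paragraph> As an illustration only, the left-recursion step has a familiar context-free counterpart, sketched here in Python. The sketch ignores arguments and unification, so it is not the DCG algorithm of this paper. The function name, the dictionary encoding of grammars, and the "_tc" suffix for the new nonterminal are assumptions made for the example.

```python
from typing import Dict, List

# A grammar maps each nonterminal to its list of right-hand sides.
Grammar = Dict[str, List[List[str]]]


def eliminate_immediate_left_recursion(grammar: Grammar, nt: str) -> Grammar:
    """Rewrite  nt -> nt d | t  as  nt -> t nt_tc,  nt_tc -> [] | d nt_tc."""
    tails = []  # the "d" parts of left-recursive rules  nt -> nt d
    bases = []  # right-hand sides "t" that do not start with nt
    for rhs in grammar[nt]:
        if rhs and rhs[0] == nt:
            tails.append(rhs[1:])
        else:
            bases.append(rhs)
    if not tails:
        return dict(grammar)  # no left recursion on nt: nothing to change
    closure = nt + "_tc"  # hypothetical name for the new nonterminal
    result = dict(grammar)
    result[nt] = [base + [closure] for base in bases]
    result[closure] = [[]] + [tail + [closure] for tail in tails]
    return result


# Context-free skeleton of a left-recursive pair of rules: a -> a d, a -> t.
example = {"a": [["a", "d"], ["t"]]}
transformed = eliminate_immediate_left_recursion(example, "a")
print(transformed)
```

 In this sketch the new symbol a_tc derives zero or more repetitions of d, which is one way to read the phrase "transitive closure" of d. The sketch assumes that the generated name is not already in use and that the grammar contains no rule of the form a → a.</Paragraph>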
    <Paragraph position="11"> ⁴ The fact that the standard CFG empty-production elimination transformation is always possible is related to the fact that this transformation does not preserve degrees of ambiguity. For instance the infinitely ambiguous grammar S → [b] A, A → A, A → [ ] is simplified into the grammar S → [b]. This type of simplification is generally impossible in a DCG. Consider for instance the &quot;grammar&quot; s(X) → [number] a(X), a( ....... (X)) → a(X), a(0) → [ ].</Paragraph>
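    <Paragraph> The context-free simplification mentioned in note 4 can be reproduced with a minimal Python sketch of the textbook procedure, under stated assumptions: rules are encoded as (left-hand side, right-hand side) pairs, the grammar is read as S → [b] A, A → A, A → [ ], trivial cycles such as A → A are dropped along the way, and unproductive rules are removed in a separate cleanup pass. All function names are hypothetical, and this is not the DCG algorithm of the paper.

```python
from itertools import product
from typing import Set, Tuple

Rule = Tuple[str, Tuple[str, ...]]  # (left-hand side, right-hand side)


def nullable_symbols(rules: Set[Rule]) -> Set[str]:
    """Nonterminals that can derive the empty string."""
    nullable: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs not in nullable and all(sym in nullable for sym in rhs):
                nullable.add(lhs)
                changed = True
    return nullable


def eliminate_empty_productions(rules: Set[Rule]) -> Set[Rule]:
    nullable = nullable_symbols(rules)
    result: Set[Rule] = set()
    for lhs, rhs in rules:
        # Each nullable symbol may be kept or dropped; other symbols are kept.
        choices = [((sym,), ()) if sym in nullable else ((sym,),) for sym in rhs]
        for picked in product(*choices):
            new_rhs = tuple(sym for part in picked for sym in part)
            # Skip empty right-hand sides and trivial cycles such as A -> A.
            if new_rhs and new_rhs != (lhs,):
                result.add((lhs, new_rhs))
    return result


def remove_unproductive(rules: Set[Rule], terminals: Set[str]) -> Set[Rule]:
    """Keep only rules whose right-hand side symbols all derive a terminal string."""
    productive = set(terminals)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs not in productive and all(sym in productive for sym in rhs):
                productive.add(lhs)
                changed = True
    return {(lhs, rhs) for lhs, rhs in rules if all(s in productive for s in rhs)}


note4 = {("S", ("b", "A")), ("A", ("A",)), ("A", ())}
simplified = remove_unproductive(eliminate_empty_productions(note4), {"b"})
print(simplified)  # {('S', ('b',))}
```

 The infinitely many derivations of the original grammar collapse into a single one, which is the loss of ambiguity information that the note describes. In a DCG the arguments would differ from one derivation to the next, so the same collapse is not available in general.</Paragraph>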
    <Paragraph position="12">  for the original and the transformed grammar, is given in [2].
⁷ His paper does not state termination conditions for the transformed program. Such termination conditions would probably involve some generalized notion of offline-parsability [6, 5, 13]. By contrast, we prove termination only for DCGs which are OP in the original sense of Pereira and Warren, but this case seems to us to represent much of the core issue, and to lead to some direct extensions. For instance, the DCG transformation proposed here can be directly applied to &quot;guided&quot; programs in the sense of [4].</Paragraph>
    <Paragraph position="13"> Due to the space available, we do not give here correctness proofs for the algorithms presented, but expect to publish them in a fuller version of this paper. These algorithms have actually been implemented in a slightly extended version, where they are also used to decide whether the grammar proposed for transformation is in fact offline-parsable or not.</Paragraph>
  </Section>
</Paper>