<?xml version="1.0" standalone="yes"?> <Paper uid="P88-1033"> <Title>A DEFINITE CLAUSE VERSION OF CATEGORIAL GRAMMAR</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> A DEFINITE CLAUSE VERSION OF CATEGORIAL GRAMMAR </SectionTitle> <Paragraph position="0"/> </Section> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> ABSTRACT </SectionTitle> <Paragraph position="0"> We introduce a first-order version of Categorial Grammar, based on the idea of encoding syntactic types as definite clauses. Thus, we drop all explicit requirements of adjacency between combinable constituents, and we capture word-order constraints simply by allowing subformulae of complex types to share variables ranging over string positions. We are in this way able to account for constructions involving discontinuous constituents. Such constructions are difficult to handle in the more traditional version of Categorial Grammar, which is based on propositional types and on the requirement of strict string adjacency between combinable constituents.</Paragraph> <Paragraph position="1"> We then show how, for this formalism, parsing can be efficiently implemented as theorem proving.</Paragraph> <Paragraph position="2"> Our approach to encoding types as definite clauses presupposes a modification of standard Horn logic syntax to allow internal implications in definite clauses. This modification is needed to account for the types of higher-order functions and, as a consequence, standard Prolog-like Horn logic theorem proving is not powerful enough. We tackle this problem by adopting an intuitionistic treatment of implication, which has already been proposed elsewhere as an extension of Prolog for implementing hypothetical reasoning and modular logic programming.</Paragraph> <Paragraph position="3"> * I am indebted to Dale Miller for help and advice. I am also grateful to Aravind Joshi, Mark Steedman, David Weir, Bob Frank, Mitch Marcus and Yves Schabes for comments and discussions. Thanks are due to Elsa Gunter and Amy Felty for advice on typesetting. Parts of this research were supported by: a Sloan Foundation grant to the Cognitive Science Program, Univ. of Pennsylvania; NSF grants MCS-8219196-GER and IRI-10413 A02; ARO grants DAA29-84-K-0061 and DAA29-84-9-0027; and DARPA grant N00014-85-K0018 to CIS, Univ. of Pennsylvania. † Address for correspondence</Paragraph> </Section> <Section position="3" start_page="0" end_page="270" type="metho"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Classical Categorial Grammar (CG) [1] is an approach to natural language syntax where all linguistic information is encoded in the lexicon, via the assignment of syntactic types to lexical items.</Paragraph> <Paragraph position="1"> Such syntactic types can be viewed as expressions of an implicational calculus of propositions, where atomic propositions correspond to atomic types, and implicational propositions account for complex types. A string is grammatical if and only if its syntactic type can be logically derived from the types of its words, assuming certain inference rules.</Paragraph> <Paragraph position="2"> In classical CG, a common way of encoding word-order constraints is by having two symmetric forms of &quot;directional&quot; implication, usually indicated with the forward slash / and the backward slash \, constraining the antecedent of a complex type to be, respectively, right- or left-adjacent. A word, or a string of words, associated with a right- (left-) oriented type can then be thought of as a right- (left-) oriented function looking for an argument of the type specified in the antecedent. A convention more or less generally followed by linguists working in CG is to have the antecedent and the consequent of an implication respectively on the right and on the left of the connective.
Thus, the type-assignment (1) says that the ditransitive verb put is a function taking a right-adjacent argument of type NP, to return a function taking a right-adjacent argument of type PP, to return a function taking a left-adjacent argument of type NP, to finally return an expression of the atomic type S.</Paragraph> <Paragraph position="3"> (1) put: ((S\NP)/PP)/NP The Definite Clause Grammar (DCG) framework [14] (see also [13]), where phrase-structure grammars can be encoded as sets of definite clauses (which are themselves a subset of Horn clauses), and the formalization of some aspects of it in [15], suggest a more expressive alternative for encoding word-order constraints in CG. Such an alternative eliminates all notions of directionality from the logical connectives, and any explicit requirement of adjacency between functions and arguments, and replaces propositions with first-order formulae. Thus, atomic types are viewed as atomic formulae obtained from two-place predicates over string positions represented as integers, the first and the second argument corresponding, respectively, to the left and right end of a given string. Therefore, the set of all sentences of length j generated from a certain lexicon corresponds to the type S(0,j). Constraints over the order of constituents are enforced by sharing integer indices across subformulae inside complex (functional) types.</Paragraph> <Paragraph position="4"> This first-order version of CG can be viewed as a logical reconstruction of some of the ideas behind the recent trend of Categorial Unification Grammars [5, 18, 20] 1. A strongly analogous development characterizes the systems of type-assignment for the formal languages of Combinatory Logic and Lambda Calculus, leading from propositional type systems to the &quot;formulae-as-types&quot; slogan which is behind the current research in type theory [2].
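As an informal illustration of the position-indexed encoding just described, the following Python sketch builds the ground adjacency facts for a string and checks the indexing convention. The helper names (`conn_facts`, `conn`) are ours, invented for exposition; they are not part of the paper's formalism.

```python
# Sketch: string positions 0..n index the gaps between words, so the
# i-th word occupies the span (i-1, i), and CONN(item, i, j) holds
# iff j = i + 1 and item actually sits between positions i and j.

def conn_facts(words):
    """Ground CONN atoms for a string, as (item, left, right) triples."""
    return {(w, i, i + 1) for i, w in enumerate(words)}

def conn(facts, item, i, j):
    """CONN(item, i, j) under the adjacency convention above."""
    return j == i + 1 and (item, i, j) in facts

facts = conn_facts(["I", "shall", "put", "a", "book", "on", "the", "table"])
assert conn(facts, "put", 2, 3)         # "put" spans positions 2..3
assert not conn(facts, "put", 2, 4)     # word spans always have length one
# A sentence over this 8-word string would get the atomic type S(0, 8).
```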
In this paper, we show how syntactic types can be encoded using an extended version of standard Horn logic syntax.</Paragraph> </Section> <Section position="4" start_page="270" end_page="270" type="metho"> <SectionTitle> 2 Definite Clauses with Internal Implications </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="270" end_page="270" type="sub_section"> <Paragraph position="0"> Let ∧ and → be logical connectives for conjunction and implication, and let ∀ and ∃ be the universal and existential quantifiers. Let A be a syntactic variable ranging over the set of atoms, i.e. the set of atomic first-order formulae, and let D and G be syntactic variables ranging, respectively, over the set of definite clauses and the set of goal clauses.</Paragraph> <Paragraph position="1"> 1 Indeed, Uszkoreit [18] mentions the possibility of encoding order constraints among constituents via variables ranging over string positions in the DCG style.</Paragraph> <Paragraph position="2"> We introduce the notions of definite clause and of goal clause via the two following mutually recursive definitions for the corresponding syntactic variables D and G:</Paragraph> <Paragraph position="3"> D := A | G → A | ∀x D | D ∧ D G := A | G ∧ G | ∃x G | D → G</Paragraph> <Paragraph position="4"> We call ground a clause not containing variables.</Paragraph> <Paragraph position="5"> We refer to the part of a non-atomic definite clause on the left of the implication connective as the body of the clause, and to the part on the right as the head. With respect to standard Horn logic syntax, the main novelty in the definitions above is that we permit implications in goals and in the bodies of definite clauses. Extended Horn logic syntax of this kind has been proposed to implement hypothetical reasoning [3] and modules [7] in logic programming.
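To make the shape of this extended clause syntax concrete, here is one possible rendering as a small Python abstract syntax. The class names are ours, and universally quantified and conjunctive program clauses are omitted for brevity; this is an illustrative sketch, not the paper's implementation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

@dataclass(frozen=True)
class Atom:
    """A: an atomic formula, e.g. NP(y, y)."""
    pred: str
    args: Tuple[str, ...]

@dataclass(frozen=True)
class And:
    """G ::= G ^ G (conjunctive goal)."""
    left: "Goal"
    right: "Goal"

@dataclass(frozen=True)
class Clause:
    """D ::= A (body is None) or G -> A (body -> head)."""
    head: Atom
    body: Optional["Goal"] = None

@dataclass(frozen=True)
class Implies:
    """G ::= D -> G: the internal implication that goes beyond pure Horn syntax."""
    hyp: Clause
    concl: "Goal"

Goal = Union[Atom, And, Implies]

# The body of a relative pronoun's clause contains an implicational
# goal of the shape (NP(y, y) -> S(v, y)): a hypothetical noun-phrase
# of null length, as discussed in section 3.2.
hypothetical = Implies(Clause(Atom("NP", ("y", "y"))), Atom("S", ("v", "y")))
```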
We shall first make clear the use of this extension for the purpose of linguistic description, and we shall then illustrate its operational meaning.</Paragraph> </Section> </Section> <Section position="5" start_page="270" end_page="272" type="metho"> <SectionTitle> 3 First-order Categorial Grammar </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="270" end_page="271" type="sub_section"> <SectionTitle> 3.1 Definite Clauses as Types </SectionTitle> <Paragraph position="0"> We take CONN (for &quot;connects&quot;) to be a three-place predicate defined over lexical items and pairs of integers, such that CONN(item, i, j) holds if and only if i = j - 1, with the intuitive meaning that item lies between the two consecutive string positions i and j. Then, the most direct way to translate the type-assignment (1) into first-order logic is via the type-assignment (2), where, in the formula corresponding to the assigned type, the non-directional implication connective → replaces the slashes.</Paragraph> <Paragraph position="1"> (2) put : ∀x∀y∀z∀w[CONN(put, y - 1, y) → (NP(y, z) → (PP(z, w) → (NP(x, y - 1) → S(x, w))))]</Paragraph> <Paragraph position="2"> A definite clause equivalent of the formula in (2) is given by the type-assignment (3) 2.</Paragraph> <Paragraph position="3"> (3) put : ∀x∀y∀z∀w[CONN(put, y - 1, y) ∧ NP(y, z) ∧ PP(z, w) ∧ NP(x, y - 1) → S(x, w)]</Paragraph> <Paragraph position="4"> Observe that the predicate CONN will also need to be part of types assigned to &quot;non-functional&quot; lexical items. For example, we can have for the noun-phrase Mary the type-assignment (4).</Paragraph> <Paragraph position="5"> (4) Mary : ∀y[CONN(Mary, y - 1, y) → NP(y - 1, y)]</Paragraph> </Section> <Section position="2" start_page="271" end_page="271" type="sub_section"> <SectionTitle> 3.2 Higher-order Types and Internal Implications </SectionTitle> <Paragraph position="0"> Propositional CG makes crucial use of functions of higher-order type.
For example, the type-assignment (5) makes the relative pronoun which into a function taking a right-oriented function from noun-phrases to sentences and returning a relative clause 3. This kind of type-assignment has been used by several linguists to provide attractive accounts of certain cases of extraction [16, 17, 10].</Paragraph> <Paragraph position="1"> (5) which: REL/(S/NP) In our definite clause version of CG, a similar assignment, exemplified by (6), is possible, since implications are allowed in the body of clauses. Notice that in (6) the noun-phrase needed to fill the extraction site is &quot;virtual&quot;, having null length.</Paragraph> <Paragraph position="2"> (6) which : ∀v∀y[CONN(which, v - 1, v) ∧ (NP(y, y) → S(v, y)) → REL(v - 1, y)]</Paragraph> <Paragraph position="3"> 2 See [2] for a pleasant formal characterization of first-order definite clauses as type declarations.</Paragraph> <Paragraph position="4"> 3 For simplicity's sake, we treat here relative clauses as constituents of atomic type. But in reality relative clauses are noun modifiers, that is, functions from nouns to nouns. Therefore, the propositional and the first-order atomic types for relative clauses in the examples below should be thought of as shorthands for corresponding complex types.</Paragraph> </Section> <Section position="3" start_page="271" end_page="272" type="sub_section"> <SectionTitle> 3.3 Arithmetic Predicates </SectionTitle> <Paragraph position="0"> The fact that we quantify over integers allows us to use arithmetic predicates to determine subsets of indices over which certain variables must range. This use of arithmetic predicates also characterizes Rounds' ILFP notation [15], which appears in many ways interestingly related to the framework proposed here.
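Because positions are plain integers, the range of an extraction-site index can be pictured very directly. The following sketch (with invented helper names, purely for illustration) enumerates where a zero-length "virtual" constituent could sit inside a given span:

```python
# Sketch: a "virtual" (zero-length) extraction site is a span (z, z);
# ordinary integer comparisons constrain z to fall anywhere inside
# the span of the host sentence.

def candidate_sites(left, right):
    """Positions z with left <= z <= right where a zero-length
    constituent could be hypothesized inside the span (left, right)."""
    return list(range(left, right + 1))

# If the sentence body spans positions (1, 7), the extraction-site
# index may take any of these values:
assert candidate_sites(1, 7) == [1, 2, 3, 4, 5, 6, 7]
```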
We show here below how this capability can be exploited to account for a case of extraction which is particularly problematic for bidirectional propositional CG.</Paragraph> <Paragraph position="1"> Both the propositional type (5) and the first-order type (6) are good enough to describe the kind of constituent needed by a relative pronoun in the following right-oriented case of peripheral extraction, where the extraction site is located at one end of the sentence. (We indicate the extraction site with an upward-looking arrow.) which [ I shall put a book on ↑ ] However, a case of non-peripheral extraction, where the extraction site is in the middle, such as which [ I shall put ↑ on the table ] is difficult to describe in bidirectional propositional CG, where all functions must take left- or right-adjacent arguments. For instance, a solution like the one proposed in [17] involves permuting the arguments of a given function. Such an operation needs to be rather cumbersomely constrained in an explicit way to cases of extraction, lest it should wildly overgenerate. Another solution, proposed in [10], is also cumbersome and counterintuitive, in that it involves the assignment of multiple types to wh-expressions, one for each site where extraction can take place.</Paragraph> <Paragraph position="2"> On the other hand, the greater expressive power of first-order logic allows us to elegantly generalize the type-assignment (6) to the type-assignment (7). In fact, in (7) the variable identifying the extraction site ranges over the set of integers between the indices corresponding, respectively, to the left and right ends of the sentence on which the relative pronoun operates.
Therefore, such a sentence can have an extraction site anywhere between its string boundaries.</Paragraph> <Paragraph position="3"> (7) which : ∀v∀y∀z[CONN(which, v - 1, v) ∧ v ≤ z ∧ z ≤ y ∧ (NP(z, z) → S(v, y)) → REL(v - 1, y)]</Paragraph> <Paragraph position="4"> Non-peripheral extraction is but one example of a class of discontinuous constituents, that is, constituents where the function-argument relation is not determined in terms of left- or right-adjacency, since they have two or more parts disconnected by intervening lexical material, or by internal extraction sites. Extraposition phenomena, gapping constructions in coordinate structures, and the distribution of adverbials offer other problematic examples of English discontinuous constructions for which this first-order framework seems to promise well. A much larger batch of similar phenomena is offered by languages with freer word order than English, for which, as pointed out in [5, 18], classical CG suffers from an even clearer lack of expressive power. Indeed, Joshi [4] proposes within the TAG framework an attractive general solution to word-order variation phenomena in terms of linear precedence relations among constituents. Such a solution suggests a similar approach for further work to be pursued within the framework presented here.</Paragraph> </Section> </Section> <Section position="6" start_page="272" end_page="274" type="metho"> <SectionTitle> 4 Theorem Proving </SectionTitle> <Paragraph position="0"> In propositional CG, the problem of determining the type of a string from the types of its words has been addressed either by defining certain &quot;combinatory&quot; rules which then determine a rewrite relation between sequences of types, or by viewing the type of a string as a logical consequence of the types of its words.
The first alternative has been explored mainly in Combinatory Grammar [16, 17], where, besides the rewrite rule of functional application, which was already in the initial formulation of CG in [1], there are also the rules of functional composition and type raising, which are used to account for extraction and coordination phenomena. This approach offers a psychologically attractive model of parsing, based on the idea of incremental processing, but causes &quot;spurious ambiguity&quot;, that is, an almost exponential proliferation of the possible derivation paths for identical analyses of a given string. In fact, although a rule like functional composition is specifically needed for cases of extraction and coordination, in principle nothing prevents its use to analyze strings not characterized by such phenomena, which would be analyzable in terms of functional application alone. Tentative solutions to this problem have been recently discussed in [12, 19].</Paragraph> <Paragraph position="1"> The second alternative was undertaken in the late fifties by Lambek [6], who defined a decision procedure for bidirectional propositional CG in terms of a Gentzen-style sequent system. Lambek's implicational calculus of syntactic types has recently enjoyed renewed interest in the works of van Benthem, Moortgat and other scholars. This approach can account for a range of syntactic phenomena similar to that of Combinatory Grammar, and in fact many of the rewrite rules of Combinatory Grammar can be derived as theorems in the calculus. However, analyses of cases of extraction and coordination are here obtained via inferences over the internal implications in the types of higher-order functions.
Thus, extraction and coordination can be handled in an expectation-driven fashion, and, as a consequence, there is no problem of spuriously ambiguous derivations.</Paragraph> <Paragraph position="2"> Our approach here is close in spirit to Lambek's enterprise, since we also make use of a Gentzen system capable of handling the internal implications in the types of higher-order functions, but at the same time it differs radically from it, since we do not need to have a &quot;specialized&quot; propositional logic, with directional connectives and adjacency requirements. Indeed, the expressive power of standard first-order logic completely eliminates the need for this kind of specialization, and at the same time provides the ability to account for constructions which, as shown in section 3.3.1, are problematic for an (albeit specialized) propositional framework.</Paragraph> <Section position="1" start_page="272" end_page="273" type="sub_section"> <SectionTitle> 4.1 An Intuitionistic Extension of Prolog </SectionTitle> <Paragraph position="0"> The inference system we are going to introduce below has been proposed in [7] as an extension of Prolog suitable for modular logic programming. A similar extension has been proposed in [3] to implement hypothetical reasoning in logic programming. We are thus dealing with what can be considered the specification of a general-purpose logic programming language. The encoding of a particular linguistic formalism is but one application of such a language, which Miller [7] shows to be sound and complete for intuitionistic logic, and to have a well-defined semantics in terms of Kripke models.</Paragraph> <Paragraph position="1"> We take a logic program or, simply, a program P to be any set of definite clauses. We formally represent the fact that a goal clause G is logically derivable from a program P with a sequent of the form P ⇒ G, where P and G are, respectively, the antecedent and the succedent of the sequent.
If P is a program, then we take its substitution closure [P] to be the smallest set such that: (i) P ⊆ [P]; (ii) if D1 ∧ D2 ∈ [P], then D1 ∈ [P] and D2 ∈ [P]; (iii) if ∀x D ∈ [P], then D[x/t] ∈ [P] for every closed term t; where D[x/t] denotes the result of substituting t for the free occurrences of x in D.</Paragraph> <Paragraph position="4"> We now introduce the following proof rules, which define the notion of proof for our logic programming language:</Paragraph> <Paragraph position="5"> (I) P ⇒ A, where A ∈ [P]; (II) from P ⇒ G1 and P ⇒ G2, infer P ⇒ G1 ∧ G2; (III) from P ⇒ G[x/t], for some term t, infer P ⇒ ∃x G; (IV) from P ⇒ G, where G → A ∈ [P], infer P ⇒ A; (V) from P ∪ {D} ⇒ G, infer P ⇒ D → G.</Paragraph> <Paragraph position="6"> In the inference figures for rules (II)-(V), the sequent(s) appearing above the horizontal line are the upper sequent(s), while the sequent appearing below is the lower sequent. A proof for a sequent P ⇒ G is a tree whose nodes are labeled with sequents such that (i) the root node is labeled with P ⇒ G, (ii) the internal nodes are instances of one of proof rules (II)-(V) and (iii) the leaf nodes are labeled with sequents representing proof rule (I).</Paragraph> <Paragraph position="7"> The height of a proof is the length of the longest path from the root to some leaf. The size of a proof is the number of nodes in it.</Paragraph> <Paragraph position="8"> Thus, proof rules (I)-(V) provide the abstract specification of a first-order theorem prover which can then be implemented in terms of depth-first search, backtracking and unification, like a Prolog interpreter. (An example of such an implementation, as a metainterpreter on top of Lambda-Prolog, is given in [9].) Observe, however, that an important difference between such a theorem prover and a standard Prolog interpreter is the wider distribution of &quot;logical&quot; variables, which, in the logic programming tradition, stand for existentially quantified variables within goals.
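The search behavior behind rules of this kind can be sketched as a tiny interpreter. The sketch below is a deliberate simplification of what a real implementation (such as the Lambda-Prolog metainterpreter of [9]) requires: goals and clauses are ground, so there is no unification, no logical variables, and no quantifier handling; the data encoding is ours.

```python
# Ground sketch of intuitionistic proof search with implication goals.
# Atoms are strings; conjunctions are ("and", g1, g2); implication goals
# are ("imp", clause, g). A clause is a pair (body_or_None, head_atom);
# a program is a frozenset of clauses. No loop checking is attempted.

def prove(program, goal):
    if isinstance(goal, str):                    # goal is an atom
        for body, head in program:
            if head == goal:
                if body is None:                 # axiom: the atom is a program fact
                    return True
                if prove(program, body):         # backchaining through body -> head
                    return True
        return False
    tag = goal[0]
    if tag == "and":                             # prove both conjuncts
        return prove(program, goal[1]) and prove(program, goal[2])
    if tag == "imp":                             # implication goal: augment program
        return prove(program | {goal[1]}, goal[2])
    raise ValueError(f"unknown goal: {goal!r}")

# Hypothetical reasoning: from the clause q -> p alone, p is not provable,
# but the implication goal that assumes q makes it provable.
prog = frozenset({("q", "p")})                   # the single clause q -> p
assert not prove(prog, "p")
assert prove(prog, ("imp", (None, "q"), "p"))    # assume fact q, then show p
```

The key line is the `"imp"` case: the hypothesis is added to the program only for the subproof of the conclusion, which is exactly what makes the treatment intuitionistic rather than classical.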
Such variables can get instantiated in the course of a Prolog proof, thus providing the procedural ability to return specific values as output of the computation.</Paragraph> <Paragraph position="1"> Logical variables play the same role in the programming language we are considering here; moreover, they can also occur in program clauses, since subformulae of goal clauses can be added to programs via proof rule (V).</Paragraph> </Section> <Section position="2" start_page="273" end_page="273" type="sub_section"> <SectionTitle> 4.2 How Strings Define Programs </SectionTitle> <Paragraph position="0"> Let a be a string a1 ... an of words from a lexicon L. Then a defines a program Pa = Γa ∪ Aa such that Γa = {CONN(ai, i - 1, i) | 1 ≤ i ≤ n} and Aa = {D | ai : D is a type-assignment in L, for some i with 1 ≤ i ≤ n}. Thus, Γa just contains ground atoms encoding the positions of words in a. Aa instead contains all the types assigned in the lexicon to words in a. We assume arithmetic operators for addition, subtraction, multiplication and integer division, and we assume that any program Pa works together with an infinite set of axioms 𝒜 defining the comparison predicates <, ≤, >, ≥ over ground arithmetic expressions. (Prolog's evaluation mechanism treats arithmetic expressions in a similar way.) Then, under this approach, a string a is of type Ga if and only if there is a proof for the sequent Pa ∪ 𝒜 ⇒ Ga according to rules (I)-(V).</Paragraph> </Section> <Section position="3" start_page="273" end_page="274" type="sub_section"> <SectionTitle> 4.3 An Example </SectionTitle> <Paragraph position="0"> We give here an example of a proof which determines a corresponding type-assignment.
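The construction of a program from a string can be sketched as follows. The lexicon contents and the schematic string notation for clauses are invented for illustration; only the overall shape (ground position atoms plus lexical clauses) reflects the construction above.

```python
# Sketch: a string a1 ... an defines a program consisting of the ground
# CONN atoms (one per word occurrence) plus the lexical clauses of the
# distinct words occurring in the string. Clause bodies are kept as
# opaque strings here; a real implementation would use structured terms.

def program_for(words, lexicon):
    gamma = [("CONN", w, i, i + 1) for i, w in enumerate(words)]
    a = [clause for w in sorted(set(words)) for clause in lexicon.get(w, [])]
    return gamma, a

# A toy lexicon with invented entries in a schematic clause notation:
lexicon = {
    "John":  ["CONN(John, y-1, y) -> NP(y-1, y)"],
    "loves": ["CONN(loves, y-1, y) & NP(x, y-1) & NP(y, z) -> S(x, z)"],
}

gamma, a = program_for(["John", "loves", "John"], lexicon)
assert ("CONN", "loves", 1, 2) in gamma      # "loves" spans positions 1..2
assert len(gamma) == 3 and len(a) == 2       # one CONN atom per occurrence
```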
Consider the string whom John loves. Such a string determines a program P with the following set Γ of ground atoms: {CONN(whom, 0, 1), CONN(John, 1, 2), CONN(loves, 2, 3)}.</Paragraph> <Paragraph position="1"> We assume lexical type assignments such that the remaining set of clauses A is as follows: {∀x∀z[CONN(whom, x - 1, x) ∧ (NP(z, z) → S(x, z)) → REL(x - 1, z)], ∀y[CONN(John, y - 1, y) → NP(y - 1, y)], ∀x∀y∀z[CONN(loves, y - 1, y) ∧ NP(x, y - 1) ∧ NP(y, z) → S(x, z)]}</Paragraph> <Paragraph position="2"> The clause assigned to the relative pronoun whom corresponds to the type of a higher-order function, and contains an implication in its body. Figure 1 shows a proof tree for such a type-assignment. The tree, which is represented as growing up from its root, has size 11 and height 8.</Paragraph> </Section> </Section> </Paper>