<?xml version="1.0" standalone="yes"?> <Paper uid="E91-1035"> <Title>PROOF FIGURES AND STRUCTURAL OPERATORS FOR CATEGORIAL GRAMMAR*</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> PROOF FIGURES AND STRUCTURAL OPERATORS FOR CATEGORIAL GRAMMAR* </SectionTitle> <Paragraph position="0"/> </Section> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Buccleuch Place, Edinburgh EH8 9LW, Scotland </SectionTitle> <Paragraph position="0"> guy@cogsci.ed.ac.uk, arh@cl.cam.ac.uk, neil@cogsci.ed.ac.uk, Glyn.Morrill@let.ruu.nl</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> ABSTRACT </SectionTitle> <Paragraph position="0"> Use of Lambek's (1958) categorial grammar for linguistic work has generally been rather limited. There appear to be two main reasons for this: the notations most commonly used can sometimes obscure the structure of proofs and fail to convey linguistic structure clearly, and the calculus as it stands is apparently not powerful enough to describe many phenomena encountered in natural language.</Paragraph> <Paragraph position="1"> In this paper we suggest ways of dealing with both these deficiencies. Firstly, we reformulate Lambek's system using proof figures based on the 'natural deduction' notation commonly used for derivations in logic, and discuss some of the related proof theory.</Paragraph> <Paragraph position="2"> Natural deduction is generally regarded as the most economical and comprehensible system for working on proofs by hand, and we suggest that the same advantages hold for a similar presentation of categorial derivations. Secondly, we introduce devices called structural modalities, based on the structural rules found in logic, for the characterization of commutation, iteration and optionality.
This permits the description of linguistic phenomena which Lambek's system does not capture with the desired sensitivity and generality.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> LAMBEK CATEGORIAL GRAMMAR PRELIMINARIES </SectionTitle> <Paragraph position="0"> Categorial grammar is an approach to language description in which the combination of expressions is governed not by specific linguistic rules but by general logical inference mechanisms. The point of departure can be seen as Frege's position that there are certain 'complete expressions' which are the primary bearers of meaning, and that the meanings of 'incomplete expressions' (including words) are derivative, being * We would like to thank Robin Cooper, Martin Pickering and Pete Whitelock for comments and discussion relating to this work. The authors were respectively supported by SERC Research Studentship 883069/1; ESRC Research Studentship C00428722003; ESPRIT Project 393 and Cognitive Science/HCI Research Initiative 89/CS01 and 89/CS25; SERC Postdoctoral Fellowship B/ITF/206.</Paragraph> <Paragraph position="1"> 1 Now at University of Cambridge Computer Laboratory, New Museums Site, Pembroke Street, Cambridge CB2 3QG, England.</Paragraph> <Paragraph position="2"> 2 Now at OTS, Trans 10, 3512 JK Utrecht, Netherlands. their contribution to the meanings of the expressions in which they occur. We suppose that linguistic objects have (at least) two components, form (syntactic) and meaning (semantic). We refer to sets of such objects as categories, which are indexed by types, and stipulate that all complete expressions belong to categories indexed by primitive types.
We then recursively classify incomplete expressions according to the means by which they combine (syntactically and semantically) with other expressions.</Paragraph> <Paragraph position="3"> In the 'syntactic calculus' of Lambek (1958) (variously known as Lambek categorial grammar, Lambek calculus, or L), expressions are classified by means of a set of bidirectional types as defined in (1).</Paragraph> <Paragraph position="4"> (1) a. If X is a primitive type then X is a type.</Paragraph> <Paragraph position="5"> b. If X and Y are types then X/Y and Y\X are types.</Paragraph> <Paragraph position="6"> X/Y (resp. Y\X) is the type of incomplete expressions that syntactically combine with a following (resp. preceding) expression of type Y to form an expression of type X, and semantically are functions from meanings of type Y to meanings of type X.</Paragraph> <Paragraph position="7"> Let us assume complete expressions to be sentences (indexed by the primitive type S), noun phrases (NP), common nouns (N), and non-finite verb phrases (VP). By the above definitions, we may assign types to words as follows: (2) John, Mary, Suzy := NP</Paragraph> <Paragraph position="9"> We represent the form of a word by printing it in italics, and its meaning by the same word in boldface.</Paragraph> <Paragraph position="10"> For instance, the form of the word &quot;man&quot; will be represented as man and its meaning as man.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> PROOF FIGURES </SectionTitle> <Paragraph position="0"> We shall present the rules of L by means of proof figures, based on Prawitz' (1965) systems of 'natural deduction'. Natural deduction was developed by Gentzen (1936) to reflect the natural process of mathematical reasoning in which one uses a number of inference rules to justify a single proposition, the conclusion, on the basis of having justifications of a number of propositions, called assumptions.
During a proof one may temporarily make a new assumption if one of the rules licenses the subsequent withdrawal of this assumption. The rule is said to discharge the assumption. The conclusion is said to depend on the undischarged assumptions, which are called the hypotheses of the proof.</Paragraph> <Paragraph position="2"> A proof is usually represented as a tree, with the assumptions as leaves and the conclusion at the root. Finding a proof is then seen as the task of filling this tree in, and the inference rules as operations on the partially completed tree. One can write the inference rules out as such operations, but as these are rather unwieldy it is more usual to present the rules in a more compact form as operations from a set of subproofs (the premises) to a conclusion, as follows:</Paragraph> <Paragraph position="3"/> <Paragraph position="4"> This states that a proof of Z can be obtained from proofs of X1, ..., Xm by discharging appropriate occurrences of assumptions Y1, ..., Yn. The use of square brackets around an assumption indicates its discharge. R is the name of the rule, and the index i is included to disambiguate proofs, since there may be uncertainty as to which rule has discharged which assumption.</Paragraph> <Paragraph position="5"> As propositions are represented by formulas in logic, so linguistic categories are represented by type formulas in L. The left-to-right order of types indicates the order in which the forms of subexpressions are to be concatenated to give a composite expression derived by the proof. Thus we must take note of the order and place of occurrence of the premises of the rules in the proof figures for L.
There is also a problem with the presentation of the rules in the compact notation, as some of the rules will be written as if they had a number of conclusions, as follows:</Paragraph> <Paragraph position="6"/> <Paragraph position="7"> This rule should be seen as a shorthand for:</Paragraph> <Paragraph position="8"/> <Paragraph position="9"> If the rules are viewed in this way it will be seen that they do not violate the single-conclusion nature of the figures.</Paragraph> <Paragraph position="10"> As with standard natural deduction, for each connective there is an elimination rule which states how a type containing that connective may be consumed, and an introduction rule which states how a type containing that connective may be derived. The elimination rule for / states that a proof of type X/Y followed by a proof of type Y yields a proof of type X. Similarly the elimination rule for \ states that a proof of type Y\X preceded by a proof of type Y yields a proof of type X. Using the notation above, we may write these rules as follows: (6) a. X/Y Y => X (/E) b. Y Y\X => X (\E)</Paragraph> <Paragraph position="12"> We shall give a semantics for this calculus in the same style as the traditional functional semantics for intuitionistic logic (Troelstra 1969; Howard 1980). In the two rules above, the meaning of the composite expression (of type X) is given by the functional application of the meaning of the functor expression (i.e. the one of type X/Y or Y\X) to the meaning of the argument expression (i.e. the one of type Y).
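The two elimination rules can be rendered executable. The sketch below uses an assumed tuple encoding of our own (primitive types as strings; ("/", X, Y) for X/Y; ("\\", Y, X) for Y\X) and is purely illustrative:

```python
# /E and \E as partial functions on an assumed tuple encoding of types:
# primitives are strings, ("/", X, Y) encodes X/Y, ("\\", Y, X) encodes Y\X.
def forward_elim(functor, argument):
    """X/Y followed by Y yields X (the /E rule); None if inapplicable."""
    if isinstance(functor, tuple) and functor[0] == "/" and functor[2] == argument:
        return functor[1]
    return None

def backward_elim(argument, functor):
    """Y\\X preceded by Y yields X (the \\E rule); None if inapplicable."""
    if isinstance(functor, tuple) and functor[0] == "\\" and functor[1] == argument:
        return functor[2]
    return None

# "Mary likes John": likes John by /E gives NP\S; Mary with NP\S by \E gives S.
likes = ("/", ("\\", "NP", "S"), "NP")   # (NP\S)/NP
vp = forward_elim(likes, "NP")           # NP\S
s = backward_elim("NP", vp)              # S
```

Returning None on a non-matching functor or argument mirrors the fact that /E and \E are only applicable when the outermost connective and the argument type line up.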
We represent function application by juxtaposition, so that likes John means likes applied to John.</Paragraph> <Paragraph position="13"> Using the rules /E and \E, we may derive &quot;Mary likes John&quot; as a sentence as follows: The meaning of the sentence is read off the proof by interpreting the /E and \E inferences as function application, giving the following: (8) (likes John) Mary The introduction rule for / states that where the rightmost assumption in a proof of the type X is of type Y, that assumption may be discharged to give a proof of the type X/Y. Similarly, the introduction rule for \ states that where the leftmost assumption in a proof of the type X is of type Y, that assumption may be discharged to give a proof of the type Y\X.</Paragraph> <Paragraph position="14"> Using the notation above, we may write these rules as follows: (9) a. [Y]i ... X => X/Y (/I, i) b. [Y]i ... X => Y\X (\I, i) Note however that this notation does not embody the conditions that have been stated, namely that in /I Y is the rightmost undischarged assumption in the proof of X, and in \I Y is the leftmost undischarged assumption in the proof of X. In addition, L carries the condition that in both /I and \I the sole assumption in a proof cannot be withdrawn, so that no types are assigned to the empty string.</Paragraph> <Paragraph position="15"> In the introduction rules, the meaning of the result is given by lambda-abstraction over the meaning of the discharged assumption, which can be represented by a variable of the appropriate type. The relationship between lambda-abstraction and function application is given by the law of β-equality in (10), where α[β/v] means 'α with β substituted for v'.
(See Hindley and Seldin 1986 for a full exposition of the lambda calculus.)</Paragraph> <Paragraph position="17"> Since exactly one assumption must be withdrawn, the resulting lambda-terms have the property that each binder binds exactly one variable occurrence; we refer to this as the 'single-bind' property (van Benthem 1983). The rules in (9) are analogous to the usual natural deduction rule of conditionalization, except that the latter allows withdrawal of any number of assumptions, in any position.</Paragraph> <Paragraph position="18"> The /I and \I rules are commonly used in constructions that are assumed in other theories to involve 'empty categories', such as (11): (11) (John is the man) who Mary likes.</Paragraph> <Paragraph position="19"> We assume that the relative clause modifies the noun &quot;man&quot; and hence should receive the type N\N. The string &quot;Mary likes&quot; can be derived as being of type S/NP, and so assignment of the type (N\N)/(S/NP) to the object relative pronoun &quot;who&quot; allows the analysis in (12) (cf. Ades and Steedman 1982): (12) who Mary likes</Paragraph> <Paragraph position="21"> The meaning of the string can be read off the proof by interpreting /I and \I as lambda-abstraction, giving the term in (13): (13) who (λx[(likes x) Mary]) Note that this mechanism is only powerful enough to allow constructions where the extraction site is clause-peripheral; for non-peripheral extraction (and multiple extraction) we appear to need an extended logic, as described later.</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> DERIVATIONAL EQUIVALENCE AND NORMAL FORMS </SectionTitle> <Paragraph position="0"> In the above system it is possible to give more than one proof for a single reading of a string.
For example, compare the derivation of &quot;Mary likes John&quot; in (7), and the corresponding lambda-term in (8), with the derivation in (14) and the lambda-term in (15): By the definition in (10), the terms in (8) and (15) are β-equal, and thus have the same meaning; the proofs in (7) and (14) are said to exhibit derivational equivalence. The relation of derivational equivalence clearly divides the set of proofs into equivalence classes. We shall define a notion of normal form for proofs (and their corresponding terms) in such a way that each equivalence class of proofs contains a unique normal form (cf. Hepple and Morrill 1989).</Paragraph> <Paragraph position="1"> We first define the notions of contraction and reduction.</Paragraph> <Paragraph position="2"> A contraction schema R ▷ C consists of a particular pattern R within proofs or terms (the redex) and an equal and simpler pattern C (the contractum).</Paragraph> <Paragraph position="3"> A reduction consists of a series of contractions, each replacing an occurrence of a redex by its contractum.</Paragraph> <Paragraph position="4"> A normal form is then a proof or term on which no contractions are possible.</Paragraph> <Paragraph position="5"> We define the following contraction schemas: weak contraction in (16) for proofs, and β-contraction in (17) for the corresponding lambda-terms.</Paragraph> <Paragraph position="6"> (16) a proof of X/Y obtained by /I discharging [Y]i, followed by /E applied to a proof of Y, contracts to the proof of X with the proof of Y substituted for [Y]i (and symmetrically for \)</Paragraph> <Paragraph position="7"> From (10) we see that β-contraction preserves meaning according to the standard functional interpretation of typed lambda-calculus. Therefore the corresponding weak contraction preserves the semantic functional interpretation of the proof; in addition it preserves the syntactic string interpretation since the redex and contractum contain the same leaves in the same order. For example, the proof in (14) weakly contracts to the proof in (7), and correspondingly the term in (15) β-contracts to the term in (8).
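β-contraction on the lambda-terms can be sketched directly. The tuple encoding of terms below is our own, purely for illustration: ("lam", "x", body) for λx.body, ("app", f, a) for application, and bare strings for variables and constants.

```python
# β-normalization over an assumed tuple encoding of lambda-terms:
# ("lam", "x", body) for λx.body, ("app", f, a) for application, bare
# strings for variables and constants. The single-bind property lets us
# skip capture-avoiding machinery.
def substitute(term, var, value):
    """Replace occurrences of var in term by value."""
    if term == var:
        return value
    if isinstance(term, tuple) and term[0] == "app":
        return ("app", substitute(term[1], var, value),
                       substitute(term[2], var, value))
    if isinstance(term, tuple) and term[0] == "lam":
        return ("lam", term[1], substitute(term[2], var, value))
    return term

def beta_normalize(term):
    """Contract β-redexes (λx.M)N to M[N/x] until none remain."""
    if isinstance(term, tuple) and term[0] == "app":
        f = beta_normalize(term[1])
        a = beta_normalize(term[2])
        if isinstance(f, tuple) and f[0] == "lam":
            return beta_normalize(substitute(f[2], f[1], a))
        return ("app", f, a)
    if isinstance(term, tuple) and term[0] == "lam":
        return ("lam", term[1], beta_normalize(term[2]))
    return term

# A term of the shape discussed for (15), (λx.((likes x) Mary)) John,
# β-contracts to the shape of (8), (likes John) Mary.
t15 = ("app", ("lam", "x", ("app", ("app", "likes", "x"), "Mary")), "John")
t8 = beta_normalize(t15)
```

Since every contraction strictly decreases term size on single-bind terms, the recursion here is guaranteed to terminate, mirroring the strong normalization result discussed in the text.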
The results of these contractions cannot be further contracted and so are the respective results of reduction to weak normal form and β-normal form.</Paragraph> <Paragraph position="9"> Weak contraction in L strictly decreases the size of proofs (e.g. the number of symbols in a contractum is always less than that in a redex), and β-contraction in the single-bind lambda-calculus strictly decreases the size of terms. Thus there is strong normalization with respect to these reductions: every proof (term) reduces to a weak normal form (β-normal form) in a finite number of steps. This has as a corollary (normalization) that every proof (term) has a normal form, so that normal forms are fully representative: every proof (term) is equal to one in normal form. Since reductions preserve interpretations, an interpretation of a normal form will always be the same as that of the original proof (term). Thus restricting the search to just such proofs addresses the problem of derivational equivalence, while preserving generality in that all interpretations are found. Proofs in L and single-bind lambda-terms (like the more general cases of intuitionistic proofs and full lambda-terms) exhibit a property called the Church-Rosser property,1 from which it follows that normal forms are unique.2 For formulations of L that are oriented to parsing, defining normal forms for proofs provides a basis for handling the so-called 'spurious ambiguity' problem, by providing for parsing methods which return all and only normal form proofs. See König (1989) and Hepple (1990).</Paragraph> </Section> <Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> STRUCTURAL MODALITIES </SectionTitle> <Paragraph position="0"> From a logical perspective, L can be seen as the weakest of a hierarchy of implicational sequent logics which differ in the amount of freedom allowed in the use of assumptions.
The highest of these is (the implicational fragment of) the logistic calculus LJ introduced in Gentzen (1936). Gentzen formulated this calculus in terms of sequences of propositions, and then provided explicit structural rules to show the permitted ways to manipulate these sequences. The structural rules are permutation, which allows the order of the assumptions to be changed; contraction, which allows an assumption to be used more than once; and weakening, which allows an assumption to be ignored. For a discussion of the logics generated by dropping some or all of these structural rules see e.g. van Benthem (1987).</Paragraph> <Paragraph position="1"> Although freely applying structural rules is clearly not appropriate in categorial grammars for linguistic description, commutable, iterable and optional elements do occur in natural language. This suggests that we should have a way to indicate that structural operations are permissible on specific types, while still forbidding their general application. To achieve this we propose to follow the precedent of the exponential operators of Girard's (1987) linear sequent logic, which lacks the rules of contraction and weakening, by suggesting a similar system of operators called structural modalities. Here we shall describe a system of universal modalities, which allow us to deal with the logic of commutable, iterable and optional extractions.3 For each universal modality we shall present an elimination rule, and one or more 'operational rules', which are essentially controlled versions of structural rules. 1 This is the property that if a proof (term) M reduces to two proofs (terms) N1, N2, then there is a proof (term) to which both N1 and N2 reduce.</Paragraph> <Paragraph position="2"> 2The above remarks also extend to a second form of reduction, strong reduction/η-reduction, which we have not space to describe here.
See Morrill et al. (1990).</Paragraph> <Paragraph position="3"> 3The name is chosen because the elimination and introduction rules appropriate to each operator turn out to be those for the universal modality in the modal logic S4. See Došen (1990). (Introduction rules can also be defined, but we omit these here for brevity and because they are not required for the linguistic applications we discuss.) Note that these operators are strictly formal devices and not geared towards specific linguistic phenomena. Their use for the applications described, which are suggested purely for illustration, may lead to over-generation in some cases.4</Paragraph> </Section> <Section position="8" start_page="0" end_page="0" type="metho"> <SectionTitle> COMMUTATION </SectionTitle> <Paragraph position="0"> The type △X is assigned to an item of type X which may be freely permuted. △ has the following inference rules:</Paragraph> <Paragraph position="2"> From these rules we see that an occurrence of an item of type X in any position may be derived from an item of type △X.</Paragraph> <Paragraph position="3"> We may use this operator in a treatment of relativization that will allow not only peripheral extraction as in (19a), but also non-peripheral extraction as in (19b): (19) a. (Here is the paper) which Suzy read.</Paragraph> <Paragraph position="4"> b. (Here is the paper) which Suzy read quickly.</Paragraph> <Paragraph position="5"> We shall generate these examples by assuming that &quot;which&quot; licenses extraction from any position in the body of the relative clause. We may accomplish this by giving &quot;which&quot; the type (N\N)/(S/△NP) (cf. the extraction operator of Moortgat (1988)). This allows the derivations in (20a-b) (see Figure 1), which correspond to the lambda-terms in (21a-b) respectively: (21) a. which (λz[(read z) Suzy]) b.
which (λz[(quickly (read z)) Suzy])</Paragraph> </Section> <Section position="9" start_page="0" end_page="0" type="metho"> <SectionTitle> ITERATION </SectionTitle> <Paragraph position="0"> 4Morrill et al. (1990) present a system of modalities that differs from the present proposal in several respects. There are two unidirectional commutation modalities rather than the single bidirectional modality given here, and a single operational rule is associated with each of the universal modalities. We also suggest a (more tentative) system of existential modalities for dealing with elements that are themselves commutable, iterable or optional.</Paragraph> <Paragraph position="2"> One or more occurrences of items of type X in any position may be derived from an item of type X+.</Paragraph> <Paragraph position="3"> We may use this modality in a treatment of multiple extraction. Consider the parasitic gap construction in (23): (23) (Here is the paper) which Suzy read without understanding.</Paragraph> <Paragraph position="4"> In order to generate both this example and the ones in (19), we shall now assume that &quot;which&quot; licenses extraction not just from any position in the body of a relative clause, but from any number of positions greater than or equal to one. We may do this by altering the type of &quot;which&quot; to (N\N)/(S/NP+). Since + has all the inference rules of △, the derivations in (20) will still go through with the new type. In addition, the Con inference rule allows the derivation of (23) given in (24) (see Figure 1), and the corresponding term in (25): The type X* is assigned to an item of type X which may be freely permuted, iterated and omitted. * has the following inference rules:</Paragraph> <Paragraph position="6"> Zero or more occurrences of items of type X in any position may be derived from an item of type X*.</Paragraph> <Paragraph position="7"> We may use this modality in a treatment of optional extraction, as illustrated by (27): (27) a.
(The paper was) too long for Suzy to read. b. (The paper was) too long for Suzy to read quickly.</Paragraph> <Paragraph position="8"> c. (The paper was) too long for Suzy to read without understanding.</Paragraph> <Paragraph position="9"> d. (The paper was) too long for Suzy to concentrate.</Paragraph> <Paragraph position="10"> We shall assume for simplicity that &quot;to&quot;-infinitives are single lexical items of type VP, that &quot;for-to&quot; clauses have a special atomic type ForP (so that &quot;for&quot; has the type (ForP/VP)/NP), and that predicate phrases have a special atomic type PredP. Given these assignments, the type PredP/(ForP/NP+) for &quot;too long&quot; would allow (27a-c), but not (27d). In order to generate all four examples, we shall assume that &quot;too long&quot; licenses extraction from any number of positions in the embedded clause greater than or equal to zero, and thus give it the type PredP/(ForP/NP*). Again, * has all the inference rules of +, generating (27a-c), and the Wkn rule allows (27d) to be derived as in (28) (see Figure 1), giving the term in (29): (29) too-long (λz[for (to-concentrate Suzy)])</Paragraph> </Section> </Paper>