<?xml version="1.0" standalone="yes"?>
<Paper uid="E06-1001">
  <Title>[Figure 5: An I-CCG derivation]</Title>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
1 The CCG formalism
</SectionTitle>
    <Paragraph position="0"> In its most basic conception, a CCG over alphabet Σ of terminal symbols is an ordered triple &lt;A,S,L&gt;, where A is an alphabet of saturated category symbols, S is a distinguished element of A, and L is a lexicon, i.e. a mapping from Σ to categories over A. The set of categories over alphabet A is the closure of A under the binary infix connectives / and \ and the associated 'modalities' of Baldridge (2002). For example, assuming the saturated category symbols 'S' and 'NP', here is a simple CCG lexicon (modalities omitted):
      John ⊢ NP
      Mary ⊢ NP
      loves ⊢ (S\NP)/NP    (1)
    The combinatory projection of a CCG lexicon is its closure under a finite set of resource-sensitive combinatory operations such as forward application (2), backward application (3), forward type raising (4), and forward composition (5):</Paragraph>
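    <Paragraph> The rule schemata numbered (2) to (5) are not reproduced in this copy. As a hedged reconstruction, their standard forms with the modalities of Baldridge (2002) omitted can be written as below. The variable names X, Y and Z follow the usual convention (Y matches the variable mentioned for rule (4) in Section 3), and the typography of the original is an assumption.

```latex
\begin{gather}
X/Y \quad Y \;\Rightarrow\; X \tag{2} \\
Y \quad X\backslash Y \;\Rightarrow\; X \tag{3} \\
X \;\Rightarrow\; Y/(Y\backslash X) \tag{4} \\
X/Y \quad Y/Z \;\Rightarrow\; X/Z \tag{5}
\end{gather}
```
    </Paragraph>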
    <Paragraph position="2"> CCG &lt;A,S,L&gt; over alphabet Σ generates string s ∈ Σ* just in case &lt;s,S&gt; is in the combinatory projection of lexicon L. The derivation in Figure 1 shows that CCG (1) generates the sentence John loves Mary, assuming that 'S' is the distinguished symbol, and where &gt;T, &gt;B and &gt; denote instances of forward type raising, forward composition and forward application respectively:</Paragraph>
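    <Paragraph> The derivation described above can be traced in a few lines of Python. This is a minimal sketch under stated assumptions, not an implementation from the paper: categories are encoded as strings (atoms) or triples (result, slash, argument), modalities are ignored, and all function names are hypothetical.

```python
# Atoms are strings; a complex category is a triple (result, slash, argument).
S, NP = "S", "NP"

def fwd(result, arg):
    # Builds the category result/arg.
    return (result, "/", arg)

def bwd(result, arg):
    # Builds the category result\arg.
    return (result, "\\", arg)

# Lexicon (1) from the text, modalities omitted.
LEXICON = {"John": NP, "Mary": NP, "loves": fwd(bwd(S, NP), NP)}

def forward_application(left, right):
    # X/Y followed by Y gives X.
    if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
        return left[0]
    return None

def backward_application(left, right):
    # Y followed by X\Y gives X.
    if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
        return right[0]
    return None

def forward_raising(x, y):
    # X gives Y/(Y\X).
    return fwd(y, bwd(y, x))

def forward_composition(left, right):
    # X/Y followed by Y/Z gives X/Z.
    if (isinstance(left, tuple) and isinstance(right, tuple)
            and left[1] == "/" and right[1] == "/" and left[2] == right[0]):
        return fwd(left[0], right[2])
    return None

def derive_john_loves_mary():
    john = forward_raising(LEXICON["John"], S)                # S/(S\NP)
    john_loves = forward_composition(john, LEXICON["loves"])  # S/NP
    return forward_application(john_loves, LEXICON["Mary"])   # S
```

    The function derive_john_loves_mary applies type raising, composition and application in the order named in the text and ends in the distinguished symbol 'S'.</Paragraph>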
  </Section>
  <Section position="3" start_page="0" end_page="2" type="metho">
    <SectionTitle>
2 Lexical redundancy in CCG
</SectionTitle>
    <Paragraph position="0"> CCG has many advantages both as a theory of human linguistic competence and as a tool for practical natural language processing applications (Steedman, 2000). However, in many cases development has been hindered by the absence of an agreed uniform approach to eliminating redundancy in CCG lexicons. This poses a particular problem for a radically lexicalised formalism such as CCG, where it is customary to handle bounded dependency constructions such as case, agreement and binding by means of multiple lexical category assignments. Take, for example, the language schematised in Table 1. This fragment of English, though small, exemplifies certain non-trivial aspects of case and number agreement:
    This lexicon exhibits a number of multiple category assignments: (a) the proper noun John and the second person pronoun you are each assigned to two categories, one for each case distinction; (b) the plural suffix -s is assigned to three categories, depending on both the case and 'bar level' of the resulting nominal; and (c) the definite article the is assigned to four categories, one for each combination of case and number agreement distinctions. Since in each of these three cases there is no pretheoretical ambiguity involved, it is clear that this lexicon violates the following efficiency-motivated ideal on human language lexicons, in the Chomskyan sense of locus of non-systematic information:
      ideal of functionality: a lexicon is ideally a function from morphemes to category labels, modulo genuine ambiguity
    Another efficiency-motivated ideal which the CCG lexicon in Table 2 can be argued to violate is the following:
      ideal of atomicity: a lexicon is a mapping from morphemes ideally to atomic category labels
    In the above example, the transitive verb love is mapped to the decidedly non-atomic category label (S\NPplsbj)/NPobj.
    Lexicons which violate the criteria of functionality and atomicity are not just inefficient in terms of storage space and development time. They also fail to capture linguistically significant generalisations about the behaviour of the relevant words or morphemes.</Paragraph>
    <Paragraph position="1"> The functionality and atomicity of a CCG lexicon can be easily quantified. The functionality ratio of the lexicon in Table 2, with 22 lexical entries for 14 distinct morphemes, is 22/14 ≈ 1.6. The atomicity ratio is calculated by dividing the number of saturated category symbol-tokens by the number of lexical entries, i.e. 36/22 ≈ 1.6.</Paragraph>
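    <Paragraph> The two ratios are simple quotients, as the following sketch shows. The counts 22, 14 and 36 are those quoted in the text; the function names and the triple encoding of categories are assumptions made for illustration.

```python
from fractions import Fraction

def functionality_ratio(num_entries, num_morphemes):
    # Lexical entries per distinct morpheme; 1 is the ideal value.
    return Fraction(num_entries, num_morphemes)

def atomicity_ratio(num_symbol_tokens, num_entries):
    # Saturated category symbol-tokens per lexical entry; 1 is the ideal value.
    return Fraction(num_symbol_tokens, num_entries)

def count_symbol_tokens(category):
    # A category is an atom (string) or a triple (result, slash, argument).
    if isinstance(category, str):
        return 1
    result, _slash, argument = category
    return count_symbol_tokens(result) + count_symbol_tokens(argument)

# (S\NPplsbj)/NPobj, the category of the verb love, has three symbol-tokens.
LOVE = (("S", "\\", "NPplsbj"), "/", "NPobj")

table2_functionality = functionality_ratio(22, 14)
table2_atomicity = atomicity_ratio(36, 22)
print(round(float(table2_functionality), 1), round(float(table2_atomicity), 1))
# prints: 1.6 1.6
```
    </Paragraph>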
    <Paragraph position="2"> Various, more or less ad hoc generalisations of the basic CCG category notation have been proposed with a view to eliminating these kinds of lexical redundancy. One area of interest has involved the nature of the saturated category symbols themselves. Bozsahin (2002) presents a version of CCG where saturated category symbols are modified by unary modalities annotated with morphosyntactic features. The features are themselves ordered according to a language-particular join semi-lattice. This technique, along with the insistence that lexicons of agglutinating languages are necessarily morphemic, allows generalisations involving the morphological structure of nouns and verbs in Turkish to be captured in an elegant, non-redundant format. Erkan (2003) generalises this approach, modelling saturated category labels as typed feature structures, constrained by underspecified feature structure descriptions in the usual manner.</Paragraph>
    <Paragraph position="3"> Hoffman (1995) resolves other violations of the ideal of functionality in CCG lexicons for languages with 'local scrambling' constructions by means of 'multiset' notation for unsaturated categories, where scope and direction of arguments can be underspecified. For example, a multiset category label like S{\NPsbj,\NPobj} is to be understood as incorporating both (S\NPsbj)\NPobj and (S\NPobj)\NPsbj.</Paragraph>
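    <Paragraph> One way to read the multiset notation is as shorthand for every ordering of the bracketed arguments. The sketch below spells this out for the example in the text. It is only an approximation of Hoffman's notation (it enumerates argument orders and nothing else), and the function names and string encoding are assumptions.

```python
from itertools import permutations

def attach(category, slash, argument):
    # Brackets are added only around categories that already contain a slash.
    needs_brackets = "/" in category or "\\" in category
    head = "(" + category + ")" if needs_brackets else category
    return head + slash + argument

def expand_multiset(result, arguments):
    # arguments is a list of (slash, category) pairs, e.g. ("\\", "NPsbj").
    expansions = []
    for order in permutations(arguments):
        category = result
        for slash, argument in order:
            category = attach(category, slash, argument)
        expansions.append(category)
    return expansions

# S{\NPsbj,\NPobj} expands to (S\NPsbj)\NPobj and (S\NPobj)\NPsbj.
for label in expand_multiset("S", [("\\", "NPsbj"), ("\\", "NPobj")]):
    print(label)
```
    </Paragraph>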
    <Paragraph position="4"> Computational implementations of the CCG formalism, including successive versions of the Grok/OpenCCG system, have generally dealt with violations of the ideal of atomicity by allowing for the definition of macro-style abbreviations for unsaturated categories, e.g. using the macro 'TV' as an abbreviation for (S\NPsbj)/NPobj.</Paragraph>
    <Paragraph position="5"> One final point of note involves the project reported in Beavers (2004), who implements CCG within the LKB system, i.e. as an application of the Typed Feature Structure Grammar formalism of Copestake (2002), with the full apparatus of unrestricted typed feature structures, default inheritance hierarchies, and lexical rules.</Paragraph>
  </Section>
  <Section position="4" start_page="2" end_page="3" type="metho">
    <SectionTitle>
3 Type-hierarchical CCG
</SectionTitle>
    <Paragraph position="0"> One of the aims of the project reported here has been to take a bottom-up approach to the problem of redundancy in CCG lexicons, adding just enough formal machinery to allow the relevant generalisations to be formulated, whilst retaining a restrictive theory of human linguistic competence which satisfies the 'strong competence' requirement, i.e. the competence grammar and the processing grammar are identical.</Paragraph>
    <Paragraph position="1"> I start with a generalisation of the CCG formalism where the alphabet of saturated category symbols is organised into a 'type hierarchy' in the sense of Carpenter (1992), i.e. a weak order &lt;A,⊑_A&gt;, where A is an alphabet of types, ⊑_A is the 'subsumption' ordering on A (with a least element), and every subset of A with an upper bound has a least upper bound. An example type hierarchy is in Figure 2, where for example types 'Nomsg' and 'NP' are compatible since they have a non-empty set of upper bounds, the least upper bound (or 'unifier') being 'NPsg'.</Paragraph>
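    <Paragraph> Compatibility and unification in such a hierarchy can be computed directly from the definition. Since Figure 2 is not reproduced in this copy, the fragment below is hypothetical: it is chosen only to agree with the relations stated in the text ('Nomsg' and 'NP' unify to 'NPsg', which in turn subsumes 'NPsgsbj' and 'NPsgobj'). The names SUBTYPES, specialisations and unify are assumptions.

```python
# Hypothetical fragment: each type maps to its immediate, more specific types.
SUBTYPES = {
    "top": ["Nomsg", "NP", "S"],
    "Nomsg": ["NPsg"],
    "NP": ["NPsg", "NPpl"],
    "NPsg": ["NPsgsbj", "NPsgobj"],
    "NPpl": [],
    "NPsgsbj": [],
    "NPsgobj": [],
    "S": [],
}

def specialisations(type_name):
    # All types subsumed by type_name, including type_name itself.
    found = {type_name}
    for child in SUBTYPES[type_name]:
        found |= specialisations(child)
    return found

def unify(a, b):
    # Upper bounds of {a, b} are the types subsumed by both a and b.
    upper_bounds = specialisations(a).intersection(specialisations(b))
    for candidate in upper_bounds:
        # The least upper bound subsumes every other upper bound.
        if upper_bounds.issubset(specialisations(candidate)):
            return candidate
    return None  # no upper bound: the two types are incompatible

print(unify("Nomsg", "NP"))   # prints: NPsg
```
    </Paragraph>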
    <Paragraph position="2"> A type-hierarchical CCG (T-CCG) over alphabet Σ is an ordered 4-tuple &lt;A,⊑_A,S,L&gt;, where &lt;A,⊑_A&gt; is a type hierarchy of saturated category symbols, S is a distinguished element of A, and lexicon L is a mapping from Σ to categories over A. Given an appropriate ⊑_A-compatibility relation on the categories over A, the combinatory projection of T-CCG &lt;A,⊑_A,S,L&gt; can again be defined as the closure of L under the CCG combinatory operations, assuming that variable Y in the type raising rule (4) is restricted to maximally specified categories.</Paragraph>
    <Paragraph position="3"> The T-CCG lexicon in Table 3, in tandem with the type hierarchy in Figure 2, generates the fragment of English in Table 1:
    The T-CCG lexicon in Table 3 comes closer to satisfying the ideal of functionality than does the lexicon in Table 2. While the latter has a functionality ratio of 1.6, the former's is 16/14 ≈ 1.1.</Paragraph>
    <Paragraph position="4"> This improved functionality ratio results from the underspecification of saturated category symbols inherent in the subsumption relation. For example, whereas the proper noun John is assigned to two distinct categories in the lexicon in Table 2, in the T-CCG lexicon it is assigned to a single non-maximal type 'NPsg' which subsumes the two maximal types 'NPsgsbj' and 'NPsgobj'. In other words, the phenomenon of case syncretism in English proper nouns is captured by having a general singular noun phrase type, which subsumes a plurality of case distinctions.</Paragraph>
    <Paragraph position="5"> The T-CCG formalism is equivalent to the 'morphosyntactic CCG' formalism of Bozsahin (2002), where features are ordered in a join semi-lattice. Any generalisation which can be expressed in a morphosyntactic CCG can also be expressed in a T-CCG, since any lattice of morphosyntactic features can be converted into a type hierarchy. In addition, T-CCG is equivalent to the formalism described in Erkan (2003), where saturated categories are modelled as typed feature structures. Any lexicon from either of these formalisms can be translated into a T-CCG lexicon whose functionality ratio is either equivalent or lower.</Paragraph>
  </Section>
  <Section position="5" start_page="3" end_page="5" type="metho">
    <SectionTitle>
4 Inheritance-driven CCG
</SectionTitle>
    <Paragraph position="0"> A second generalisation of the CCG formalism involves adding a second alphabet of non-terminals, in this case a set of 'lexical types'. The lexical types are organised into an 'inheritance hierarchy', constrained by expressions of a simple feature-based category description language, inspired by previous attempts to integrate categorial grammars and unification-based grammars, e.g. Uszkoreit (1986) and Zeevat et al. (1987).</Paragraph>
    <Section position="1" start_page="3" end_page="3" type="sub_section">
      <SectionTitle>
4.1 Simple category descriptions
</SectionTitle>
      <Paragraph position="0"> The set of simple category descriptions over alphabet A of saturated category symbols is defined as the smallest set Φ such that:
        1. A ⊆ Φ
        2. for all d ∈ {f, b}, (SLASH d) ∈ Φ
        3. for all φ ∈ Φ, (ARG φ) ∈ Φ
        4. for all φ ∈ Φ, (RES φ) ∈ Φ
      Note that category descriptions may be infinitely embedded, in which case they are considered to be right-associative, e.g. RES ARG RES SLASH f. A simple category description like (SLASH f) or (SLASH b) denotes the set of all expressions which seek their argument to the right/left. A description of the form (ARG φ) denotes the set of expressions which take an argument of category φ, and one like (RES φ) denotes the set of expressions which combine with an argument to yield an expression of category φ.</Paragraph>
      <Paragraph position="1"> Complex category descriptions are simply sets of simple category descriptions, where the assumed semantics is simply that of conjunction.</Paragraph>
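      <Paragraph> The intended reading of simple and complex descriptions can be made concrete with a small satisfaction check. This is a sketch under stated assumptions: categories are atoms or triples (result, slash, argument), atomic descriptions are matched by plain equality rather than subsumption, and the five constraints listed for the verb love are illustrative rather than copied from Figure 4.

```python
# A category is an atom (a string) or a triple (result, slash, argument),
# with slash "f" (argument sought to the right) or "b" (to the left).
LOVE = (("S", "b", "NPplsbj"), "f", "NPobj")   # (S\NPplsbj)/NPobj

def satisfies(category, description):
    # Atomic description: plain equality here, ignoring subsumption.
    if isinstance(description, str):
        return category == description
    label, body = description
    if isinstance(category, str):
        return False          # an atom has no slash, argument or result
    result, slash, argument = category
    if label == "SLASH":
        return slash == body
    if label == "ARG":
        return satisfies(argument, body)
    if label == "RES":
        return satisfies(result, body)
    raise ValueError("unknown description: " + repr(description))

def satisfies_all(category, descriptions):
    # A complex description is a set of simple ones, read conjunctively.
    return all(satisfies(category, d) for d in descriptions)

# Illustrative constraints; nesting encodes right-associativity,
# so RES ARG NPplsbj is written ("RES", ("ARG", "NPplsbj")).
TRANSITIVE_PLURAL = {
    ("SLASH", "f"),
    ("ARG", "NPobj"),
    ("RES", ("SLASH", "b")),
    ("RES", ("ARG", "NPplsbj")),
    ("RES", ("RES", "S")),
}

print(satisfies_all(LOVE, TRANSITIVE_PLURAL))   # prints: True
```
      </Paragraph>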
    </Section>
    <Section position="2" start_page="3" end_page="3" type="sub_section">
      <SectionTitle>
4.2 Lexical inheritance hierarchies
</SectionTitle>
      <Paragraph position="0"> Lexical inheritance hierarchies (Flickinger, 1987) are type hierarchies where each type is associated with a set of expressions drawn from some category description language Φ. Formally, they are ordered triples &lt;B,⊑_B,b&gt;, where &lt;B,⊑_B&gt; is a type hierarchy, and b is a function from B to P(Φ).</Paragraph>
      <Paragraph position="1"> An example lexical inheritance hierarchy over the set of category descriptions over the alphabet of saturated category symbols in Table 2 is presented in Figure 4. The intuition underlying these (monotonic) inheritance hierarchies is that instances of a type must satisfy all the constraints associated with that type, as well as all the constraints it inherits from its supertypes.</Paragraph>
      <Paragraph position="2">  This example hierarchy is a single inheritance hierarchy, since every lexical type has no more than one immediate supertype. However, multiple inheritance hierarchies are also allowed, where a given type can inherit constraints from two supertypes, neither of which subsumes the other.</Paragraph>
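      <Paragraph> Monotonic inheritance amounts to taking the union of a type's own constraints with those of all its supertypes. The sketch below illustrates this with a hypothetical three-type chain; Figure 4 is not reproduced in this copy, so the type names 'verb' and 'transitive', the constraints attached to each type, and the function name constraint_set are all assumptions (only 'verbpl' is named in the text).

```python
# Hypothetical hierarchy: each lexical type maps to its immediate supertypes.
SUPERTYPES = {
    "verb": [],
    "transitive": ["verb"],
    "verbpl": ["transitive"],
}

# Hypothetical constraints directly associated with each type (the function b).
CONSTRAINTS = {
    "verb": {("RES", ("RES", "S")), ("RES", ("SLASH", "b"))},
    "transitive": {("SLASH", "f"), ("ARG", "NPobj")},
    "verbpl": {("RES", ("ARG", "NPplsbj"))},
}

def constraint_set(lexical_type):
    # Own constraints plus everything inherited from supertypes.
    # Looping over a list of parents also covers multiple inheritance.
    collected = set(CONSTRAINTS[lexical_type])
    for parent in SUPERTYPES[lexical_type]:
        collected |= constraint_set(parent)
    return collected

print(len(constraint_set("verbpl")))   # prints: 5
```
      </Paragraph>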
    </Section>
    <Section position="3" start_page="3" end_page="5" type="sub_section">
      <SectionTitle>
4.3 I-CCGs
</SectionTitle>
      <Paragraph position="0"> An inheritance-driven CCG (I-CCG) over alphabet Σ is an ordered 7-tuple &lt;A,⊑_A,B,⊑_B,b,S,L&gt;, where &lt;A,⊑_A&gt; is a type hierarchy of saturated category symbols, &lt;B,⊑_B,b&gt; is an inheritance hierarchy of lexical types over the set of category descriptions over A ∪ B, S is a distinguished symbol in A, and lexicon L is a function from Σ to A ∪ B. Given an appropriate ⊑_{A,B}-compatibility relation on the categories over A ∪ B, the combinatory projection of I-CCG &lt;A,⊑_A,B,⊑_B,b,S,L&gt; can again be defined as the closure of L under the CCG combinatory operations.</Paragraph>
      <Paragraph position="1"> The I-CCG lexicon in Table 4, along with the type hierarchy of saturated category symbols in Figure 2 and the inheritance hierarchy of lexical types in Figure 4, generates the fragment of English in Table 1. Using this lexicon, the sentence girls love John is derived as in Figure 5, where derivational steps involve 'cache-ing out' sets of constraints from lexical types.</Paragraph>
      <Paragraph position="2"> This derivation relies on a version of the CCG combinatory rules defined in terms of the I-CCG category description language. For example, forward application is expressed as follows: for all complex category descriptions Φ and Ψ such that (SLASH b) ∉ Φ, and {φ | (ARG φ) ∈ Φ} ∪ Ψ is compatible, the following is a valid inference:</Paragraph>
      <Paragraph position="4"> The functionality ratio of the I-CCG lexicon in Table 4 is 14/14 = 1 and the atomicity ratio is 14/14 = 1.</Paragraph>
      <Paragraph position="5"> In other words, the lexicon is maximally non-redundant, since all the linguistically significant generalisations are encodable within the lexical inheritance hierarchy. The optimal atomicity ratio of the I-CCG lexicon is a direct result of the introduction of lexical types. In the T-CCG lexicon in Table 3, the transitive verb love was assigned to a non-atomically labelled category (S\NPplsbj)/NPobj. In the I-CCG's inheritance hierarchy in Figure 4, there is a lexical type 'verbpl' which inherits six constraints whose conjunction picks out exactly the same category.</Paragraph>
      <Paragraph position="6"> It is with this atomic label that the verb is paired in the I-CCG lexicon in Table 4.</Paragraph>
      <Paragraph position="7"> The lexical inheritance hierarchy also has a role to play in constructing lexicons with optimal functionality ratios. The T-CCG lexicon in Table 3 assigned the definite article to two distinct categories, one for each grammatical number distinction. The I-CCG utilises the disjunction inherent in inheritance hierarchies to give each of these a common supertype 'det', which is associated with the properties all determiners share.</Paragraph>
      <Paragraph position="8"> Finally, the I-CCG formalism can be argued to subsume the multiset category notation of Hoffman (1995), in the sense that every multiset CCG lexicon can be converted into an I-CCG lexicon with an equivalent or better functionality ratio. Recall that Hoffman uses generalised category notation like S{\NPsbj,\NPobj} to subsume two standard CCG category labels (S\NPsbj)\NPobj and (S\NPobj)\NPsbj. Again it should be clear that this is just another way of representing disjunction in a categorial lexicon, and can be straightforwardly converted into a lexical inheritance hierarchy over I-CCG category descriptions.
      5 Semantics of the category notation
      In the categorial grammar tradition initiated by Lambek (1958), the standard way of providing a semantics for category notation defines the denotation of a category description as a set of strings of terminal symbols. Thus, assuming an alphabet Σ and a denotation function [[...]] from the saturated category symbols to P(Σ), the denotata of unsaturated category descriptions can be defined as follows, assuming that the underlying logic is simply that of string concatenation:</Paragraph>
      <Paragraph position="10"> This suggests an obvious way of interpreting the I-CCG category notation defined above. Let's start by assuming that, given some I-CCG &lt;A,⊑_A,B,⊑_B,b,S,L&gt; over alphabet Σ, there is a denotation function [[...]] from the maximal types in the hierarchy of saturated categories &lt;A,⊑_A&gt; to P(Σ). For all non-maximal saturated category symbols φ in A, the denotation of φ is then the set of all strings in any of φ's subcategories, i.e. [[φ]] = ⋃_{φ ⊑_A ψ} [[ψ]]. The denotata of the simple category descriptions can be defined by universal quantification over the set of simple category de-</Paragraph>
      <Paragraph position="12"> This just leaves the simple descriptions which consist of a type in the lexical inheritance hierarchy &lt;B,⊑_B,b&gt;. If we define the constraint set of some lexical type φ ∈ B as the set Φ of all category descriptions either associated with or inherited by φ, then the denotation of φ is defined as ⋂_{ψ ∈ Φ} [[ψ]].</Paragraph>
      <Paragraph position="13"> Unfortunately, this approach to interpreting I-CCG category descriptions is insufficient, since the logic underlying CCG is not simply the logic of string concatenation, i.e. CCG allows a limited degree of permutation by dint of the crossed composition and substitution operations. In fact, there appears to be no categorial type logic, in the sense of Moortgat (1997), for which the CCG combinatory operations provide a sound and complete derivation system, even in the resource-sensitive system of Baldridge (2002). An alternative approach involves interpreting I-CCG category descriptions against totally well-typed, sort-resolved feature structures, as in the HPSG formalism of Pollard and Sag (1994).</Paragraph>
      <Paragraph position="14"> Given some type hierarchy &lt;A,⊑_A&gt; of saturated category symbols and some lexical inheritance hierarchy &lt;B,⊑_B,b&gt;, we define a class of 'category models', i.e. binary trees where every leaf node carries a maximal saturated category symbol in A, every non-leaf node carries a directional slash, and every branch is labelled as either a 'result' or an 'argument'. In addition, nodes are optionally labelled with maximal lexical types from B. Note that since only maximal types are permitted in a model, they are by definition sort-resolved. Assuming the hierarchies in Figures 2 and 4, an example category model is given in Figure 6, where arcs by convention point downwards:
      Given some inheritance hierarchy &lt;B,⊑_B,b&gt; of lexical types, not all category models whose nodes are labelled with maximal types from B are 'well-typed'. In fact, this property is restricted to those models where, if node n carries lexical type φ, then every category description in the constraint set of φ is satisfied from n. Note that the root of the model in Figure 6 carries the lexical type 'verbpl'. Since all six constraints inherited by this type in Figure 4 are satisfied from the root, and since no other lexical types appear in the model, we can state that the model is well-typed.</Paragraph>
      <Paragraph position="15"> In sum, given an appropriate satisfaction relation between well-typed category models and I-CCG category descriptions, along with a definition of the CCG combinatory operations in terms of category models, it is possible to provide a formal interpretation of the language of I-CCG category descriptions, in the same way as unification-based formalisms like HPSG ground attribute-value notation in terms of underlying totally well-typed, sort-resolved feature structure models. Such a semantics is necessary in order to prove the correctness of eventual I-CCG implementations.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="5" end_page="6" type="metho">
    <SectionTitle>
6 Extending the description language
</SectionTitle>
    <Paragraph position="0"> The I-CCG formalism described here involves a generalisation of the CCG category notation to incorporate the concept of lexical inheritance. The primary motivation for this concerns the ideal of non-redundant encoding of lexical information in human language grammars, so that all kinds of linguistically significant generalisation can be captured somewhere in the grammar. In order to fulfil this goal, the simple category description language defined above will need to be extended somewhat.</Paragraph>
    <Paragraph position="1"> For example, imagine that we want to specify the set of all expressions which take an NPobj argument, but not necessarily as their first argument, i.e. the set of all 'transitive' expressions:
    It should be clear that this category is not finitely specifiable using the I-CCG category notation.</Paragraph>
    <Paragraph position="2"> One way to allow such generalisations to be made involves incorporating the * modal iteration operator used in Propositional Dynamic Logic (Harel, 1984) to denote an unbounded number of arc traversals in a Kripke structure. In other words, category description (RES* φ) is satisfied from node n in a model just in case some finite sequence of result arcs leads from n to a node where φ is satisfied. In this way, the set of expressions taking an NPobj argument is specified by means of the category description RES* ARG NPobj.</Paragraph>
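    <Paragraph> The iteration operator can be sketched as a recursive descent along result arcs. The code below is an illustration under stated assumptions: category models are simplified to triples (result, slash, argument) with string leaves, the empty sequence of result arcs is taken to count (as with the star of Propositional Dynamic Logic), and the function name and example categories are hypothetical.

```python
def satisfies(category, description):
    # Atomic description: plain equality, ignoring subsumption.
    if isinstance(description, str):
        return category == description
    label, body = description
    if label == "RES*":
        # Zero or more result arcs: try this node, then descend one result arc.
        if satisfies(category, body):
            return True
        return isinstance(category, tuple) and satisfies(category[0], description)
    if isinstance(category, str):
        return False          # an atom has no slash, argument or result
    result, slash, argument = category
    if label == "SLASH":
        return slash == body
    if label == "ARG":
        return satisfies(argument, body)
    if label == "RES":
        return satisfies(result, body)
    raise ValueError("unknown description: " + repr(description))

TAKES_OBJECT = ("RES*", ("ARG", "NPobj"))     # RES* ARG NPobj

OBJECT_FIRST = (("S", "b", "NPsbj"), "f", "NPobj")                  # NPobj is the first argument
OBJECT_SECOND = ((("S", "b", "NPsbj"), "f", "NPobj"), "f", "PP")    # NPobj is the second argument
NO_OBJECT = ("S", "b", "NPsbj")                                     # no NPobj argument at all

print(satisfies(OBJECT_FIRST, TAKES_OBJECT),
      satisfies(OBJECT_SECOND, TAKES_OBJECT),
      satisfies(NO_OBJECT, TAKES_OBJECT))
# prints: True True False
```
    </Paragraph>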
  </Section>
  <Section position="7" start_page="6" end_page="6" type="metho">
    <SectionTitle>
7 Computational aspects
</SectionTitle>
    <Paragraph position="0"> At least as far as the I-CCG category notation defined in section 4.1 is concerned, it is a straightforward task to take the standard CKY approach to parsing with CCGs (Steedman, 2000), and generalise it to take a functional, atomic I-CCG lexicon and 'cache out' the inherited constraints online. As long as the inheritance hierarchy is non-recursive and can thus be theoretically cached out into a finite lexicon, the parsing problem remains worst-case polynomial.</Paragraph>
    <Paragraph position="1"> In addition, the I-CCG formalism satisfies the 'strong competence' requirement of Bresnan (1982), according to which the grammar used by or implicit in the human sentence processor is the competence grammar itself. In other words, although the result of cache-ing out particularly common lexical entries will undoubtedly be part of a statistically optimised parser, it is not essential to the tractability of the formalism.</Paragraph>
    <Paragraph position="2"> One obvious practical problem for which the work reported here provides at least the germ of a solution involves the question of how to generalise CCG lexicons which have been automatically induced from treebanks (Hockenmaier, 2003). To take a concrete example, Cakici (2005) induces a wide coverage CCG lexicon from a 6000 sentence dependency treebank of Turkish. Since Turkish is a pro-drop language, every transitive verb belongs to both categories (S\NPsbj)\NPobj and S\NPobj.</Paragraph>
    <Paragraph position="3"> However, data sparsity means that the automatically induced lexicon assigns only a small minority of transitive verbs to both classes. One possible way of resolving this problem would involve translating the automatically induced lexicon into sets of fully specified I-CCG category descriptions, generating an inheritance hierarchy of lexical types from this lexicon (Sporleder, 2004), and applying some more precise version of the following heuristic: if a critical mass of words in the automatically induced lexicon belong to both CCG categories X and Y, then in the derived I-CCG lexicon assign all words belonging to either X or Y to the lexical type which functions as the greatest lower bound of X and Y in the lexical inheritance hierarchy.</Paragraph>
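    <Paragraph> The heuristic can be sketched as a single pass over the induced lexicon. This is one possible reading, not the procedure of any cited work: the 'critical mass' is taken to be a minimum count of words attested with both categories, the shared lexical type is supplied by hand rather than generated from the lexicon, and the word forms, the type name 'tv' and the function name generalise are hypothetical.

```python
def generalise(lexicon, x, y, shared_type, critical_mass):
    # lexicon maps each word to the set of categories it was induced with.
    both = [word for word, cats in lexicon.items() if x in cats and y in cats]
    shortfall = max(0, critical_mass - len(both))
    if shortfall:
        # Too few words are attested with both X and Y: leave the lexicon alone.
        return {word: set(cats) for word, cats in lexicon.items()}
    generalised = {}
    for word, cats in lexicon.items():
        if x in cats or y in cats:
            # Replace X and/or Y by the lexical type that covers both.
            generalised[word] = (cats - {x, y}) | {shared_type}
        else:
            generalised[word] = set(cats)
    return generalised

X = "(S\\NPsbj)\\NPobj"
Y = "S\\NPobj"
INDUCED = {
    "verb_a": {X, Y},
    "verb_b": {X, Y},
    "verb_c": {X},      # sparse data: only seen with an overt subject
    "verb_d": {Y},      # sparse data: only seen with a dropped subject
    "noun_a": {"N"},
}

result = generalise(INDUCED, X, Y, "tv", critical_mass=2)
print(sorted(result["verb_c"]), sorted(result["noun_a"]))
# prints: ['tv'] ['N']
```

    With a critical mass of 2, the two verbs seen in only one frame are assigned to the shared type; with a critical mass of 3 the lexicon is returned unchanged.</Paragraph>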
  </Section>
</Paper>