<?xml version="1.0" standalone="yes"?>
<Paper uid="C94-2145">
  <Title>ON THE PORTABILITY OF COMPLEX CONSTRAINT-BASED GRAMMARS</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 Formalisms
</SectionTitle>
    <Paragraph position="0"> In our experiments to explore the portability of complex constraint-based grammars we have considered a sample of four implemented formalisms:  * UD (Unification Device) \[Johnson and Rosner, 1989, Rupp et al., 1992\] 2.</Paragraph>
    <Paragraph position="1"> * TFS (Typed Feature Structures) \[Emele and Zajac, 1990\].</Paragraph>
    <Paragraph position="2"> * CUF (Comprehensive Unification Formalism) \[Dörre and Eisele, 1991, Dörre and Dorna, 1993\] * ALE (Attribute Logic Engine) \[Carpenter, 1992\]  The original reason for selecting this sample was practical: the availability of these systems in the public domain at the appropriate time3; but on further reflection this sample turns out to be quite representative of the major differences which may occur in formalisms of this type (cf. the very coarse-grained classification in Table 1).4 The consequences of these distinctions are explored in more detail below.</Paragraph>
    <Paragraph position="3"> The nature of experiments in portability requires not only the selection of source and target formalisms, but also of example descriptions to be translated. In this respect we opted for taking grammars &quot;from the wild&quot;, i.e. native code from one of the sample formalisms that was not designed with any prior consideration of its potential portability. To be more precise, we have worked with a small but formally representative HPSG grammar, originally provided as sample data with the TFS system, and a somewhat larger and quite intricate UD grammar of French, which touches on such thorny issues as clitic placement and object agreement. The initial experiments were in translating the TFS grammar into UD, and then subsequently into the other two formalisms. Our attempts to translate the UD French grammar into ALE were not quite as successful, as a substantive alteration to the structure of the syntactic analysis proved necessary. The situation with CUF is more promising, even though the definition of an explicit parsing strategy within the formalism was required. These two issues are discussed further in Section 4.</Paragraph>
    <Paragraph position="4"> 2 For the purposes of this paper we see no significant differences between UD and its derivative ELU, see e.g. \[Estival, 1990\].</Paragraph>
    <Paragraph position="5"> 3 We did toy with the idea of entitling this paper: &quot;Off the CUF,&quot; remarks on how much ALE UD need to make sense of a TFS grammar&quot;, but thought better of it.</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="902" type="metho">
    <SectionTitle>
3 Expressivity
</SectionTitle>
    <Paragraph position="0"> The underlying assumption that is crucial to the nature of this work is that these formalisms have highly comparable expressivity, i.e. they share more than separates them. This is central to the success of the enterprise since preservation of concepts defined by the linguist is an essential part of grammar translation.</Paragraph>
    <Paragraph position="1"> Consequently, we are particularly concerned here with the main constructs of a linguistic description: types, relations and lists. We also consider, though to a lesser extent, purely notational devices like macros, which can be useful in organising the conceptual structure of a description. Of lesser importance in the present context is the treatment of logical structure, in particular disjunction; in any case, this topic has received a good deal of attention elsewhere (cf. \[Trost, 1993\]).</Paragraph>
    <Section position="1" start_page="0" end_page="901" type="sub_section">
      <SectionTitle>
3.1 Types
</SectionTitle>
      <Paragraph position="0"> The role of feature structure types in constraint-based linguistics has gained increasing importance as a result of the increasing popularity, some might say dominance, of HPSG \[Pollard and Sag, 1987, Pollard and Sag, forthcoming\]. In HPSG the type system, or type signature, plays a significant role in defining the class of legal linguistic objects. In fact in the current version of the theory only objects whose typing information is fully resolved are considered to be adequate models of naturally occurring linguistic constructs. Each of the formalisms we consider permits the definition of feature structure types, but the form and expressivity of these type definitions differ quite considerably, as does the significance of type definitions in the description as a whole.</Paragraph>
      <Paragraph position="1"> The extreme cases are TFS, in which the type system is virtually all there is, and UD, where type definitions simply constrain the attributes which can occur on a feature structure.</Paragraph>
      <Paragraph position="2"> At this point we should note that a type system in the &quot;true&quot; or HPSG sense requires a notion of type inheritance which can be further subdivided into three</Paragraph>
      <Paragraph position="3"> Type definitions which form a type system usually encode immediate subtypes and feature appropriateness conditions, which specify, at least, the attributes which are licensed by the type and the types of their values, as in Figure 1.</Paragraph>
      <Paragraph position="4"> head = subst | funct.
 head(X): !subst(X)
 subst = noun | verb | adj | prep.
 subst\[PRD: boolean\].
 noun\[CASE: case\].
 verb\[VFORM: vform, AUX: boolean, INV: boolean\].</Paragraph>
      <Paragraph position="5"> Figure 1: A fragmentary type system rooted in head and written in TFS</Paragraph>
      <Paragraph position="6"> Closure is usually a derived notion, in that only attributes licensed by the type or one of its supertypes may occur, an unlicensed attribute incurring either further subtyping or inconsistency. UD type definitions cannot of themselves be used to define a hierarchical type system. They give an entirely flat system with the most absolute closure and the most minimal appropriateness conditions. The type definitions of the other formalisms, TFS, CUF and ALE, differ mainly in the expressivity of their appropriateness conditions, in order of decreasing expressivity; cf. \[Manandhar, 1993\] for a more detailed comparison of these type systems.</Paragraph>
      <Paragraph position="7"> Evidently, one of the most basic hurdles to translating any of the other formalisms into UD is the reconstruction of the type system. This was the problem posed in our initial experiment of porting an HPSG grammar encoded in TFS into UD. Our solution to this problem, cf. Figure 2, consists of separating out the hierarchies of sub- and supertype dependencies from those of feature appropriateness, so that each node in the type hierarchy is represented by two unary abstraction definitions in the UD encoding. UD types are only utilised on the terminal nodes of the type hierarchy to ensure ultimate closure. In principle the use of any pseudo-type definition will work its way down the dependency hierarchy to the terminal node and then back up the appropriateness hierarchy to gain more information. While this sounds dreadfully inefficient, the lazy evaluation strategy adopted in UD in fact avoids most of the computational overhead.</Paragraph>
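      <Paragraph> As an illustration only, the separation of the subtype hierarchy from feature appropriateness can be sketched in Python over the Figure 1 fragment. The dictionaries and function names below are hypothetical and are not UD or TFS syntax; the sketch evaluates eagerly and does not model the lazy evaluation strategy of UD.</Paragraph>
      <Paragraph><![CDATA[
```python
# Illustrative sketch only: hypothetical names, not UD or TFS syntax.
# The subtype hierarchy and the appropriateness conditions are kept apart,
# mirroring the two separate definitions per type described in the text.
SUBTYPES = {
    "head": ["subst", "funct"],
    "subst": ["noun", "verb", "adj", "prep"],
}
APPROPRIATE = {
    "subst": {"PRD": "boolean"},
    "noun": {"CASE": "case"},
    "verb": {"VFORM": "vform", "AUX": "boolean", "INV": "boolean"},
}


def supertypes(t):
    """Return t followed by all of its ancestors in the subtype hierarchy."""
    chain = [t]
    for parent, children in SUBTYPES.items():
        if t in children:
            chain.extend(supertypes(parent))
    return chain


def terminals(t):
    """Walk down the subtype hierarchy to the terminal types below t."""
    children = SUBTYPES.get(t)
    if not children:
        return [t]
    leaves = []
    for child in children:
        leaves.extend(terminals(child))
    return leaves


def licensed(t):
    """Walk back up: collect the attributes licensed by t or a supertype."""
    attributes = {}
    for s in supertypes(t):
        attributes.update(APPROPRIATE.get(s, {}))
    return attributes


def compatible_terminals(t, attributes):
    """Terminal types under t whose closed attribute set admits `attributes`."""
    return [leaf for leaf in terminals(t)
            if set(attributes) <= set(licensed(leaf))]
```
]]></Paragraph>
      <Paragraph> In this sketch closure is enforced only at the terminal types: asking for a head that carries CASE leaves noun as the only candidate, since no other terminal type licenses that attribute.</Paragraph>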
    </Section>
    <Section position="2" start_page="901" end_page="902" type="sub_section">
      <SectionTitle>
3.2 Relations
</SectionTitle>
      <Paragraph position="0"> The other main constructs used for expressing linguistic concepts are relations, or more specifically definite relations, since most of these formalisms are in fact instantiations of the Höhfeld and Smolka notion of a Constraint Logic Programming language \[Höhfeld and Smolka, 1988\]. While the same essential notion occurs in all these formalisms the terminology is quite diverse, including, for instance, relational abstractions (UD) and parametric sorts (CUF).</Paragraph>
      <Paragraph position="2"> In fact in TFS relational constructs actually take the form of types with features expressing their argument structure, although a relational notation is provided to sweeten the syntax slightly. Since definite relations occur in each of the formalisms, their translation does not pose any immediate problems, and many of their usages are the same, e.g. accounting for relational dependencies and principles in HPSG-style grammars, cf. Figure 3. Difficulties do however occur where the usage of relational constructs is restricted. ALE imposes the restriction that true definite relations may only be used in the phrasal domain, attached to phrase structure rules. On first impression, this could pose a serious problem for translations from other formalisms where relations may be used freely in the lexicon. Our experience has shown that many such lexical relations can in fact be encoded using ALE macros, as in Figure 4, which may be parameterised, but require a deterministic expansion. Where operations involving recursive or disjunctive relations are required there is still the option of encoding the construct as a lexical rule, though with the risk of losing some of the conceptual structure.</Paragraph>
      <Paragraph position="3"> hfp(synsem: loc: cat: head: Head) := synsem: loc: cat: head: Head.</Paragraph>
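      <Paragraph> The hfp definition above is non-recursive and non-disjunctive, which is the kind of construct that admits a single deterministic expansion. A minimal Python sketch of that behaviour follows, assuming feature structures are modelled as nested dictionaries; this modelling and the helper names are assumptions made for illustration and do not reproduce unification in any of the four formalisms.</Paragraph>
      <Paragraph><![CDATA[
```python
# Illustrative sketch only: nested dicts stand in for feature structures.
HEAD_PATH = ("synsem", "loc", "cat", "head")


def build_path(path, value):
    """Build a nested structure that holds `value` at the end of `path`."""
    structure = value
    for attribute in reversed(path):
        structure = {attribute: structure}
    return structure


def get_path(structure, path):
    """Follow `path` through a nested structure and return the value found."""
    for attribute in path:
        structure = structure[attribute]
    return structure


def hfp(daughter):
    """Expand to a structure whose head value is shared with the daughter.

    The expansion is deterministic: one input yields exactly one output,
    with no recursion and no alternatives to choose between.
    """
    return build_path(HEAD_PATH, get_path(daughter, HEAD_PATH))
```
]]></Paragraph>
      <Paragraph> The head value is shared rather than copied in this sketch, which is a loose analogue of the reentrancy expressed by the variable Head.</Paragraph>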
    </Section>
    <Section position="3" start_page="902" end_page="902" type="sub_section">
      <SectionTitle>
3.3 Lists
</SectionTitle>
      <Paragraph position="0"> The last class of constructs that we consider in detail are lists, or sequences. Our objective here is slightly different from that in the last two cases, since all the formalisms support lists and most even supply the same, Prolog-style, notation. There is however a more subtle difference between UD and the more strongly typed formalisms, since in all the other formalisms the list notation is purely syntactic and masks a typed feature structure that is either atomic or has two attributes.</Paragraph>
      <Paragraph position="1"> In UD, where lists are &quot;real&quot; objects, the unifier is more explicitly polymorphic, but also admits the provision of built-in functions over sequence data-types, whose computational behaviour is more predictable than that of defined constructs like relations. UD provides both append and member (or perhaps better &quot;extract&quot;) over lists and, since strings are also a full data type, concatenation over strings. The effects on performance of hard-coding frequently used constructs can be quite dramatic. We do not pursue this question here since the associated design issues are comparable with those associated with the decision to incorporate dedicated modules which are discussed in the next section.</Paragraph>
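      <Paragraph> To make the contrast concrete, the following Python sketch shows how a Prolog-style list can mask a structure that is either atomic or has two attributes, together with an append defined recursively over that encoding. The names elist, FIRST and REST are assumptions chosen for illustration; the source does not name the atom or the attributes used by any particular formalism.</Paragraph>
      <Paragraph><![CDATA[
```python
# Illustrative sketch only: "elist", "FIRST" and "REST" are assumed names.
EMPTY = "elist"  # the atomic case: the empty list


def encode(items):
    """Turn a Python list into the nested two-attribute encoding."""
    structure = EMPTY
    for item in reversed(items):
        structure = {"FIRST": item, "REST": structure}
    return structure


def decode(structure):
    """Recover the Python list from the nested encoding."""
    items = []
    while structure != EMPTY:
        items.append(structure["FIRST"])
        structure = structure["REST"]
    return items


def append(front, back):
    """A defined, recursive append over the encoded form.

    It takes one step per element of `front`, in contrast to a built-in
    operation over a native sequence data type.
    """
    if front == EMPTY:
        return back
    return {"FIRST": front["FIRST"], "REST": append(front["REST"], back)}
```
]]></Paragraph>
      <Paragraph> A built-in append over a native list type would correspond to Python's own list concatenation, whereas the defined version has to walk the encoded structure attribute by attribute.</Paragraph>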
    </Section>
  </Section>
</Paper>