<?xml version="1.0" standalone="yes"?>
<Paper uid="C90-2027">
  <Title>A Linguistic Theory of Robustness *</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 The WACSG approach
</SectionTitle>
    <Paragraph position="0"> * The work reported has been supported by an LGF grant from the Land Baden-Württemberg. For valuable comments on an earlier draft of this paper I am indebted to Christian Rohrer and Tobias Goeser.</Paragraph>
    <Paragraph position="1"> WACSG (Weak ACSG) is an experimental formalism for defining robust grammars. ACSG (Annotated Constituent Structure Grammar) is a class of two-level grammar formalisms such as LFG [11], DCG [12] and PATR-II [13]. Nevertheless, WACSG weakness concepts may also be implemented in monostratal formalisms such as HPSG [14].</Paragraph>
    <Paragraph position="2"> WACSG is dedicated to syntactical robustness, and not to morphosyntactic (spelling correction), semantic or pragmatic robustness. This does not preclude semantics and/or pragmatics from resolving robustness conflicts.</Paragraph>
    <Paragraph position="3"> For a WACSG grammar fragment to be robust, its formalism's weakness is necessary but not sufficient, and its adequacy w.r.t. defective language is necessary but not sufficient. Robustness theory is to show that defective language is exactly the language described by "weak" description methods. Any less metaphorical construction of the notion of weakness needs a considerable formal apparatus.</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 The WACSG Formalism
</SectionTitle>
    <Paragraph position="0"> A WACSG grammar rule is a context-free production annotated with an attribute-value (av) formula. The following two subsections deal with weakness relations for context-free grammars and av-languages. Section 3.3, then, specifies the WACSG formalism semantics.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Partial String Languages
</SectionTitle>
      <Paragraph position="0"> In (1) below, three partial string languages of a context-free grammar G = &lt;Cat, Lex, Pr, Sset&gt; are defined, where Cat and Lex are sets of nonterminal and terminal symbols, respectively, Pr is a set of productions and Sset a set of start symbols. Now let P_w be a set of substrings of w and PP_w a set of power-substrings of w, with any w' ∈ PP_w resulting from deletion of arbitrary substrings in w. If |w| > 0, then P_w and PP_w must not contain ε. Z_w and ZZ_w are partition functions in P_w and PP_w, respectively. More simply, SET(G) equals L(G)+. SUB(G) allows an undefined leftside and/or rightside substring, and PAR(G) even undefined infix substrings, for every element from L(G).</Paragraph>
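      <Paragraph position="1"> The three operators can be tried out on a toy language. The following Python sketch is only one possible reading of the prose description above: sentences are tuples of tokens, the language is a finite set, and the + closure is cut off at a hypothetical bound max_parts. The function names (substrings, power_substrings, plus, set_lang, sub_lang, par_lang) are illustrative assumptions and not part of the WACSG formalism.

```python
from itertools import combinations, product


def substrings(w):
    """P_w: contiguous substrings of the token tuple w (no empty string unless w is empty)."""
    if not w:
        return {()}
    return {w[i:j] for i in range(len(w)) for j in range(i + 1, len(w) + 1)}


def power_substrings(w):
    """PP_w: what is left of w after deleting arbitrary substrings (non-empty if w is)."""
    if not w:
        return {()}
    return {tuple(w[i] for i in keep)
            for k in range(1, len(w) + 1)
            for keep in combinations(range(len(w)), k)}


def plus(pieces, max_parts):
    """Bounded stand-in for the + closure: concatenations of 1..max_parts pieces."""
    return {sum(combo, ())
            for n in range(1, max_parts + 1)
            for combo in product(pieces, repeat=n)}


def set_lang(lang, max_parts=2):
    # SET(G) = L(G)+, truncated at max_parts sentences (assumption: finite toy language)
    return plus(lang, max_parts)


def sub_lang(lang, max_parts=2):
    # assumed reading of SUB: left and/or right parts of a sentence may be missing
    return plus(set().union(*map(substrings, lang)), max_parts)


def par_lang(lang, max_parts=2):
    # assumed reading of PAR: infix parts of a sentence may be missing as well
    return plus(set().union(*map(power_substrings, lang)), max_parts)


L = {("the", "man", "sleeps")}
print(("the", "man") in sub_lang(L, 1))      # a prefix of a sentence
print(("the", "sleeps") in sub_lang(L, 1))   # would need an infix deletion
print(("the", "sleeps") in par_lang(L, 1))
```

Under this reading, every bounded language is included in the next one of the chain L, SET, SUB, PAR, and the empty string appears in a partial string language only if it is already in the toy language.</Paragraph>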
      <Paragraph position="2"> Partial string languages have appealing formal properties: φ(G) for φ ∈ {SET, SUB, PAR} is context-free, contains ε iff L(G) contains ε, and there is an order L(G) ⊂ SET(G) ⊂ SUB(G) ⊆ PAR(G). Nesting partial string languages introduces a set Φ(G) of languages such as SET(SUB(G)) or SET(PAR(G)). We have |Φ_φ(G)| = 1, i.e. all the languages with maximal operator φ are weakly equivalent, though not</Paragraph>
      <Paragraph position="3"> pairwise strongly equivalent.</Paragraph>
      <Paragraph position="4"> A recursive partial string grammar (RPSG) is obtained by indexing right-side (nonterminal) symbols of a cfg G with indices SET, SUB, or PAR. The formalism semantics for an RPSG is given by a derivation relation (cf. [15]) for non-indexed and SET-indexed nodes of a tree graph and by a generation function gen as displayed in 2 for any other nodes. Let Ω(G) be the set of derivations for a given G, ω ∈ Ω(G) a derivation and t_ω its tree graph. Let l_ω be a label function with l_ω(0) ∈ Ssetind a (possibly indexed) start symbol¹. The languages L(G) (derived language) and RPSL(G) (generated language) are defined in 3. L(G) and RPSL(G) are context-free and we have L(G) ⊆ RPSL(G), L(G) usually being much smaller than RPSL(G).</Paragraph>
      <Paragraph position="5"> (2)</Paragraph>
      <Paragraph position="7"/>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Attribute-Value Languages
</SectionTitle>
      <Paragraph position="0"> The av-language ∂ is a first order predicate logic including 1-ary function symbols and two 2-ary predicates "≈" and "∈" for equality and set membership, respectively. Soundness and completeness of ∂ without ∈ have been proven in [16]. The predicate "∈" introduces well-founded, distributive, recursive sets of attribute-value structures, and is discussed in [17]. We assume the existence of a reduction algorithm RNF with RNF(A) ∈ ∂ iff A is satisfiable and RNF(A) = ⊥ otherwise (for any formula A ∈ ∂)². Robustness in the area of av-languages is the ability to cope with inconsistent (i.e. overspecified) formulae. Two different methods for maintaining consistency will be considered, namely set weakening and default formulae.
¹ By notational convention, it is Catind ⊆ Cat × {SET, SUB, PAR} and, by definition of RPSG, it is Ssetind ⊆ Catind.
² RNF(A) is in disjunctive normal form, such that DNF(RNF(A)) = RNF(A).</Paragraph>
      <Paragraph position="1"> In robustness theory, the purpose of av-sets is to weaken the function condition on av-structures. Set weakening may be used e.g. for the transition from an inconsistent formula A = x(syn)(case) ≈ nom ∧ x(syn)(case) ≈ akk to a consistent (therefore non-equivalent) formula A^x(syn)(case) = x1 ≈ nom ∧ x2 ≈ akk ∧ x1 ∈ x(syn)(case) ∧ x2 ∈ x(syn)(case). This transition preserves case information, but not inconsistency for the denotation [[x]]. In general, set weakening is defined as follows: (4) Let A ∈ ∂ be a formula in disjunctive normal form and t a non-constant term. Let L_t^A = {A_i.j.k | t occurs k times in a literal A_i.j} be a set of indices. For any r ∈ L_t^A, x_r is a variable not occurring in A. The set weakening of A for a term t is written A^t. For any A ∈ ∂ and non-constant term t it has been shown (see [17]) that, if RNF(A) = A ≠ ⊥, then also DNF(A^t) = RNF(A^t) ≠ ⊥. Therefore, if A is satisfiable, then A^t is also satisfiable. Since satisfiability of A does not follow from satisfiability of A^t (see above), A^t is weaker than or equivalent to A. However, the theoretically motivated A^t-notation has not been integrated into the WACSG formalism, since set weakening can be achieved by using the predicate "∈".</Paragraph>
      <Paragraph position="2"> The classical subsumption ⊑ gives a partial ordering within the set of av-models. There are, however, no inconsistent models. Therefore, a partiality notion with inconsistency must be based upon descriptions, i.e. av-formulae. The relation 3-partial ⊆ ∂² is a subsumption-isomorphism into a (canonical) subset of ∂. The relation ∂-partial defined below is still weaker in allowing inconsistency of one formula B and can be shown to be a superset of 3-partial, i.e. 3-partial ⊆ ∂-partial. Let I ∈ ∂ be a conjunction of literals, and A, B ∈ ∂. Then A ∂-partial B iff: 1. RNF(A ∧ I) C/ RNF(A) 2. RNF(A ∧ I) = RNF(B), if RNF(B) ≠ ⊥</Paragraph>
      <Paragraph position="4"> conjunction of default literals, whose predicate is marked with a subscript d. This gives a default relation, which is a subset of a superset of subsumption between formulae. A relation of default-satisfiability "⊨_d" may be based upon this default relation. It is easy to demonstrate that a default relation like this has some desired disambiguation properties: a disjunctive formula A = A1 ∨ A2 is reduced to RNF(A1 ∧ I) by conjoining it with a default formula I ∈ ∂ such that RNF(A2 ∧ I) = ⊥.</Paragraph>
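      <Paragraph position="5"> The disambiguation effect can be illustrated with a deliberately small model. The following Python sketch assumes that a formula in disjunctive normal form is a list of disjuncts, that each disjunct is a set of (path, atomic value) equations, and that a disjunct is satisfiable iff no path receives two different values. The names consistent, rnf and conjoin are hypothetical helpers; rnf is only a toy stand-in for the reduction algorithm RNF, returning None for the inconsistent result.

```python
def consistent(conj):
    """True iff no path is equated with two different atomic values."""
    seen = {}
    return all(seen.setdefault(path, value) == value for path, value in conj)


def rnf(dnf):
    """Toy stand-in for RNF: the satisfiable disjuncts, or None for the inconsistent result."""
    kept = [frozenset(c) for c in dnf if consistent(c)]
    return kept or None


def conjoin(dnf, literals):
    """Distribute a conjunction of literals over every disjunct of dnf."""
    return [set(c) | set(literals) for c in dnf]


CASE = ("syn", "case")             # stands for the path x(syn)(case)
A1 = {(CASE, "nom")}
A2 = {(CASE, "akk")}
default = {(CASE, "nom")}          # illustrative default literal

print(rnf(conjoin([A2], default)))       # the akk disjunct clashes with the default
print(rnf(conjoin([A1, A2], default)) == rnf(conjoin([A1], default)))
```

In this toy model the disjunct that clashes with the default is discarded, so conjoining the default with the disjunction gives the same result as conjoining it with the compatible disjunct alone.</Paragraph>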
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 WACSG formalism semantics
</SectionTitle>
    <Paragraph position="0"> For any WACSG grammar G, a domain D(G) and its subset SDDE(G) of strictly derivable domain elements are defined as follows. Any domain element not in SDDE(G) bears weakness relations to a derivation ω ∈ Ω(G).</Paragraph>
    <Paragraph position="2"> Default formulae and set membership formulae cannot be simulated by anything else in the WACSG formalism. For every WACSG grammar G, however, there is an equivalent WACSG grammar G' without any partial string indices within it.</Paragraph>
    <Paragraph position="3"> This grammar G' shows an extreme complexity already for a few indices in G. This fact challenges the view (see e.g. [8]) that robustness can be achieved by coverage extension of any non-weakenable ACSG.</Paragraph>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5 A WACSG-treatment of restarts
</SectionTitle>
    <Paragraph position="0"> In this section, the WACSG formalism is applied to restarts, a class of spoken language constructions, which is often referred to in robustness literature [2,3,4]. A grammatical explanation, however, is still lacking. The German restart data in 7 are given with transliteration and segmentation. Constructions in 7,8 are ungrammatical, but not unacceptable.</Paragraph>
    <Paragraph position="1"> From the viewpoint of robustness theory, a restart &lt;αβ Δ δγ, M&gt; ∈ D(G) should not be in SDDE(G) exactly if it is defective, where G is a realistic WACSG fragment of the language in question. Roughly, restarts are a kind of phrasal coordination not allowing for deletion phenomena such as ellipsis, gapping or left deletion. Additionally, the β-substring (i) does not contribute to the (extensional) meaning³ of the construction and (ii) may show recursive defectiveness such as contamination and constituent break (examples 9,10).</Paragraph>
    <Paragraph position="2"> (9) daß er [dieses Meinung Δ dieser Meinung] ist
that he [this-neuter opinion-fem Δ this-fem opinion-fem] has
(10) Peter ist [ins in das Δ dann Vater gewesen]
Peter is [in-the in the Δ then father been]</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.1 NP-restarts
</SectionTitle>
      <Paragraph position="0"> The following WACSG rules 11-14 deal with openly coordinated NP restarts and are easily generalized to prepositional, adverbial or adjectival phrase restarts. Under the coordination hypothesis, a parallelism between defective and non-defective restarts is assumed. Right-recursive coordination of defective and non-defective conjuncts is unrestricted. In 11, equations simulating semantic and syntactic projections (see [18]) "control up" the syntactic but not the semantic description of a β conjunct in a restart construction.</Paragraph>
      <Paragraph position="1"> In rules 13,14, partial string indices SUB and PAR allow a defective conjunct to cover a prefix substring (if no phonological restart marker of category AC is present) or every substring (if there is a restart marker).</Paragraph>
      <Paragraph position="2"> ³ However, it does contribute to meaning in an intensional sense: β-substrings are not absurd.</Paragraph>
      <Paragraph position="3"> Rule 11 applies set weakening to the syntactic av-structures of both conjuncts, resulting in a well-known coordination treatment [19]. Default equations provide disambiguation to syntactic features [[x(syn)(case)]] and [[x(syn)(gender)]], since defectivity may render the first conjunct ambiguous⁴. Furthermore, rule 15 shows default weakening of the syntactic description of NPs.</Paragraph>
      <Paragraph position="5"> Within a simulated projection theory, controlling down a verbal argument into a vcomp-embedded element of an av-set requires a complex regular term x(syn)+[(vcomp)(syn)+]*, which is expensive to compute. Therefore, rules 16,18 introduce an additional term x(kosem), such that [[x(kosem)]] is the semantic structure of a set [[x]] of openly coordinated av-structures. By default satisfiability of x(sem) ≈_d x(kosem), [[x(sem)]] equals [[x(kosem)]] except if [[x]] is the av-set of a non-restart coordination.</Paragraph>
      <Paragraph position="6"> Since defectivity, e.g. a constituent break, may render the β verbal phrase incomplete, rule 17 provides semantic default values for every possible semantic argument.</Paragraph>
      <Paragraph position="7"> Distributed av-formulae may be necessary for one conjunct but inconsistent with (the description of) the other. This situation may arise due to contamination of the first (β-) conjunct. Independently it can be shown that contaminations almost exclusively affect syntactic (as opposed to semantic) features. Now, if the conditions coherence and completeness (see [11]) are defined on semantic structure, syntactic coherence can be enforced by lexicalized formulae as shown in 19 that depend on a syntactic defectivity feature [[x(syn)(defec)]]. coordination of defective and non-defective conjuncts. The conjunct NP des Peter shows a contaminated case feature, since des has genitive and Peter has nominative, accusative or dative morphological case marking. Nevertheless, remark that [[0.2.1(syn)(case)]] is disambiguated to nominative in the av-structure in C1.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.2 VP-restarts
</SectionTitle>
      <Paragraph position="0"> Although VP-restarts follow the same lines as NP-restarts, open coordination of defective conjuncts imposes additional problems⁵.</Paragraph>
      <Paragraph position="1"> ⁴ For any av-term t, [[t]] is the denotation of t (in the model in question).</Paragraph>
      <Paragraph position="2"> ⁵ A coordination construction is called open iff there is a constituent whose av-structure is distributed over the syntactic av-set assigned to this construction.</Paragraph>
      <Paragraph position="4"> The example C2 (appendix) involves a distributed av-structure, whose description is inconsistent with respect to syntactic case subcategorization of β's finite verb gefällt.</Paragraph>
    </Section>
  </Section>
</Paper>