File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/88/p88-1024_metho.xml
Size: 19,908 bytes
Last Modified: 2025-10-06 14:12:13
<?xml version="1.0" standalone="yes"?> <Paper uid="P88-1024"> <Title>GERMAN: FRENCH: ENGLISH: NUMBER: GENDER: CASE: DEFINITE: ANNEXED SG DU PL3 M F NOM ACC GEN /- + - SG PL M F N NOM ACC GEN DAT SG PL M F SG PL</Title> <Section position="3" start_page="196" end_page="199" type="metho"> <SectionTitle> %SG-WORD-FEATS-ARE-TOP-FEATS $SG-LEX JA-N EN-N FR-N GE-N AR-N </SectionTitle> <Paragraph position="0"> Figure 3. General N. A few more remarks about the notation follow. A value can be either atomic (e.g. N), a disjunction of atomic values enclosed in curly brackets (e.g. {N P}), or a complex feature structure. It can also be unspecified ([ ]). The identity of two or more values is forced by reentrant structures indicated by coindexing (e.g. 1[ ] and &lt;1&gt;). Such coreferring value slots automatically point to a single data structure entered through any one of the slots. Universal mono-level category N Category N: We posit the universal category N for nominals. Nominals here are those that realize arguments such as subjects and objects. Nominals are more commonly labeled NP, a phrase typically built around N or CN (common noun), as in the phrase structure rule NP -> DET N as well as in the categorial grammar characterization of DET as a functor NP|CN (i.e. combines with CN and builds NP) (e.g. Ades &amp; Steedman 1982; Wittenberg 1986a). This BI-LEVEL view of nominals is motivated by facts in western European languages. In English, for instance, while cat or white cat cannot fill a subject position, a cat and this cat can. In contrast, while he can be a subject, it cannot be modified as *this he or *strange he. This motivates the following category assignments with a constraint that only NPs can be arguments: cat is CN, he is NP, a and this are NP/CN, and white and strange are CN/CN. This, however, requires that plurals and mass nouns be CN and NP at the same time since cats, gold, white cats, white gold, these cats, and this gold can all be arguments. 
The count/mass distinction is also often blurred since a singular count noun like cat may be used as a mass noun referring to the meat of the cat, and a mass noun like gold may be used as a singular count noun referring to a UNIT of gold or a KIND of gold (see e.g. Bach 1986). The boundary between NP and CN is at best FUZZY.</Paragraph> <Paragraph position="1"> When we turn to other languages, the basis for the bi-level view vanishes. In Japanese, for instance, neko 'cat' can be an argument on its own, and the pronoun kare 'he' can be modified as in ano kare 'that he' and okasina kare 'strange he'. In short, there is no basic syntactic difference among count nouns, pronouns, and mass nouns (and no singular/plural distinction on a 'count' noun). All of them behave like plural and mass nouns in English. This supports a mono-level view of nominals, which we intend to capture with category N. Figure 3 shows the SG-templates relevant to the most general characterization of N in each language. SG-templates in the following illustrations are marked as follows: atomic templates SG-x (boldface), utility templates %SG-x, and substantive templates $SG-x.</Paragraph> <Paragraph position="2"> At the most general level, the basic nominals in German (GE-N) and Arabic (AR-N) must be unsaturated because genitive-inflected Ns may take arguments. The basic nominals in Japanese (JA-N), English (EN-N), and French (FR-N), on the other hand, are basic categories that are saturated.5 In addition, all but JA-N inherit relevant AGR(eement) templates (see below). Crucially, note that what looks like a reasonable characterization of N in each language actually consists of a particular selection from the common set of primitives.</Paragraph> <Paragraph position="3"> ARGUMENT and NON-ARGUMENT: We posit a pseudo-functional level of description in terms of ARG(ument) and NON-ARG for category N instead of the category-level distinction between NP and CN. 
ARG may function as an argument alone, and NON-ARG cannot.</Paragraph> <Paragraph position="4"> 5 Note that the English possessive marker 's is not treated as an inflection here.</Paragraph> <Paragraph position="5"> NON-ARG becomes ARG only by being combined with a certain modifier or by undergoing a semantic change (e.g. massifying). In this view, the ARG/NON-ARG distinction is grounded on a complex interaction of morphology, semantics, and syntax.</Paragraph> <Paragraph position="6"> In English and German, singular count nouns (e.g. tree, Baum) are NON-ARG while plurals, mass (singular) nouns, proper names, and pronouns are ARG. The NON-ARG nouns become 'complete' ARG nominals either by being modified with determiners or by changing into mass nouns (typically changing an object reference into a property/substance reference, e.g., I used apple in my pie).6 In French, all forms of common nouns (i.e. singular, plural, and mass) are NON-ARG, in need of determiners to become ARG (e.g. J'ai vu *arbres/des arbres 'I saw trees'; *Amour/L'amour est délicat 'Love is delicate').</Paragraph> <Paragraph position="7"> In Japanese, there are few NON-ARG nouns (e.g., kata 'person' (HONORIFIC)), which can become ARG with any modifier such as a relative clause or an adjective (e.g. himana kata 'free person (HON.)').7 In Arabic, the morphological distinction of nouns between ANNEXED vs.</Paragraph> <Paragraph position="8"> UNANNEXED corresponds to NON-ARG and ARG statuses, respectively.8 For instance, the unannexed form qittata:ni CAT-DUAL-NOM-UNANNEX 'two cats' may occur as subject alone whereas the annexed form qittata: CAT-DUAL-NOM cannot. The latter must be modified with a noun-based modifier such as a genitive phrase, and this modifier must be unannexed (e.g. with rajulin MAN-GEN-UNANNEX: qittata: rajulin 'man's two cats'). 
These facts in Japanese and Arabic show that the proposed functional distinction for nominals is motivated independently from the syntactic role of determiners, since neither language has modifiers of the category DET that we find in English, French, and German (discussed further later).</Paragraph> <Paragraph position="9"> We realize that the ARG/NON-ARG distinction itself is not a final solution until fine-grained syntactic-semantic interdependence is fleshed out. For now, we simply posit pseudo-functional types ARG and NON-ARG, which are either changed or passed up within the nominal structure:9 $SG-ARG: [result: [type: arg]] $SG-NON-ARG: [result: [type: non-arg]] Category N|N: Adnominal modifiers (N-MODs) are now universally N|N (i.e. a functor that combines with N and builds N). This includes both determiners and attributive modifiers. Figure 4 shows the SG-templates for the basic N-MOD. Different kinds of N-MOD must then be distinguished by whether they take one or two arguments and whether the resulting nominal with modification is ARG or NON-ARG. Each distinction is briefly illustrated below.</Paragraph> <Paragraph position="10"> Two kinds of genitive: Genitive N-MOD functors may take different numbers of arguments crosslinguistically. An inflected genitive nominal (e.g. GE: Marias, AR: rajulin 'man's') takes one, while a genitive adposition (e.g. EN: of) takes two. The former is captured with SG-INFLECTIONAL-GENITIVE-CASE-MOD, and the latter, with SG-PARTICLE-GENITIVE-CASE-MOD.</Paragraph> <Paragraph position="11"> See Figure 5.</Paragraph> <Paragraph position="12"> Non-universal determiner category: In the present approach, DET(erminer) is a modifier type (including articles, demonstratives, quantifiers, numerals, and possessives) such that at least one of its members is needed for making an ARG nominal out of a NON-ARG. 
The fact that a nominal with a determiner is always ARG translates into SG-DET inheriting from SG-ARG among others.</Paragraph> <Paragraph position="13"> DET is present in English, German, and French, but not in Japanese or Arabic (or Russian or Chinese).</Paragraph> <Paragraph position="14"> Demonstratives, quantifiers, numerals, and possessives in the latter languages do not share the syntactic function of DET. We suspect that the presence of DET is an areal property of western European languages.</Paragraph> <Paragraph position="15"> The sublattice in Figure 6 highlights two aspects of DET. One is the difference between DET and ADJ(ective) in English, German, and French with respect to the ARG status of the resulting nominal. DET always builds ARG, cancelling whatever the type of the incoming nominal, whereas ADJ passes the type of the incoming nominal to the top. The other is the place of demonstratives in relation to DET. Every language has demonstratives encoding two or three degrees of speaker proximity (e.g. JAPANESE: kono (close to the speaker), sono (close to the addressee), 6 In implementation, this latter process may be triggered by a unary rule COUNT->MASS.</Paragraph> <Paragraph position="16"> 7 They are assigned a NON-ARG category MN (for 'modified noun') separate from the ARG category N. Any modifier changes it into ARG.</Paragraph> <Paragraph position="17"> 8 ANNEXED here means 'needing to be annexed to a noun-based modifier', and UNANNEXED means 'completed'.</Paragraph> <Paragraph position="18"> They are also called NON-NUNATED and NUNATED forms, respectively, in Semitic linguistics (Aristar, personal communication).</Paragraph> <Paragraph position="19"> 9 An intriguing direction is shown in Krifka's (1987) categorial grammar treatment. He assigns the singular count noun in English (i.e. our NON-ARG) an unsaturated nominal category looking for its numerical value both in syntax and semantics. The significance of determiners is here as suppliers of numerical values. 
How this approach can be extended to cover the NON-ARG nominals in Arabic and Japanese (which are not in need of numerical values per se) remains to be seen. Although it makes sense to see NON-ARG as a functor looking for more semantic determination, implementing it would require a reduction rule for TWO FUNCTORS LOOKING FOR EACH OTHER. The current system would cause an infinite regression with such a rule.</Paragraph> <Paragraph position="20"> Atomic templates: %SG-HEAD-FEATS-ARE-TOP-FEATS: &lt;- passes the features of the second element to the top [result: [feats: &lt;1&gt; elements: [b: [feats: 1[ ]]]]] %SG-FIRST-ARGUMENT: &lt;- slot for the first argument [result: [elements: [b: &lt;1&gt;]] arguments: [first: [result: 1[ ]]]] %SG-GET-ORDER: &lt;- passes the ORDER content of the first argument to the top [result: [order: [[&lt;1&gt;]]] arguments: [first: [result: [order: 1[ ]]]]] $SG-MOD: &lt;- for a category-constant functor MOD (see below) [result: [cat: 4[ ] elements: [a: [index: &lt;1&gt;] b: &lt;3&gt;] order: [[mod: 1[ ]] [head: 2[ ]]]] arguments: [first: [result: 3[cat: &lt;4&gt; index: &lt;2&gt;]]]] Inheritance of composite templates: rest: #]]] &lt;- saturates the second argument; &lt;- no more than two arguments sought $SG-GENITIVE: &lt;- assigns the genitive case feature [result: [elements: [a: [feats: [case: genitive]]]]] Inheritance of composite templates: $SG-N-MOD (above) $SG-CASE-MOD: &lt;- for the general case-mod [result: [elements: [a: [cat: {P N}]]]] &lt;- P or N and ano (away from either)), but they belong to the class of determiners only if the language has DET.</Paragraph> <Paragraph position="21"> Grammatical agreement (AGR) Two kinds of features are distinguished: linguistic features relevant to GRAMMATICAL AGREEMENT (e.g. French grammatical gender, as in table 'a table' f.), and referent features relevant to PRAGMATIC AGREEMENT (e.g. using she to refer to a female person; using appropriate numeral classifiers for counting objects in Japanese). 
The former is under attribute AGR, and the latter is under FEATS. The N-internal grammatical agreement (AGR) requires that certain features of the HEAD nominal must agree with those of MOD. For instance, English has number agreement (e.g. this book, *those book, *this books).</Paragraph> <Paragraph position="22"> Among the five languages under consideration, all but Japanese have AGR.</Paragraph> <Paragraph position="23"> Although there is cross-linguistic variation in AGR features, it is not random (Moravcsik 1978). Table 1 sums up the N-internal AGR features in the four languages. All AGR features go under attribute AGR so that its presence simply corresponds to the presence of grammatical agreement in a language. EN-N, for instance, inherits the shared template for number agreement, and FR-N those for number and gender agreements. See below.</Paragraph> <Paragraph position="24"> $SG-NBR-AGR: [result: [agr: [nbr: &lt;1&gt;] elements: [a: [feats: [nbr: 1[ ]]]]]] $SG-GDR-AGR: [result: [agr: [gdr: &lt;1&gt;] elements: [a: [feats: [gdr: 1[ ]]]]]] Separating AGR and FEATS enables us to create SG-templates that impose the most general agreement constraint regardless of the precise content of agreement features. Three agreement templates produce the combined effect of the N-internal agreement constraint: SG-AGR, SG-AGR-ARGUMENTS, and the composite of the two, SG-AGR-WITH-ARGUMENTS. See Figure 7.</Paragraph> <Paragraph position="25"> The reentrancies impose the strict identity of AGR features: (i) $SG-AGR, between the topmost structure and the element that the graph is defined for; (ii) $SG-AGR-ARGUMENTS, between the topmost structure and the first argument; and (iii) $SG-AGR-WITH-ARGUMENTS, among all three. (i) goes into ALL NOMINALS, passing the nominal's AGR features to the top level. This is because the AGR features must always be available at the top level of a nominal so that they can be used when the nominal is further modified. 
(ii) goes into ADNOMINAL MODIFIERS, passing the head nominal's AGR features to the top level. (iii) goes into ONLY THOSE</Paragraph> </Section> <Section position="4" start_page="199" end_page="201" type="metho"> <SectionTitle> ADNOMINAL MODIFIERS SUBJECT TO THE AGR CONSTRAINT: </SectionTitle> <Paragraph position="0"> for instance, demonstratives (e.g. these) but not attributive adjectives (e.g. small) in English, and both demonstratives and adjectives in French (see this difference in the above inheritance).</Paragraph> <Paragraph position="1"> This is an example where a better language-specific treatment is obtained from the grammar-sharing perspective. If only English is handled, one may simply force the identity of NBR features amidst all kinds of other features, but in the light of cross-linguistic variation and invariants, it lends itself naturally to separating out two kinds of features that correspond to different semantic interpretation processes.</Paragraph> <Paragraph position="2"> Category constancy and word order typology In connecting word order typology and categorial grammar, we have benefited from the work of Greenberg (1966), Lehmann (1973), Vennemann (1974, 1976, 1981), Keenan (1979), Flynn (1982), and Hawkins (1984).</Paragraph> <Paragraph position="3"> Among these, we have a first-cut implementation of Vennemann's (1981) and Flynn's (1982) view that the functor types based on CATEGORY CONSTANCY have a significant relation to the default word order of a language. A functor is CATEGORY-CONSTANT if it builds the same category as its argument(s). It is CATEGORY-NON-CONSTANT if it builds a different category from its argument(s). These notions are also called ENDOTYPIC and EXOTYPIC, respectively, by Bar-Hillel (1953), and are crucially used in Flynn's high-level word order convention. 
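The category-constancy test just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Functor` class, the helper names, and the convention of writing a functor label as result|argument are hypothetical and are not part of the SG-template system described in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Functor:
    """Hypothetical record for a direction-neutral functor label such as N|N."""
    result: str    # category the functor builds
    argument: str  # category the functor combines with


def parse_functor(label: str) -> Functor:
    """Split a label of the assumed form 'RESULT|ARGUMENT' into its two parts."""
    result, argument = label.split("|")
    return Functor(result.strip(), argument.strip())


def is_category_constant(functor: Functor) -> bool:
    """A functor is category-constant if it builds the category it combines with."""
    return functor.result == functor.argument


if __name__ == "__main__":
    # N|N: adnominal modifier in the mono-level view; NP|CN: DET in the bi-level view.
    for label in ("N|N", "NP|CN"):
        kind = "constant" if is_category_constant(parse_functor(label)) else "non-constant"
        print(label, "is category-" + kind)
```

Under these assumptions, N|N comes out as category-constant and NP|CN as category-non-constant, which mirrors the two functor types the text distinguishes.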
The definitions of the notions MOD (modifier), HEAD (head), FN (function), and ARG (argument) follow.</Paragraph> <Paragraph position="4"> * MOD is a category-constant functor (X|X) that combines with HEAD (X) (see above for SG-MOD). * FN is a category-non-constant functor (Y|X) that combines with ARG (X).</Paragraph> <Paragraph position="7"> as follows: Every language has ADPOSITIONS (prepositions and postpositions), universally a category-non-constant functor PP|N. A postpositional language (i.e. a language that uses only or predominantly postpositions) then belongs to TYPE 1 (ARG &lt; FN), and a prepositional language belongs to TYPE 2 (FN &lt; ARG). In the present case, EN, GE, FR, and AR are prepositional while JA is postpositional. The default MOD order is most faithfully observed in Arabic (HEAD &lt; MOD) and Japanese (MOD &lt; HEAD), with few exceptions. The three European languages, however, observe the default order only with 'heavier' (i.e. phrasal or clausal) modifiers, namely, genitives, PP-modifiers, and relative clauses. Lexical modifiers, including numerals, demonstratives, and adjectives (more or less), go in the opposite ordering. The exceptionally ordered MODs of the five languages revealed an implicational chain among modifiers: Numerals &lt; Demonstratives &lt; Adjectives &lt; Genitives &lt; Relative clauses. Exceptional order was found with those MODs starting from the left end of this hierarchy: JA: marked use of Numerals; AR: unmarked use of Numerals and Demonstratives; FR: Numerals, Demonstratives, and some use of Adjectives; EN &amp; GE: Numerals, Demonstratives, and Adjectives. The generalization is that a non-default order for a modifier type x implies the non-default order for other types located to the LEFT of x in the given chain. What we found supports the general implicational hierarchy that Hawkins (1984) found in his cross-linguistic study. 
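The generalization over the chain can be checked mechanically. The following Python sketch is an illustration only: the names `CHAIN`, `EXCEPTIONAL`, and `obeys_chain` are hypothetical, and the data simplify the findings reported above by treating marked or partial uses as plain membership in a language's set of exceptionally ordered modifier types.

```python
# Implicational chain of modifier types, leftmost first (as given in the text).
CHAIN = ["Numerals", "Demonstratives", "Adjectives", "Genitives", "Relative clauses"]

# Exceptionally ordered modifier types per language (simplified; an assumption).
EXCEPTIONAL = {
    "JA": {"Numerals"},
    "AR": {"Numerals", "Demonstratives"},
    "FR": {"Numerals", "Demonstratives", "Adjectives"},
    "EN": {"Numerals", "Demonstratives", "Adjectives"},
    "GE": {"Numerals", "Demonstratives", "Adjectives"},
}


def obeys_chain(exceptional, chain=CHAIN):
    """Return True if the exceptional types form an unbroken left-end segment of the chain.

    This encodes the generalization that a non-default order for type x implies
    a non-default order for every type to the left of x.
    """
    if not set(exceptional).issubset(chain):
        return False
    flags = [modifier in exceptional for modifier in chain]
    count = sum(flags)
    # The exceptional types must occupy exactly the first `count` positions.
    return all(flags[:count])


if __name__ == "__main__":
    for language, types in EXCEPTIONAL.items():
        print(language, obeys_chain(types))
```

A set such as {"Adjectives"} alone would fail the check, because Numerals and Demonstratives lie to its left in the chain and would have to be exceptional as well.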
We can still maintain, therefore, that there is such a thing as the default order, with the qualification that it may be overridden by non-random subclasses. In our current implementation, we simply assign another category MOD2 to those 'exceptional' modifiers in order to free them from the general order constraint on MOD, which we hope to improve in the future.10 Potential problems and solutions There are two potential problems in an effort to develop a shared grammar as described here. One is the need for serious cooperation among the developers. A small change in shared templates can always affect language-specific templates that someone else is working on. The other problem is the sheer complexity of the inheritance lattice. Both problems can be most effectively reduced by a sophisticated editing tool. Conclusions and future prospects We have shown a specific implementation of grammar sharing using graph unification by inheritance. Although the case discussed covers only simple nominals in five languages, we believe that the fundamental process that we call GRAMMATICAL ATOMIZATION will remain crucial in developing a shared grammar of any structural complexity and linguistic coverage. The specific merits of this process are that (a) it tends to prevent the grammar writer from implementing treatments that work only for a language or a language type, and (b) it provides insights as to how certain conflated properties in a language actually consist of smaller independent properties. In the end, when a prototype shared grammar attains a reasonable scale, we hope to verify the prediction that it will facilitate adding coverage for new languages.</Paragraph> <Paragraph position="8"> The purpose of this work at MCC was to demonstrate the feasibility of a shared syntactic rule base for dissimilar languages. We only assumed that languages are used to convey information contents that can be represented in a common knowledge base. 
As the next step, therefore, we have chosen to connect syntax with 'deeper' levels of information processing (i.e. semantics, discourse, and knowledge base) rather than continuing to increase the syntactic coverage alone. Our current effort is on developing a blackboard-like system for controlling various knowledge sources (i.e. morphology, syntax, semantics, discourse, and a commonsense knowledge base (MCC's CYC, Lenat and Feigenbaum 1987)). In the future, we hope to see a shared grammar integrated in a full-blown interface tool for man-machine communication.</Paragraph> </Section> </Paper>