<?xml version="1.0" standalone="yes"?> <Paper uid="C94-2131"> <Title>HPSG LEXICON WITHOUT LEXICAL RULES</Title> <Section position="3" start_page="0" end_page="823" type="metho"> <SectionTitle> 2. THE HPSG STANDARD </SectionTitle> <Paragraph position="0"> As the basic type of object of linguistic description, HPSG adopts the sign 2 and builds a subsumption hierarchy of its subtypes, whose top level is sketched in (1).</Paragraph> <Paragraph position="1"> (1) [Hierarchy diagram: sign has the subtypes phrasal and lexical; lexical has the subtypes major and minor.] The idea behind this hierarchy is that each node is associated with a bundle of features that is inherited by all nodes directly or indirectly subordinated to it, which removes many redundant feature stipulations. There are, however, two points about the hierarchy in (1) that may be worth reconsidering. First, linguistics in general is concerned with a broader class of objects than signs: phonemes, morphs etc. are objects having no semantics, i.e. by definition they cannot fall into the class of signs, and if HPSG aims at becoming a full linguistic theory, these objects also have to be taken into consideration. Second, the hierarchy reflects only the syntactically motivated division of lexical signs into major and minor ones, but the other possible (and in fact more standard) divisions into autosemantic vs. synsemantic 3 words and into productive vs. non-productive word classes are missing.</Paragraph> <Paragraph position="2"> Apart from this, HPSG stipulates the lexicon to be a full-form one, i.e. a lexicon where all word-forms of a language occur as actual items in the lexicon (leaf nodes in the hierarchy). 
This is also an approach that hardly finds a parallel in standard lexicographic practice (even for languages with morphology as poor as that of English) and that, at least on the explanatory level, makes lexical (redundancy) rules necessary, although such rules do not fit into the general image of HPSG at all.</Paragraph> <Paragraph position="3"> Provided that the lexicon is viewed from a &quot;declarative&quot; perspective 4, the lexical rules express static relationships (such as inflection) between members of word classes. These relationships are, however, not needed for the functioning of the system (since all word forms are available anyway); they have to exist only because without them the lexicon would clearly be linguistically inadequate.</Paragraph> <Paragraph position="4"> The other option, namely the &quot;procedural&quot; perspective, seems to be even worse, since it breaks the overall &quot;declarative&quot; strategy of HPSG; in particular, it enforces an inequality of status among lexical items (some being &quot;basic&quot; and some being only &quot;derived&quot;, in the very procedural sense of the word).</Paragraph> </Section> <Section position="4" start_page="823" end_page="825" type="metho"> <SectionTitle> 3. 
RELATIONAL CONSTRAINTS AS ALTERNATIVE </SectionTitle> <Paragraph position="0"> Based on the preceding facts and (first of all) on standard linguistic practice, it is possible to propose an alternative hierarchy of the objects a linguistic theory should deal with. Its &quot;top part&quot; can be sketched as the crossing hierarchy given in (2), where an arc connecting the respective branches marks sub-types of a type that belong to a classification according to the same key. In particular, two divisions according to different keys are to be observed for the class &quot;linguistic object&quot;, one corresponding to a &quot;functional&quot; perspective (the division into signs and non-signs), the other corresponding to a &quot;formalistic&quot; perspective 5. (2) [Crossing hierarchy diagram; surviving node labels: syn-s-wd, auto-s-m, syn-s-m.] Among the members of this hierarchy, certain parallels are to be observed. In particular, it is worth observing the parallel between the construction of phrases (which consist of words) and the construction of words (consisting of morphemes and, in the case of composed words, of bases, i.e.</Paragraph> <Paragraph position="1"> combinations of autosemantic morphemes). In an HPSG-like notation, the &quot;consisting of&quot; relation can be approximated by the feature structure (3), with the obvious meaning of the relational constraints phon_consists_of and sem_consists_of.</Paragraph> <Paragraph position="2"> (3) [Feature structure with the attributes phon: 1, sem: 2 and constituents, each constituent carrying its own phon and sem values.] The parallel, however, does not go much farther than that. In particular, the differences are that: a. phrases consist of other phrases, but words (at least as a rule) do not consist of other words; b. 
for a word, at most one among its constituents, the base, is autosemantic (at least as a rule, again). Taking this as well as still other factors into consideration, one can build up the top of the hierarchy of linguistic objects in more detail as in (4) (the &quot;lexicon&quot; part of the hierarchy being marked off by the interrupted line).</Paragraph> <Paragraph position="3"> (4) [Hierarchy diagram; surviving node labels: sign; word; inflected, non-inflected; autosemantic, synsemantic; nominal, adjectival, verbal, adverbial; noun, pers pron, adjective, pron, main verb, aux verb.] The class which is of particular interest in connection with the effort to remove the deficiencies of the lexical rules is the class inflected. Since this class by definition subsumes all inflected words, and by parallel to (3), it should be associated with the constrained feature structure (5).</Paragraph> <Paragraph position="4"> (5) [phon: 1, prefixes: {[phon: 3], [phon: 5], ..., [phon: 2n+1]}, base: 2, suffixes: {[phon: 4], [phon: 6], ..., [phon: 2k+2]}] &amp; phon_consists_of(1, {3, 5, ..., 2n+1}, 2, {4, 6, ..., 2k+2}) The definition of the relational constraint in (5) is actually the formal definition of inflection in the respective language 6. In particular, this allows the part-of-speech independent regularities of inflection to be expressed in one place of the language model rather than repeatedly, as with standard lexical rules (an example of such a regularity might be the infix -e- in the English words &quot;goes&quot; and &quot;potatoes&quot;). Having viewed the &quot;top&quot; part of the lexical hierarchy, let us now turn our attention to its bottom and middle.</Paragraph> <Paragraph position="5"> The leaf elements of the hierarchy are lexemes: feature bundles encoding the most idiosyncratic information about a particular word. 
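The constraint in (5) can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not the definition used in the paper: the affix sets of (5) are encoded as ordered lists, phon_consists_of is read as plain string concatenation (the paper leaves the actual definition to the respective language), and the -e- of the English example is treated as the first of two suffixes.

```python
# Hypothetical sketch: the function name follows the constraint in (5), but the
# list-based encoding and the affix analysis below are assumptions.

def phon_consists_of(phon, prefixes, base, suffixes):
    """Check that phon equals the prefixes, the base and the suffixes in sequence."""
    return phon == "".join(prefixes) + base + "".join(suffixes)


# One statement of the -e- regularity is shared by a verb and a noun.
E_S_SUFFIXES = ["e", "s"]

for word, base in [("goes", "go"), ("potatoes", "potato")]:
    print(word, phon_consists_of(word, [], base, E_S_SUFFIXES))
```

Because the constraint is stated once for the class inflected, the same check serves a verb and a noun, which is the part-of-speech independence noted above.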
As a rule, a lexeme consists of the base (&quot;root&quot;) of the word and of the semantic information associated with this base.</Paragraph> <Paragraph position="6"> In the typical case, a lexeme of an inflected word has two immediate superclasses.</Paragraph> <Paragraph position="7"> These correspond to the cross-classification of (inflected) words according to, first, the subcategorization requirements of the respective word (as in standard HPSG) and, second, the inflection class of the word. The respective subcategorization class then assigns to the word those subcategorization requirements which are not influenced by its morphology, as well as basic information about the nature of the subcategorization requirements which do undergo changes due to the morphological form of the word. The inflection class, in turn, ties together the morphological features of the word with the respective affix(es) in the form of a relational constraint, the constraint being in fact the (traditional) inflection table.</Paragraph> <Paragraph position="8"> Finally, when these two classes meet higher up in the hierarchy, the class where they meet defines, again via a relational constraint, the exact relation between the subcategorization and the morphological form of the word. The example (6) showing the verb &quot;accuse&quot; might clarify this. In the case of totally irregularly inflected words, it would of course be inadequate to postulate a class expressing the inflection of this one word only. The simplest solution in this case is to associate the respective relational constraint expressing inflection directly with the lexeme, cf. example (7) (technical variations concerning the base and affixes are possible, but insubstantial).</Paragraph> </Section> <Section position="5" start_page="825" end_page="825" type="metho"> <SectionTitle> 4. 
CONCLUSIONS </SectionTitle> <Paragraph position="0"> In this paper I have tried to show how relational constraints can take over the bulk of (if not all) the work which standard HPSG assigns to the operation of lexical rules.</Paragraph> <Paragraph position="1"> Such an approach has at least the following two major advantages: 1. it is much more straightforwardly related to standard lexicographic and morphological usage, a matter of remarkable theoretical and even greater practical importance; 2. the disadvantages of the lexical-rules approach to the lexicon mentioned in the first part of this paper (either inherent procedurality of the description or inherent redundancies in the description) disappear with the replacement of the (also formally poorly understood) lexical rules by the standard machinery of relational constraints.</Paragraph> <Paragraph position="2"> As a minor point in favour of the sketched approach, it might be noted that it also easily avoids such counterintuitive stipulations as the division of English verbs into the classes &quot;auxiliary&quot;, &quot;main&quot; and a singleton class &quot;have-as-abstract-verb&quot; (cf. Pollard and Sag,87, p.215), which was necessary because both the main verb &quot;have&quot; and the auxiliary &quot;have&quot; are of the same, idiosyncratic, inflection class. All that needs to be said on the given approach is that the two &quot;have&quot;s differ as to their contents and as to their subcategorization (i.e. they constitute two lexemes at the bottom of the lexical hierarchy), while simultaneously belonging to the same conjugation class, cf. (8).</Paragraph> <Paragraph position="3"> (8) [Hierarchy diagram: the class &quot;conjugation of have&quot; has two lexemes below it, the main verb with [content: [reln: possession]] and the aux verb with [content: [reln: &lt;empty&gt;]].] 
As for applications, a large computational lexicon of Czech, based on the ideas presented in this paper, is currently under preparation in the framework of a project aiming at the development of a prototype of a grammar-checker. This project is being carried out jointly with the Charles University in Prague.</Paragraph> </Section> <Section position="6" start_page="825" end_page="825" type="metho"> <SectionTitle> FOOTNOTES </SectionTitle> <Paragraph position="0"> 1 Cf. also the following complete quotation of Sect 8.3.</Paragraph> <Paragraph position="1"> Suggestions for further reading from (Pollard &amp; Sag,87) p.</Paragraph> <Paragraph position="2"> 218: The key ideas in this chapter (i.e. &quot;The Lexical Hierarchy and Lexical Rules&quot;) arise not from previous linguistic studies, but rather from algebraic approaches to datatype theory and the semantics of programming such as ...; similar ideas are widely employed in frame-based knowledge representation systems (see ...).</Paragraph> <Paragraph position="3"> I just want to point out that hardly anyone would protest if &quot;the key ideas ... arise not ONLY from previous linguistic studies, but ALSO from ...&quot;. However, the current wording explicitly excludes any reference to previous linguistic work.</Paragraph> <Paragraph position="4"> 2 In accordance with the tradition going back at least to de Saussure (Saussure,15), a sign is an object relating a (phonetic) form and a (semantic) content.</Paragraph> <Paragraph position="5"> 3 In different terminology, &quot;content vs. function&quot;.</Paragraph> </Section> <Section position="7" start_page="825" end_page="825" type="metho"> <SectionTitle> 4 Cf. (Pollard &amp; Sag,87), p.209 ff. 
</SectionTitle> <Paragraph position="0"> 5 Though (2) corresponds rather well to the traditional picture, it is worth remarking that a simpler hierarchy can be achieved by making phrase, word, word-base and morpheme subtypes of sign (in a division different from the autosemantic/synsemantic opposition) while making morph and smaller units sub-classes of non-sign. For our purpose, this point is, however, immaterial.</Paragraph> <Paragraph position="1"> 6 Taken strictly, as it is, the definition does not take introflective languages into consideration. This, however, can be remedied simply by reformulating the constraint accordingly.</Paragraph> </Section> </Paper>