<?xml version="1.0" standalone="yes"?> <Paper uid="W99-0509"> <Title>An Overt Semantics with a Machine-guided Approach for Robust LKBs</Title> <Section position="4" start_page="63" end_page="64" type="metho"> <SectionTitle> 3 Overt Semantics to Predict Syntax: A Machine-guided Approach </SectionTitle> <Paragraph position="0"> The correspondence between semantic roles and syntactic complements is defined via a mapping (a rule). These mappings can be defined for large sub-classes of lexical entries. For example, the rule Att-Pred-Adj creates an entry which accepts in the semantic feature a concept from the subtree of ATTRIBUTE or an ATTITUDE, and which accepts both attributive (e.g., safe car) and predicative uses (e.g., the car is safe). In the case of an adjective mapped to a RELATION (e.g., MENTAL-OBJECT-RELATION), the preferred rule would be Att-Adj, generating an attributive reading (e.g., dental practice) and not a predicative one (*the practice is dental). By selecting the appropriate mapping for classes of entries, it is possible to hide the mapping from the acquirer, since these mappings are defined in a lexical class, not in an instance. As defined by an acquirer, an entry looks as follows: [key &quot;safe&quot;, syn Att-Pred-Adj, sem [name Safety-Attribute, range Safe]]. During compilation of the dictionary, the Att-Pred-Adj label is replaced with its definition, which makes explicit the co-reference between the subcategorization and the semantics. So far we have developed for the English lexicon about twenty syntactic patterns, which apply to a large number of semantic frames. In the case of adjectives, we have three rules: one for attributive adjectives, another for predicative adjectives, and a third for attributive adjectives used predicatively. In the case of nouns, we have developed four patterns. We presented above the labels of subcategorization patterns as they appear at acquisition time. At processing time, there is no difference between Obl1 and Obl2, which are both of type Oblique. Our machine-guided 
approach helps the acquirer to select a rule, as it only presents the relevant ones for a specific semantic type. For instance, in the case of a lexeme mapped to an OBJECT, no rules having obliques will be presented to the acquirer as de- The table above should be read as follows: the first column provides type examples for nouns; the second column (semantics) provides the list of semantic types that a noun can be, namely Obj (Object), Prop (Property) and Event; the third column (subcategorization) presents all subcategorizations a noun can subcategorize for; the fourth column (lexical class) concatenates the semantics and the subcategorization. For instance, 'EventNObjObl1ObjEventObl2Opt' is the lexical class of nouns which are of type 'Event' and therefore subcategorize for two obliques (Obl): the former must be Obj, whereas the latter can be either Obj or Event. These Obl can be optional (Opt). Acquirers may specify the preposition (head of the oblique or prepositional phrase). For instance, in the case of father, once an acquirer has mapped the word to the concept &quot;Father&quot;, which is a Prop (Property), the acquisition tool presents the subcategorization NObl1Opt. This allows the acquirer to select which preposition(s) can go with the range of Father (in this case &quot;of&quot; will be selected). This information is important in generation: one must specify, at acquisition time, whether or not one can say the bombing of Iraq, the bombing of Iraq by the US, or the bombing by the US. It also helps in word sense disambiguation. In the case of verbs, one can also define lexico-syntactic classes for different semantic classes. For instance, in the case of ASSERTIVE-ACT, the lexemes mapped to it will accept a comp clause (e.g., he said (that) he would come). One class of aspectuals subcategorizes for NPs (e.g., I started a new book) and xcomps (e.g., I started reading/writing a new book), and accepts the intransitive alternation when the grammatical object is of type Event (e.g., the 
surgery started very late).</Paragraph> </Section> <Section position="5" start_page="64" end_page="64" type="metho"> <SectionTitle> 4 Propagation of Lexicons </SectionTitle> <Paragraph position="0"> In this section, we briefly discuss how to extend a lexicon using derivational morphology and off-the-shelf resources: WordNet (Miller, 1990), to propagate the English lexicon with synonyms, and Levin's database of subcategorizations and alternations for English verbs (Levin, 1993), to encode syntactic information in the verb entries.</Paragraph> <Section position="1" start_page="64" end_page="64" type="sub_section"> <SectionTitle> 4.1 Morpho-semantics for Derivational Morphology </SectionTitle> <Paragraph position="0"> We refer the reader to Viegas et al. (1996) for the details on this type of acquisition and the theoretical background of Lexical Rules (LRs). To sketch this operation briefly: applying morpho-semantic LRs to the entry for the Spanish verb comprar (buy), our acquisition system automatically produced 26 new entries (comprador-N1 (buyer), comprable-Adj (buyable), etc.). This includes creating new syntax, semantics and syntax-semantic mappings with correct subcategorizations and also the right semantics. For instance, the lexical entry for comprable will have the subcategorization for predicative and attributive adjectives, and its semantics adds the attribute FEASIBILITY-ATTRIBUTE to the basic meaning BUY of comprar. Viegas et al. (1996) describes about 100 morpho-semantic LRs, which were applied to 1056 verb citation forms with 1,263 senses among them. The rules helped acquire an average of 26 candidate new entries per verb sense. This produced a total of 31,680 candidate entries, with an average of over 90% and 85% correctness in the assignment of syntax and semantics respectively. LRs constitute a powerful tool to extend a core lexicon from a monolingual viewpoint. We present other ways of extending lexicons, from monolingual (next paragraph) and multilingual (Section 5) 
perspectives.</Paragraph> </Section> </Section> <Section position="6" start_page="64" end_page="64" type="metho"> <SectionTitle> 4.2 Using WordNet </SectionTitle> <Paragraph position="0"> WordNet has been used as follows. We extracted the synsets associated with a lexeme using fuzzy string matches between, on the one hand, the value of the ontological concept (e.g., DESIRE), its definition (e.g., for DESIRE, &quot;to want something&quot;) and the concept and definition of its corresponding ISA concept (e.g., INTEND) and, on the other hand, the direct hypernyms and hyponyms for the lexeme. For expect, mapped to the ontological concept DESIRE, our algorithm kept only one synset (hope, expect, trust, desire) for expect and the following synsets for its hypernyms: wish, desire, want. The output of our automatic procedure and manual filtering is illustrated below for the ontological concept DESIRE, along with the synonyms from WordNet belonging to the same ontological class. We should mention that this step also involved some manual filtering by acquirers. We used a machine-guided mode to help the acquirer in this task. This type of filtering was done very quickly, mainly because WordNet is organized on a semantic basis.</Paragraph> <Section position="1" start_page="64" end_page="64" type="sub_section"> <SectionTitle> 4.3 Using Levin's DB </SectionTitle> <Paragraph position="0"> One of the major problems in using Levin's database was filtering out homonyms, as classes in Levin's database are defined on the basis of the same subcategorization pattern (as seen in alternations) and not on a semantic basis, as shown by many researchers. The advantage of our approach is that it is semantic-based; this allows us to organize verbs into true (frame-based) semantic classes, with their associated sets of subcategorizations. Therefore, we can predict that all verbs belonging to a particular semantic class will have the same syntactic behavior. For instance, if one considers the semantic class of 
aspectual verbs which select a theme of type Event, e.g., begin, continue, finish, then one can minimally associate to any verb belonging to this semantic class the following subcategorizations: (a) NP-V-NP, as in John began his homework; (b) NP-V-XCOMP, as in John began to work/working. Note that the reverse is not necessarily true: verbs which accept (a) and (b) are not necessarily aspectuals, e.g., forget in I forgot the key or I forgot to bring the key.</Paragraph> </Section> </Section> <Section position="7" start_page="64" end_page="65" type="metho"> <SectionTitle> 5 Applicability to Other Languages </SectionTitle> <Paragraph position="0"> In this section, we briefly address what can be generalized to multiple languages. The methodologies described here are part of what is needed to build a multi-purpose LKB while keeping the costs of acquisition as low as possible.</Paragraph> <Section position="1" start_page="65" end_page="65" type="sub_section"> <SectionTitle> 5.1 Semantic Multilinguality </SectionTitle> <Paragraph position="0"> By mapping lexemes to concepts, it is possible to create lexicons for different languages at a minimum cost once a core lexicon has been acquired. This task can be further accelerated if one has access to bilingual dictionaries to semi-automate the translation task. Finally, if one has access to a rich structured ontology (as is the case in Mikrokosmos), then dynamic procedures (e.g., generalization, specialization) can help the acquirer in &quot;filling&quot; the gap in the case of lexico-semantic mismatches (e.g., cook, bake ~ cuire).</Paragraph> </Section> <Section position="2" start_page="65" end_page="65" type="sub_section"> <SectionTitle> 5.2 More Related Language Multilinguality: Morpho-semantics </SectionTitle> <Paragraph position="0"> All the LRs (e.g., LR2agent-of) developed for Spanish can be used to extend other languages, even unrelated ones; in other words, these rules are language independent. The morpho-semantic aspect of the LRs is, however, specific to 
particular languages. But, in order to benefit from the work done on morpho-semantic LRs, we separated the assignment of affixes from the assignment of LRs. In other words, if in Spanish LR2agent-of is assigned to, say, the suffix -dor, then by translating suffixes between languages (-dor → -eur in French), the French lexicon can be extended in the same way (comprador → acheteur). Again, this work will necessitate some manual checking because of some overgeneration, which cannot be accepted for generation. But overall, one can use the same methodology, the same LRs and the same engine to produce new entries.</Paragraph> </Section> <Section position="3" start_page="65" end_page="65" type="sub_section"> <SectionTitle> 5.3 Even More Related Language Multilinguality </SectionTitle> <Paragraph position="0"> The subcategorizations attached to a lexeme have an even more idiosyncratic behavior than lexical LRs. But here again, the rules we developed can be applied at least to family-related languages, and then filtered by a human. For instance, the Spanish word comer has the pattern np-v-np associated to it (e.g., Juan come una pera), so this same pattern will be attached to the translation of comer (eat), as in Juan eats a pear. However, going from Spanish to English, one misses all the alternations (Levin, 1993) not common in Spanish, such as John gave Mary a book.</Paragraph> </Section> </Section> </Paper>