<?xml version="1.0" standalone="yes"?> <Paper uid="W03-1402"> <Title>Encoding information on metaphoric expressions in WordNet-like resources*</Title> <Section position="4" start_page="3" end_page="4" type="metho"> <SectionTitle> 3 Metaphoric expressions in EWN-IWN </SectionTitle> <Paragraph position="0"> When looking for word sense distinctions within different lexical resources we see that these vary widely across resources.</Paragraph> <Section position="1" start_page="4" end_page="4" type="sub_section"> <SectionTitle> Different dictionaries </SectionTitle> <Paragraph position="0"> distinguish among different senses of words in a somewhat arbitrary way, since they are strongly influenced by the purpose of the resource (the target audience), and have different editorial philosophies with respect to 'lumping vs. splitting' of senses (Atkins, 1993; Kilgarriff, 1997). Dictionaries normally contain distinctions between 'literal' vs.</Paragraph> <Paragraph position="1"> 'figurative' meanings within a lexical entry. However, such information is in general, at best, 'incomplete': 1. information on metaphoric uses is not systematic in many sources, and different sources contain different information; 2. when information on metaphoric sense extensions is present, there is generally no clear indication of the connection between the 'basic' and the 'extended' senses; 3. data which could help to identify novel metaphoric expressions are not provided.</Paragraph> <Paragraph position="2"> First EWN and then IWN were built using dictionaries available in machine-readable form as source data; they therefore contain inconsistencies and gaps in the data, partly inherited from those dictionaries, in particular with respect to figurative language. 
Consider, for instance, the verb andare (to go): it has 17 senses in IWN (reported below), two of which are strictly motion senses while the others are figurative senses somehow derived from the two basic ones, with different degrees of proximity to the literal senses.</Paragraph> <Paragraph position="3"> We assume some sort of intuitive pre-theoretical notion of word-sense, which we are well aware can be disputed. Much research has been devoted to the issues of what a word-sense is and whether word-senses 'exist' at all and should be considered as the basic units of the lexicon. Although we agree with views according to which &quot;word senses exist only relative to a task&quot; (Kilgarriff, 1997: 1), and are at the same time attracted by proposals for 'coarse coding' (Harris, 1994), we still believe that a WN-like structure, taking the concepts and the synsets referring to them as the 'building blocks' of the (mental) lexicon, is both appropriate as a representation of lexical knowledge (with the basic idea of a net linking the concepts) and usable as a resource for NLP, provided that the possible uses and actual limits of such a resource are kept clear.</Paragraph> </Section> <Section position="2" start_page="4" end_page="4" type="sub_section"> <SectionTitle> Synset Definition </SectionTitle> <Paragraph position="0"> {andare 1, muovere 5, gire 1, ire 1} muoversi, spostarsi, senza meta o senza che la meta sia indicata (to move or to change one's place without a goal, or without a specified</Paragraph> <Paragraph position="2"> muoversi, spostarsi verso una meta più o meno chiaramente definita (to move, to change one's place toward a more or less clearly defined goal) {andare 3} essere collocato, essere destinato ad essere messo in un dato luogo (to be located or to be intended to be put in a specific place) {andare 4} sentirsi in un certo modo (to feel in a certain way) {andare 5} trasformarsi (to transform - reflexive) {andare 6, morire 1, dipartirsi 2, ...} 
cessare di vivere (to die, to cease living) {andare 7, correre 12} di moneta e simili, avere corso legale (of money, to be legal tender) {andare 8} dover essere (to have to be (done)) {andare 9, calzare 1} essere adatto, calzare (to fit (s.o.)) {andare 10, piacere 1, garbare 1, ...} essere gradito, piacevole (to like) {andare 11, precedere 2, progredire 2, ...} andare avanti, progredire in qualcosa (fig.) (to go ahead, to progress in something (figura-</Paragraph> <Paragraph position="4"> venire meno, dileguarsi (to fade away, to disappear) {andare 13} continuare in un'azione (seguito da un gerundio) (to continue doing something) {andare 14, estendersi 2, arrivare 6} estendersi fino (to extend to) {andare 15, dare 17, condurre 1, ...} dare accesso (to lead into) {andare 16, funzionare 1} adempiere la propria funzione (to work, to function) {andare 17, muoversi 4, spostarsi 2} (fig.) spostarsi (figurative - to move, to change one's opinion, etc.) Senses 5, 6, 11, 12, 13, 14, 15, 16 and 17 are clearly more directly derived from the two basic senses (either the first or the second): e.g., senses 5, 6, 11 and 12 can all be linked to the general 'CHANGE IS MOTION' conceptual metaphor; sense 13 to the</Paragraph> </Section> </Section> <Section position="5" start_page="4" end_page="6" type="metho"> <SectionTitle> 'ACTION IS MOTION' metaphor, etc. The </SectionTitle> <Paragraph position="0"> remaining senses also seem to be connected with the motion senses, although in a less direct way.</Paragraph> <Paragraph position="1"> Only two of the metaphoric senses are marked as 'figurative' and no indication is provided of the connection between each metaphoric sense and the basic literal sense it is derived from. Moreover, if we take into consideration dictionaries of Italian like Zingarelli or Garzanti, we find different sense definitions for andare: Zingarelli has 13 senses (with some information on connections between senses) and Garzanti has 11 (with no indication of sense connections). 
Finally, no information is provided, either in IWN or in the other resources, which could be used to automatically disambiguate novel metaphoric uses of the verb.</Paragraph> <Paragraph position="2"> If we then look for occurrences of andare in a corpus of Italian (the PAROLE corpus, partly available at http://www.ilc.cnr.it/pisystem/demo/demo_dbt/demo_corpus/index.htm; cf. Goggi et al., 2000), we find occurrences of the verb which can hardly be linked to the senses provided in our resources. Consider just two examples taken from this corpus: a. Borg è già tornato e se n'è già andato in un mondo tutto suo (Borg has already come back and he has gone into his own world) b. Altri sono andati con la memoria alle immagini televisive della guerra del Golfo (Others went with their memory to the television images of the Gulf war).</Paragraph> <Paragraph position="3"> These two uses of the verb are quite frequent in spoken language; however, they are not accounted for in the resources considered.</Paragraph> <Paragraph position="4"> When comparing corpus occurrences of words with information encoded in IWN, as in other lexical resources, one normally sees that there is a surprisingly high frequency of figurative senses in E.g., sense 8 is found in sentences like &quot;Questo lavoro va fatto subito&quot; (This work has to be done immediately), where andare expresses a duty. We might suppose the existence of a conceptual metaphor like &quot;TO FULFIL ONE'S DUTY IS TO</Paragraph> </Section> <Section position="6" start_page="6" end_page="6" type="metho"> <SectionTitle> MOVE TO A GOAL&quot;. This could be linked to a more general &quot;ACCOMPLISHING SOMETHING IS REACHING A </SectionTitle> <Paragraph position="0"> GOAL&quot; metaphor, again connected with the &quot;ACTION IS MOTION&quot; metaphor. 
Of course, this analysis needs to be taken further; in particular, among other cases, one should also take into consideration the use of venire (to come), which in its basic sense indicates the opposite direction to andare, in sentences like &quot;Questo lavoro viene fatto regolarmente&quot; (This work is done regularly).</Paragraph> <Paragraph position="1"> Lo Zingarelli 1999 in CD-ROM, 1999, Zanichelli, Bologna; Il nuovo dizionario italiano Garzanti, 1991, Garzanti, Milano. real texts but most of these senses are not described in such resources (cf. Nimb and Sanford Pedersen, 2000, for data identified within the SIMPLE EC project and the solutions proposed in that context). Alonge and Castelli (2002a) take into account corpus occurrences of the verb colpire (to hit/to strike/to shoot) and the noun colpo (blow/stroke/shot), and compare the results of this analysis with data found for these words within IWN, concluding that IWN lacks precise information on frequent metaphoric uses of colpire and colpo. Indeed, the data provided show that by analyzing a large general corpus various metaphoric expressions are clearly distinguishable which are not (consistently) identified in IWN or in other resources. Thus, how should these figurative senses be accounted for in a WN-like resource (in particular, in EWN/IWN)? Moreover, how should novel, potential uses of words be dealt with in a resource such as IWN? 
We believe that the ability to cope with these issues cannot be set aside if IWN, or a similar resource, is to be used for word sense disambiguation of 'real' texts.</Paragraph> </Section> <Section position="7" start_page="6" end_page="8" type="metho"> <SectionTitle> 4 Proposals for metaphors encoding in IWN/EWN </SectionTitle> <Paragraph position="0"> As already mentioned, by analyzing a large general corpus various well-established metaphoric expressions are clearly distinguishable which are not consistently encoded in IWN or in other resources.</Paragraph> <Paragraph position="1"> Since the necessity of adding corpora as sources for building computational lexicons is probably unquestionable, our main point is that one should deal with these issues by adopting a well-established and generally accepted theoretical framework like that proposed by Lakoff and Johnson (1980) and Lakoff (1993), within which a large system of conventional conceptual metaphors has been described. By adopting that perspective many subtle, but relevant, differences may be highlighted in a principled way (cf. Alonge and Castelli, 2002a; 2002b). These should be encoded in EWN/IWN at the synset level to account for already well-established figurative word senses. Of course, probably no lexical resource will ever be able to exhaustively account for the phenomenon which Cruse (1986) termed modulation, whereby &quot;a single sense can be modified in an unlimited number of ways for different contexts, each context emphasizing certain semantic traits, and obscuring and suppressing others&quot; (Cruse, 1986: 52). However, each resource should be designed so as to be as complete and coherent as possible. A more central issue to be tackled, however, is that of how to encode information on the systematic nature of conceptual metaphors, which determines the possibility of producing and/or understanding novel metaphoric uses of words. 
When we understand novel metaphoric expressions we make reference to a system of established mappings between concrete conceptual domains and abstract ones. That is, there is pre-existing knowledge which constrains our ability to produce and/or understand novel metaphoric expressions. For instance, a group of conventional conceptual metaphors which can be characterized as a subset of the more general 'CHANGES ARE MOVEMENTS' metaphor is the following: 'BIRTH IS ARRIVAL', 'LIFE IS BEING PRESENT HERE', 'DEATH IS DEPARTURE' (cf. Lakoff and Johnson, 1980).</Paragraph> <Paragraph position="2"> Thus, we can say, for instance (examples are ours): - Nostro figlio è arrivato (= è nato) dopo dieci anni di matrimonio.</Paragraph> <Paragraph position="3"> (Our child arrived (= was born) ten years after our wedding) - Lui non è più fra noi. (= è morto) (He is not with us anymore. (= he is dead)) - Se ne è andato (è morto) all'età di venti anni. (He went away (he died) when he was twenty.) In IWN (or in the dictionaries considered) we find the senses indicated in the examples encoded for essere and andare but not for arrivare, even though this sense of the verb is attested (although infrequent) in the PAROLE corpus: c. ... di figli ne sono arrivati troppi.</Paragraph> <Paragraph position="4"> (there arrived too many children).</Paragraph> <Paragraph position="5"> If we then look for the senses provided for another verb which we may expect to display the same regular sense extension as andare in the sense 'to die', namely lasciare (to leave), we do not find any relevant information in our lexical resources either, although this metaphoric sense of the verb also occurs once in our corpus: d. Mentre scrivo ci ha appena lasciato. 
La sua morte...</Paragraph> <Paragraph position="6"> (While I'm writing he/she has just left us.</Paragraph> <Paragraph position="7"> His/her death...).</Paragraph> <Paragraph position="8"> In fact, these metaphoric uses of arrivare and lasciare, although not frequent in our corpus (composed of texts taken from newspapers, magazines, essays, novels, etc.), are quite normal in everyday spoken language.</Paragraph> <Paragraph position="9"> In order to build a resource which actually accounts for our lexical-conceptual knowledge and can be used as a resource for NLP, we have to find a way to also encode knowledge about the mappings between conceptual domains that make the production of potential metaphoric expressions possible. This information should be encoded at a higher level than the synset level, since it is information on regular polysemy affecting whole conceptual domains.</Paragraph> <Paragraph position="10"> In IWN, as in EWN, we have three fundamental levels of representation of semantic information: * the synset level, where language-specific synset information is encoded; * the level of the linking to the Interlingual-Index (ILI - an unstructured list of WN 1.5 synsets) to which synsets from the specific wordnet point (by means of so-called 'equivalence-relations') in order to perform the linking between different language-specific wordnets; * the Top Ontology (TO), a hierarchy of language-independent concepts, reflecting fundamental semantic distinctions, which may (or may not) be lexicalised in various ways, or according to different patterns, in different languages: via the ILI, all the concepts in the language-specific wordnet are directly or indirectly (via hyponymy relations) linked to the TO.</Paragraph> <Paragraph position="11"> The figure below exemplifies the relations among the three levels.</Paragraph> <Paragraph position="12"> Note that in EWN/IWN not all the synsets are directly linked to the TO. In fact, only the so-called 'Base Concepts' (cf. 
Vossen, 1999) are explicitly linked to ILIs connected with Top Concepts. However, the links to Top Concepts are inherited by all the other synsets within the wordnets via hyponymy relations with the Base Concepts.</Paragraph> <Section position="1" start_page="7" end_page="8" type="sub_section"> <SectionTitle> Top Ontology </SectionTitle> <Paragraph position="0"> Since the distinctions at the level of the TO are language-independent, it is necessary to represent the regular metaphoric polysemy found in a specific language at a different level. Indeed, there are culture-constrained differences in the metaphor system (see, e.g., the differences linked to orientation reported by Lakoff and Johnson, 1980, such that, for instance, in some cultures the future is in front of us and in others the future is behind us) which should receive a representation at some other level.</Paragraph> <Paragraph position="1"> In EWN some cases of regular polysemy were dealt with at the level of the linking of each language-specific wordnet with the ILI. Via the ILI the generalizations over concepts were then projected to the TO. Generalizations were stated directly at the level of the ILI and automatically inherited by all the synsets which in a language-specific wordnet were linked to the ILI synsets involved in the generalizations themselves. An automatically added generalization could later be manually deleted if it did not apply to a specific language (cf. Peters et al., 1998). For instance, the lexeme scuola (school) in Italian has (among others) two related senses, one indicating the institution and the other the building. This is a case of regular polysemy since many words indicating institutions also indicate buildings in Italian (as, of course, in other languages). 
Once the Italian school-institution and the school-building synsets were linked to the appropriate synsets in the ILI, the system automatically added to both Italian synsets another equivalence link, called EQ_METONYM, to a kind of 'composite ILI unit', clustering the 'institution' and 'building' ILI synsets into a coarser-grained sense group. Thus, our synsets, via the ILI, were linked to tops in the TO indicating concepts in different domains. A similar operation was automatically performed for senses reflecting diathesis alternations for verbs (related by EQ_DIATHESIS), such as causative and inchoative pairs. If a kind of regular polysemy did not occur in our language, the automatically generated link to the relevant composite ILI unit had to be manually deleted.</Paragraph> <Paragraph position="2"> We think that an EQ_METAPHOR relation pointing to new composite ILI units should be created to account for regular metaphoric extensions of senses in EWN/IWN. Via the ILI links the connection between specific synsets in a language would also be shown at the TO level as a connection (mapping) between top concepts (linked to different conceptual domains). On the other hand, the mapping at the TO level could be used to infer which words might potentially display a certain metaphoric extension, when this is not encoded at the synset level. Indeed, the link to a Top Concept is inherited along taxonomies within the language-specific wordnets; thus all the synsets directly or indirectly connected (through hyponymy) with another synset would inherit the links to Top Concepts related to different conceptual domains.</Paragraph> <Paragraph position="3"> Thus, even when specific information on a possible metaphoric sense extension of a word is not encoded in the database, it would be possible to derive it. Consider the case of lasciare (to leave), mentioned above, and related conceptual metaphors. 
This verb has 9 senses in IWN (i.e., it is found within 9 synsets), one of which (sense 2) is defined as &quot;andarsene da un luogo temporaneamente o definitivamente&quot; (to go away from a place temporarily or definitively): this verb sense is a (direct) hyponym of the {partire 1, andare via 1} synset which, via an equivalence link to the {go, go away, depart, travel away} ILI synset, is connected with the Top Concepts 'BoundedEvent', indicating change of state; 'Location', indicating that the change referred to is a change of location; and 'Physical', indicating that the change of location involves 'physical entities'. As was done in EWN for other kinds of sense extensions, a 'composite ILI unit' should be created, clustering the 'departure' ILI synset (already linked to our {partire 1, andare via 1} synset) and the 'death' ILI synset and accounting for the 'DEATH IS DEPARTURE' conceptual metaphor: then, the Italian synset already manually linked to the 'departure' ILI synset would also be connected, through an EQ_METAPHOR relation, to the 'death' ILI synset.</Paragraph> <Paragraph position="4"> Consequently, the same synset would be, at the same time, connected both to the synset(s) indicating 'death' in Italian and to the relevant Top Concepts in the TO. All the hyponyms of {partire 1, andare via 1} would then inherit these connections; thus, lasciare would also display the same links, even if no specific information is encoded at the synset level. Again, cf. 
the figure below for a schematic representation of these relations.</Paragraph> <Paragraph position="5"> Note that for languages not displaying this sense extension the equivalence relation should be manually deleted.</Paragraph> <Paragraph position="6"> In this way, information on potential metaphoric uses of lasciare (and, of course, other words in the wordnet) could be retrieved, by going from Top Concepts to ILI synsets and then to language-specific wordnets.</Paragraph> </Section> </Section> </Paper>
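<!--
The inheritance mechanism proposed in Section 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class WordnetSketch, its method names and the plain label 'death' for the target ILI record are hypothetical and do not belong to EWN/IWN or to any existing wordnet software. Only the synset {partire 1, andare via 1}, its hyponym lasciare 2, the ILI synset {go, go away, depart, travel away} and the 'DEATH IS DEPARTURE' metaphor are taken from the text.

```python
class WordnetSketch:
    """Toy wordnet (hypothetical): hyponymy, ILI links and EQ_METAPHOR links."""

    def __init__(self):
        self.hypernym = {}     # synset label: direct hypernym label (or None)
        self.ili = {}          # synset label: ILI record it is linked to
        self.eq_metaphor = {}  # source ILI record: (metaphor name, target ILI record)

    def add_synset(self, synset, hypernym=None, ili=None):
        self.hypernym[synset] = hypernym
        if ili is not None:
            self.ili[synset] = ili

    def add_eq_metaphor(self, source_ili, target_ili, metaphor):
        # Assumption: one metaphoric target per source ILI record.
        self.eq_metaphor[source_ili] = (metaphor, target_ili)

    def delete_eq_metaphor(self, source_ili):
        # Manual deletion for a language that lacks the sense extension.
        self.eq_metaphor.pop(source_ili, None)

    def inherited_metaphors(self, synset):
        """Collect EQ_METAPHOR links found on the synset or on any hypernym."""
        found, seen = [], set()
        while synset is not None and synset not in seen:
            seen.add(synset)  # guards against accidental cycles
            ili = self.ili.get(synset)
            if ili in self.eq_metaphor:
                found.append(self.eq_metaphor[ili])
            synset = self.hypernym.get(synset)
        return found


DEPART_ILI = "go, go away, depart, travel away"

wn = WordnetSketch()
wn.add_synset("partire 1, andare via 1", ili=DEPART_ILI)
wn.add_synset("lasciare 2", hypernym="partire 1, andare via 1")
wn.add_eq_metaphor(DEPART_ILI, "death", "DEATH IS DEPARTURE")

# lasciare 2 has no link of its own; it inherits one from its hypernym.
print(wn.inherited_metaphors("lasciare 2"))
```

In this sketch the metaphoric link is stored once, on the ILI record, and is looked up by walking the hypernym chain, which mirrors the idea that lasciare would display the link even though nothing is encoded on its own synset. Calling delete_eq_metaphor on the source ILI record corresponds to the manual deletion foreseen for languages that do not display the extension.
-->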