<?xml version="1.0" standalone="yes"?> <Paper uid="C92-4216"> <Title>ANALYSING DICTIONARY DEFINITIONS OF MOTION VERBS</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> ANALYSING DICTIONARY DEFINITIONS OF MOTION VERBS ANTONIETTA ALONGE </SectionTitle> <Paragraph position="0"> Università di Pisa - Dipartimento di Linguistica, Via S. Maria, 36 - 56100 PISA - Italy. Tel. +39-50-560481; Fax +39-50-589055; e-mail: LEM1NTER @ ICNUCEVM.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> 1. Introduction </SectionTitle> <Paragraph position="0"> In recent years many researchers working in the field of NLP have turned to large text resources such as machine-readable dictionaries (MRDs) and text corpora in order to cope with the major problem of building a large computational lexicon. Within the Esprit project &quot;Acquilex&quot; we are exploring the possibility of exploiting existing mono- and bi-lingual MRDs of four languages in order to develop a multi-lingual lexical knowledge base (LKB) that is maximally reusable (by different researchers, for different NLP tasks). Indeed, we are finding evidence that important semantic and syntactic information can be semi-automatically extracted from these sources with a significant saving in resources, compared to building a lexicon by hand.</Paragraph> <Paragraph position="1"> The task of extracting lexical information from MRDs is a non-trivial one, in that such information is mostly implicit in dictionaries and completely new techniques and methodologies have to be developed in order to achieve this goal. Furthermore, it is necessary both to take into account the peculiar characteristics of dictionary structures and to make theoretical hypotheses about the kind of information which can be useful for NLP systems and which therefore should be extracted. 
Indeed, these are two of the major issues being dealt with within Acquilex (another being the representation of the knowledge extracted). Our work has therefore been guided by theoretical hypotheses and empirical observations at the same time.</Paragraph> <Paragraph position="2"> First of all, we assumed the centrality of the lexicon in the organization of natural languages. Then, given the growing interest of different theoretical frameworks in semantic phenomena, and the fact that contemporary syntactic theories seem to converge on the hypothesis that syntactic structure is, to a large extent, determined by word meaning, we tried to see whether it was possible to identify, within our dictionaries, the kind of semantic information on verbs which has been described as determining fundamental syntactic behaviours of the verbs themselves (cf. Levin 1985; Levin and Rappaport 1991). Finally, we tried to follow some indications, provided in works such as Pustejovsky (1989) or Boguraev and Pustejovsky (1990), on the kinds of lexical data which should be sought within MRDs and other computerized sources in order to deal with the problems facing the computational linguist who aims at building components for NLP. 
According to Boguraev and Pustejovsky (1990: 39), the following information should be identified within sources of lexical data such as MRDs: argument structure; event structure (Aktionsart); qualia structure (Pustejovsky (1989)); lexical inheritance structure.</Paragraph> <Paragraph position="3"> Dictionary definitions are generally structured so that it is possible to identify two fundamental parts within them: the &quot;genus term&quot;, which is connected with the entry-word by means of an IS-A (or taxonomical) relation, and the &quot;differentia&quot; part, which indicates what differentiates the entry from its hypernym. Within Acquilex we are improving and developing techniques aimed at exploiting both the taxonomical organization of information in dictionaries and the fact that recurrent structures and patterns are found in the differentia; such patterns are being identified through a pattern-matching procedure (which operates on the output of the syntactic analysis of definitions) in order to associate them with corresponding relations or semantic / conceptual categories. For descriptions of methodologies for extracting taxonomies from MRDs, see Calzolari (1984) and Chodorow, Byrd and Heidorn (1985); for discussion of this way of analysing the differentia, see Calzolari (1991). In the following, we present research being carried out within this project which aims at developing techniques for extracting different kinds of information related to motion verbs.</Paragraph> <Paragraph position="4"> 2. Information on motion verbs within dictionaries Motion verbs have been the subject of several studies by linguists because they present particularly interesting semantic and syntactic characteristics. In particular, even if they are often considered a coherent semantic class (and indeed we speak of &quot;motion verbs&quot; as a whole), we can find verbs displaying different semantic features and syntactic behaviour in this class. 
With our research, therefore, we have tried to identify information on this class of verbs which can be used to classify them further according to the semantic characteristics which subsets of them present and which are connected with different syntactic behaviours. [ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992 1315 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992]</Paragraph> <Paragraph position="5"> By analysing our mono-lingual Italian dictionaries and the Collins bi-lingual, it is possible to extract the following kinds of syntactic and semantic information on motion verbs by means of semi-automatic procedures: 1) transitivity / intransitivity / reflexivity; 2) Aktionsart (or &quot;lexical aspect&quot;); 3) unaccusativity or unergativity (for intransitive verbs); 4) components of meaning, typical subjects, thematic roles.</Paragraph> <Paragraph position="6"> While the information in the first point is explicitly coded within our dictionaries, a procedure was developed in order to extract information on Aktionsart (i.e., to classify verbs in a Vendlerian fashion (Vendler (1967))). The procedure exploits the taxonomical organization of data in dictionaries and the possibility of inheriting information as a consequence of the IS-A link. After classifying &quot;genus term&quot; verbs, we make each hyponym inherit the Aktionsart class from its superordinate verb unless some specific patterns, recognized through a pattern-matching procedure, are found in the differentia part of its definition. 
If this is the case, the entry-verb considered is classified in a different way, according to specific rules stated in advance (see Alonge (1991)).</Paragraph> <Paragraph position="7"> Information on unaccusativity or unergativity of intransitive motion verbs (in Perlmutter's (1978) sense) is easily extracted for Italian by taking into consideration the auxiliary selected by a verb (unaccusatives take essere (to be)), which is indicated in the Collins bi-lingual.</Paragraph> <Paragraph position="8"> Finally, by analysing the differentia part of definitions, we find information on components of meaning of verbs, typical subjects and also thematic roles. Components of meaning and typical subjects are identified by finding recurrent patterns within the differentia which clearly refer to specific semantic categories. For the time being, a detailed analysis has been carried out on the verbs within the taxonomy of muoversi (intransitive to move) in GRZ (244 entries). We first examined manually some of the definitions of verbs within the taxonomy and identified recurrent patterns connected with components of meaning; these components were therefore considered potentially relevant for describing the semantics of the whole class of verbs, even if not every pattern was found in each definition. The following are examples of the patterns found within the definitions analysed and of the components of meaning which were connected with them: * MANNER of MOTION: con / come / a NP; AdvP; V-ing * GOAL: a / incontro a / verso NP; AdvP * SOURCE: [illegible] * PATH: da ... a / da ... verso NP * MEDIUM: per via di / in / [illegible] NP * PURPOSE: per VP * TYPICAL SUBJECT: detto di / si [illegible] NP. A pattern-matching procedure is now being implemented which identifies patterns within the differentia and relates them to semantic categories (or typical subjects). 
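As a minimal sketch of this kind of procedure, the Python fragment below maps surface cues in a differentia to component labels. It is an illustration only: the cue table CUES and the function components are hypothetical names, the cues cover just part of the pattern list, and regular expressions over raw text stand in for the pattern matching over parsed definitions described here.

```python
import re

# Hypothetical cue table, loosely based on the patterns listed above.
# The procedure described in the text works on parsed definitions;
# regular expressions over raw text are a simplification for this sketch.
CUES = [
    ("PATH", re.compile(r"\bda\b.+?\b(?:a|verso)\b")),
    ("MANNER", re.compile(r"\b(?:con|come)\b")),
    ("GOAL", re.compile(r"\b(?:incontro a|verso)\b")),
    ("MEDIUM", re.compile(r"\b(?:per via di|in)\b")),
    ("TYPICAL_SUBJECT", re.compile(r"\bdetto di\b")),
]


def components(differentia: str) -> list:
    """Return the labels of all cues found in a differentia string."""
    found = []
    rest = differentia.lower()
    for label, cue in CUES:
        if cue.search(rest):
            found.append(label)
            # Blank out matched text so that a later cue cannot reuse it
            # (e.g. "verso" inside a "da ... verso" PATH expression).
            rest = cue.sub(" ", rest)
    return found
```

With this table, a differentia such as "con irrequietezza, con violenza" yields only MANNER, and "da un luogo verso un altro" yields only PATH, because the PATH cue is tried before the GOAL cue.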
Furthermore, each verb inherits the components of meaning connected with its superordinate verb(s).</Paragraph> <Paragraph position="9"> Agitarsi (to toss) is defined (in GRZ, sense 1) as &quot;muoversi con vivacità, con irrequietezza, con violenza&quot; (&quot;to move vivaciously, restlessly, violently&quot;). The genus term muoversi simply refers to &quot;the fact of motion&quot; and is not connected with any other particular component of meaning; the PPs which we find in the differentia, instead, refer to MANNER of MOTION, so that our procedure will analyse agitarsi as connected with a MANNER of MOTION component of meaning. Fuggire (to escape), in turn, is defined (GRZ, 1) as &quot;allontanarsi di corsa, per lo più per evitare un pericolo o un danno&quot; (&quot;to go away at a run, mostly to avoid a danger or harm&quot;): since the genus term allontanarsi is related to a GOAL component of meaning, fuggire itself will be connected with the same component of meaning plus a MANNER of MOTION component (di corsa) and a PURPOSE component (per evitare ...).</Paragraph> <Paragraph position="10"> Sometimes different semantic categories may be indicated by the same lexical category / pattern, so that it was necessary to define lists of specific (sequences of) words to be connected with one of the components of meaning, in order to distinguish instances of it from instances of the other component related to the same pattern. For instance, the same preposition can be used in Italian to express different meaning categories. This is the case of the preposition a (at, in, to...): when coupled with most nouns (or noun phrases) it indicates GOAL (with motion verbs); however, when it is used in conjunction with certain nouns, idiomatic expressions are formed which refer to MANNER of MOTION. 
Therefore, in order to recognize the latter cases, we identified a limited list of such idiomatic expressions (which can be found with motion verbs): when one of these expressions is found within a definition, the verb defined will have a MANNER of MOTION component of meaning; otherwise, it will have a GOAL component. Andare (to go), e.g., is found together with &quot;a cavallo&quot; (riding) within the definition of cavalcare (GRZ, 1) (to ride), and it is also found with &quot;a letto&quot; (to bed) within the definition of coricarsi (GRZ, 1) (to go to bed). In the first definition the idiomatic expression &quot;a cavallo&quot; indicates MANNER of MOTION, while in the second definition we have the indication of a GOAL; therefore, cavalcare will be connected with a MANNER of MOTION component, while coricarsi will refer to motion towards a GOAL. Similar procedures are being applied to other cases of ambiguity, and it seems possible to claim that, even if it is necessary to do some work by hand, the utilization of MRDs and of such semi-automatic methodologies of analysis represents a significant saving in both time and resources.</Paragraph> <Paragraph position="11"> The same information on components of meaning was also utilized to identify thematic proto-roles, according to Dowty's (1988) proposal, which has been adopted within Acquilex (Sanfilippo (1991)). 
Dowty identified two sets of properties which contribute to the definition of the &quot;prototypical&quot; agent and patient roles and which are entailed by verb meaning:</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> * CONTRIBUTING PROPERTIES FOR THE PROTO-AGENT ROLE: </SectionTitle> <Paragraph position="0"> volition, sentience (and / or perception), causes event, movement</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> * CONTRIBUTING PROPERTIES FOR THE PROTO-PATIENT ROLE: </SectionTitle> <Paragraph position="0"> change of state, incremental theme, causally affected by event, stationary.</Paragraph> <Paragraph position="1"> The argument having the highest number of proto-agent properties entailed by the meaning of the verb, and inherited by default, is to be associated with the proto-agent role; the argument of a transitive verb to which the highest number of proto-patient properties can be ascribed (inherently via entailment relations, and by default), instead, is to be associated with the proto-patient role (cf. 
Sanfilippo (1991)).</Paragraph> <Paragraph position="2"> Thus, since fundamental questions about the identification, individuation, and even the theoretical status of &quot;traditional&quot; thematic roles remain unresolved, we decided to determine the semantic content of these basic roles by taking into account properties which are needed for verb classification and which can be identified through the analysis of definitions described above.</Paragraph> <Paragraph position="3"> By examining the verbs within the taxonomy of muoversi, we saw that, even if they can be strict intransitives, strict transitives, or intransitives taking an oblique object, they all imply a subject argument (a proto-agent) which corresponds to the &quot;moving object&quot; and for which either the manner of motion or the direction (and therefore a change of position) can be inherently specified. The information that it is the subject of these verbs which is moving is inherited from the genus term muoversi; the specification of the manner of motion or the change of position is found within the differentia of definitions and is used to encode more information about the &quot;moving object&quot; itself. For example, if we take into consideration the definitions of the verbs andare (to go) and oscillare (to swing) given below, we may see that with the former verb the proto-agent moves along a path, while with the latter the manner of motion of the proto-agent is inherently specified: * andare : muoversi da un luogo verso un altro (GRZ, 1) (to move from one place to another) * oscillare : muoversi alternamente in qua e in là o in su e in giù (GRZ, 1) (to move alternately here and there or up and down) 3. Data from dictionaries and theoretical works The data we have extracted from dictionaries were indicated as necessary for a computational lexicon within theoretical works (cf. 
above).</Paragraph> <Paragraph position="4"> Furthermore, the data related to components of meaning can be compared to the results of theoretical works in which connections between semantic and syntactic characteristics of motion verbs were identified, in order both to verify the validity of hypotheses put forward on the basis of observation of limited data and to derive information on the syntactic behaviour of large numbers of verbs.</Paragraph> <Paragraph position="5"> Talmy (1985) investigated in depth the relationships between surface expressions related to motion verbs and their semantics across languages. He identified different ways of &quot;conflating&quot; components of meaning across languages which relate to the different syntactic configurations allowed. The ability to refer to these lexicalization patterns, some of which are universal across languages while others vary systematically, defining typologies of languages, can help structure a multi-lingual LKB of the sort we are building. Actually, in dictionary definitions we find useful information on the kind of &quot;conflation&quot; of components of meaning which is allowed in Italian, and it is possible to compare our data with Talmy's analyses. While English (but also Chinese, etc.) can express at once the &quot;fact of Motion&quot; and its manner, (post-Latin) Romance languages cannot, according to Talmy. However, in Italian many verbs express at once &quot;the fact of MOTION&quot; and MANNER; what they do not often express is what Talmy (1985: 141) calls &quot;translational motion&quot;, i.e. change of position, and MANNER together. For most verbs in Italian what is relevant is either the manner of motion, as, e.g., for camminare (to walk), or the change of position, as, e.g., for coricarsi (to go to bed). Actually, in dictionary definitions 
we find information on this kind of &quot;conflation&quot; of components of meaning: * camminare : andare [illegible] (GRZ, 1) (the pattern following andare refers to MANNER of MOTION); * coricarsi : andare a letto (GRZ, 1) (the pattern a letto indicates GOAL).</Paragraph> <Paragraph position="6"> Nevertheless, in Italian there are some manner of motion verbs which behave somewhat differently. They are manner of motion verbs which have both unergative and unaccusative uses.</Paragraph> <Paragraph position="7"> Correre (to run) has both an unergative and an unaccusative use; when it is used in the unaccusative form it refers both to MANNER of MOTION and GOAL, as can be seen in the examples below: * Giovanni ha corso per tre ore * Giovanni è corso a casa. Both sentences can indeed be translated as &quot;Giovanni ran&quot;, but only with the unaccusative form (with the auxiliary essere) is the goal expression &quot;a casa&quot; allowed. Information on this characteristic of correre can be found within DMI, where, even if correre is defined as &quot;andare velocemente&quot; (&quot;to go fast&quot;, where the adverb refers to MANNER of MOTION), it is also stated that when this verb is used with the auxiliary essere (and is therefore unaccusative) it implies a GOAL.</Paragraph> <Paragraph position="8"> Italian may express at once, as noted above, the fact of motion and a GOAL, but also motion plus SOURCE or PATH, with verbs such as entrare, uscire, passare, salire, etc., which have direct counterparts in English, even if &quot;these verbs (and the sentence patterns they call for) are not the most characteristic of English&quot;, according to Talmy (1985: 72); indeed verbs such as enter, exit, pass, descend, etc. are borrowings from Romance. The fact that this kind of conflation is typical of Italian (and Romance) but not of English is further demonstrated by the existence of verbs such as the above-mentioned coricarsi, or esulare (to go into exile), etc. 
which have no directly corresponding verbs in English. Also with respect to these verbs we find useful information within dictionary definitions, as can be seen in the examples below: (patterns indicating PATH / GOAL have been underlined).</Paragraph> <Paragraph position="9"> According to Talmy, then, motion can never be conflated with PURPOSE. Nevertheless, among the verbs we analysed there are some which seem to incorporate a purpose together with motion. Passare is defined (in GRZ, 1) as: &quot;muoversi attraversando, percorrendo un luogo, per andare in un altro&quot; (&quot;to move crossing one place, in order to go to another one&quot;), where &quot;per andare in&quot; refers to the purpose of the action indicated by the verb. If passare has to be seen as having this component of meaning, then all its hyponyms should inherit it, and so there would be some verbs indicating at once motion and purpose.</Paragraph> <Paragraph position="10"> Levin and Rappaport (1991) further investigated intransitive motion verbs and claimed that, on the basis of their syntactic behaviour / semantic features, it is possible to distinguish among three classes of intransitive motion verbs: &quot;arrive&quot; verbs, &quot;run&quot; verbs and &quot;roll&quot; verbs. Furthermore, these classes can be related systematically to the Vendlerian classification (based on Aktionsart distinctions; Vendler (1967)).</Paragraph> <Paragraph position="11"> Information found within intransitive motion verb definitions, therefore, was also used for dividing verbs according to the distinctions identified by Levin and Rappaport.</Paragraph> <Paragraph position="12"> By analysing our data we found evidence that the component of meaning which is relevant for identifying &quot;arrive&quot; verbs is that of GOAL (DIRECTION); furthermore, the component of GOAL seems to be relevant also when it is missing. That is, the lack of such a component indicates manner of motion, even if we do not find patterns related to MANNER within the definition. 
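This decision rule, together with the auxiliary-based test applied to manner of motion verbs, can be sketched in Python as follows. It is a hedged illustration, not the project's implementation: the function name motion_class is hypothetical, and the mapping of unaccusative manner verbs to the roll class and of unergative ones to the run class is an assumption of the sketch.

```python
def motion_class(components, auxiliary):
    """Assign an intransitive motion verb to one of three classes.

    components: set of component labels found for the verb
    auxiliary:  "essere" or "avere", as recorded in the dictionary
    """
    if "GOAL" in components:
        # A GOAL (DIRECTION) component signals change of position.
        return "arrive"
    # No GOAL: treated as manner of motion, even without a MANNER cue.
    # Assumption of this sketch: unaccusative (essere) manner verbs are
    # "roll" verbs, unergative (avere) ones are "run" verbs.
    return "roll" if auxiliary == "essere" else "run"
```

A verb whose definition supplies only a MEDIUM, for instance, falls into the second branch, and its auxiliary then decides between the two manner classes.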
Volare (GRZ, 1) is defined as &quot;muoversi in aria&quot; (&quot;to move in the air&quot;), where &quot;in aria&quot; indicates the MEDIUM and not the manner of motion. However, the fact that we do not find an indication of a GOAL component seems sufficient for classifying the verb as a manner of motion verb and not a change of position one. In order to decide, then, whether it is a &quot;run&quot; or a &quot;roll&quot; verb (see above), we use the information on unaccusativity / unergativity, since in our dictionaries we do not find any references to the existence of control on the part of an agent. Actually, further study seems to be needed with respect to such a component of meaning and its relation to unaccusativity / unergativity, because by analysing our data we found unaccusative manner of motion verbs which seem to imply protagonist control (which would contradict the hypothesis put forward by Levin and Rappaport).</Paragraph> </Section> </Paper>