<?xml version="1.0" standalone="yes"?> <Paper uid="T75-2007"> <Title>COMMENTS ON LEXICAL ANALYSIS*</Title> <Section position="1" start_page="0" end_page="33" type="abstr"> <SectionTitle> COMMENTS ON LEXICAL ANALYSIS* </SectionTitle> <Paragraph position="0"> How lexical information should be represented in a computer program for processing natural language depends both on the goals that the program is intended to achieve and on the lexical information itself. Although programs can be imagined that might use lexical information in different ways, the information base that is exploited must be invariant over alternative programs.</Paragraph> <Paragraph position="1"> The present paper is concerned with the lexical information that must be represented, rather than with programming devices for representing it. First, an analysis scheme will be illustrated through a study of a single English verb. Then the scheme will be used as background for a discussion of some fundamental theoretical issues.</Paragraph> <Paragraph position="2"> Hand: An Exercise in Lexical Analysis Consider the verb &quot;hand&quot; as it is used in: (1) a. She handed her hat to him.</Paragraph> <Paragraph position="3"> b. She handed him her hat.</Paragraph> <Paragraph position="4"> A paraphrase of (1) that captures all of the components of meaning to be discussed here is: (2) She had her hat prior to some time t at which she used her hand to do something that caused her hat to travel to him, after which time he had her hat.</Paragraph> <Paragraph position="5"> The difference between (1a) and (1b) is usually regarded as syntactic, (1b) deriving from the structure underlying (1a) as a consequence of a dative-movement transformation that inverts the order of the direct and indirect objects and deletes &quot;to&quot;.
Some people, however, detect a difference in meaning: &quot;She handed her hat to him,&quot; they say, merely suggests that he took it, whereas &quot;She handed him her hat&quot; asserts that he took it -- the sense expressed in (2). If one respects this difference in meaning, and also holds to the semantic neutrality of such grammatical transformations as dative movement, then presumably one must distinguish two different meanings of &quot;hand&quot; -- one resembling &quot;offer&quot; and another &quot;offer-and-take.&quot; If one does not respect this meaning difference, both (1a) and (1b) have the &quot;offer-and-take&quot; sense. In either case, it is the sense paraphrased in (2) that will be considered here.</Paragraph> <Paragraph position="6"> Let the verb &quot;hand&quot; be represented as an operator, HAND, taking three arguments: the grammatical subject x, the indirect object y, and the direct object z: HAND(x,y,z). Then (1) can be represented (to a first approximation) by:</Paragraph> <Paragraph position="7"> *Preparation of this paper has been supported by grants to The Rockefeller University from the Grant Foundation and from the Public Health Service, GM21796. The style of lexical analysis that is illustrated here was developed in collaboration with P.N. Johnson-Laird and will appear in more detail in Miller and Johnson-Laird (in press).</Paragraph> <Paragraph position="9"> WOMAN, MAN, and HAT are not uninteresting concepts -- in particular, men and women &quot;have&quot; hands (inalienable possession), whereas hats do not, and men and women can &quot;have&quot; hats (either accidental possession or ownership), but not vice versa -- but the present discussion is confined to HAND, which will be analyzed to illustrate the need for certain very general lexical concepts.</Paragraph> <Paragraph position="10"> HAPPEN: Consider first the temporal shape of the handing episode in (1). It begins in the state: &quot;she has her hat and he does not have it&quot;.
Then an event occurs at time t which results in a change of state.</Paragraph> <Paragraph position="11"> And the episode ends in the state: &quot;she does not have her hat and he does have it&quot;. This characterization raises two questions: how to represent changes of state, and how to reduce the redundancy of the state descriptions.</Paragraph> <Paragraph position="12"> A statement-forming operator R_t (Rescher &amp; Urquhart, 1971) can be used to represent changes of state. R_t takes a temporally indefinite statement S and forms a new statement R_t(S) to the effect that &quot;S is realized at t.&quot; In order to indicate a change of state at moment t, another operator -- call it HAPPEN -- is needed to form a new statement to the effect that &quot;notS is realized at t-1 and S is realized at t&quot;:</Paragraph> <Paragraph position="14"> HAPPEN is a very general operator, characteristic of verbs that denote events. Note that the first conjunct of (4) will ordinarily be presupposed; that is to say, &quot;S didn't happen&quot; is not ordinarily taken to mean &quot;R_t(S) for all t.&quot; The two state characterizations -- &quot;she had it and he didn't&quot; and &quot;she didn't have it and he did&quot; -- are clearly redundant.</Paragraph> <Paragraph position="15"> The fact that a hat cannot be in two places at the same time (which must be part of a language user's general knowledge) merely compounds the redundancy of such state descriptions for double-object verbs of motion. However, it is a general characteristic of double-object verbs, not limited to motion verbs like &quot;hand&quot;, that, in some sense of the ambiguous verb &quot;have&quot;, the event ends with the indirect object &quot;having&quot; the direct object (Green, 1974). In the case of &quot;hand&quot;, either x or y, but not both, will have z at any moment t; since y has z after t, x cannot also have it. On the other hand, if x &quot;tells&quot; y some information z, x does not stop having z after t.
What is common to both, however, is that y does not have z before t. Thus, the simplest state description is S = HAVE(y,z), in which case the antecedent state would be notS. Since notS seems to be presupposed by HAND, (4) would be satisfied.</Paragraph> <Paragraph position="16"> On this analysis, therefore, some part of</Paragraph> <Paragraph position="18"> the meaning of &quot;hand&quot; must be:</Paragraph> <Paragraph position="20"> Actually, of course, two things happen in handing: the object changes location as well as possessor. Indeed, the former change seems to be causally related to the latter. So, in order to complete the analysis, it is necessary to consider also what happens at t that results in the transition from notHAVE(y,z) to HAVE(y,z).</Paragraph> <Paragraph position="21"> Roughly, x uses x's hand to do something, and what x does causes z to travel to y.</Paragraph> <Paragraph position="22"> This paraphrase introduces four new operators -- USE, DO, CAUSE, and TRAVEL -- which can combine as follows to provide additional parts of HAND:</Paragraph> <Paragraph position="24"> Because the concepts associated with these operators -- instrumentality, agency, causality, and motion -- are required in the analysis of many English verbs, they will be discussed individually.</Paragraph> <Paragraph position="25"> USE: The first conjunct of (6) corresponds to &quot;x uses hand to S_x&quot; or, more generally, USE(x,w,S_x) is &quot;x uses w to S_x,&quot; as in &quot;Tom used a knife to open the box.&quot; A fuller paraphrase would be: &quot;x intentionally does something S that causes w to do something S' that allows S_x.&quot; &quot;Use&quot; contrasts with instrumental &quot;with&quot; in being intentional: &quot;He broke the window with his elbow&quot; is not synonymous with &quot;He used his elbow to break the window.&quot; If we introduce an operator ACT to represent intentional acts, then USE can be defined:</Paragraph> <Paragraph position="27"> This
formulation adds two more operators -- ACT and ALLOW -- for which an account must be given.</Paragraph> <Paragraph position="28"> ACT: Intention will be taken as an unanalyzed primitive and represented by INTEND(x,g), where x is understood to be animate and g is understood to be a goal that x intends to achieve. It is further assumed that intentions can stand in a causal relation to behavior, so:</Paragraph> <Paragraph position="30"> ACT is the intentional counterpart of unintentional DO.</Paragraph> <Paragraph position="31"> DO: Let S denote a statement whose grammatical subject is x and whose predicate phrase is an event description (i.e., whose predicate entails HAPPEN). Then the relation between x and the event will be DO(x,S). DO is essentially a place holder.</Paragraph> <Paragraph position="32"> That is to say, DO will be restricted to contexts in which S can be a dummy variable -- see (7) for example, where DO(w,S') can be paraphrased as &quot;w does something.&quot; If S cannot be a dummy variable -- if what x does is relevant to the meaning -- then DO will be replaced by an operator that makes the action explicit.</Paragraph> <Paragraph position="33"> CAUSE: Causation is too complex for brief explication. The following formulation is simply lifted from Miller and Johnson-Laird:</Paragraph> <Paragraph position="35"> This formulation adds two more operators -- BEFORE and POSSIBLE -- for which accounts are needed. It is obvious that the plausibility of (9) must depend very heavily on POSSIBLE and on how a language user acquires general knowledge about what combinations of events are possible or impossible.
Lacking any clear psychological theory, POSSIBLE can be taken as a primitive, undefined term.</Paragraph> <Paragraph position="36"> ALLOW: &quot;Cause&quot; and &quot;allow&quot; are closely related, as a comparison of (9) with the following formulation should show:</Paragraph> <Paragraph position="38"> Note that, although it is impossible for S' to occur unless S has occurred, the occurrence of S does not ensure the subsequent occurrence of S'; that is to say, (S and notS') may well be possible.</Paragraph> <Paragraph position="39"> BEFORE: Sentences of the form &quot;S before S'&quot; can be interpreted to mean that there is some moment t such that S has been realized at t and S' has not yet been realized -- that there is an interval between the first realization of S and the first realization of S'. In terms of the</Paragraph> <Paragraph position="41"> TRAVEL: According to Miller (1972), verbs of motion constitute a semantic field of English having &quot;change of location&quot; or, more briefly, &quot;travel&quot;, as the core concept. It is sufficient evidence that something has traveled if one notices that it has appeared where it wasn't before, or if one notices that it is no longer where it was before.</Paragraph> <Paragraph position="42"> These conditions are accommodated by:</Paragraph> <Paragraph position="44"> for an appropriate choice of the location y as the origin or destination of motion.
The first disjunct represents &quot;z travels to y&quot; and the second &quot;z travels from y.&quot;</Paragraph> <Paragraph position="45"> Miller and Johnson-Laird adopt the convention of using A(B(x)) for sentential adverbials and (A(B))(x) for predicate adverbials, so the notation:</Paragraph> <Paragraph position="47"> reflects a judgment that &quot;to y&quot; is a predicate adverbial in &quot;z traveled to y.&quot;</Paragraph> <Paragraph position="48"> This analysis of TRAVEL, however, introduces still another operator, AT.</Paragraph> <Paragraph position="49"> AT: The form &quot;z is at y&quot; seems to mean that z is included in the characteristic region of interaction with y. Miller and</Paragraph> <Paragraph position="51"> The second conjunct is required to distinguish &quot;at&quot; from &quot;with&quot;. If z and y are commensurate, so that INCL is symmetrical between them, &quot;with&quot; is the preferred preposition.</Paragraph> <Paragraph position="52"> The two operators used to define AT can both be taken as primitive concepts. The relation of spatial inclusion that is supposed to be captured by INCL probably derives rather directly from perception of spatial relations. REGION, an operator indicating the characteristic region of interaction with its argument, derives from general knowledge of objects and their uses. HAND: Enough machinery has now been introduced to provide some rationalization for the following formulation: according to which x's action merely allows y to get z, rather than causes y to get z. As detailed as this analysis is, some omissions are obvious. For example, the noun &quot;hand&quot; introduced in (6) as the instrument x uses is not only undefined, but no explicit indication is given that the hand x uses is x's own hand.
This relation of inalienable possession could be introduced, of course, by adding an appropriate HAVE relation between x and the hand in question, but one feels that this goes beyond the limits of lexicology -- the fact that people have hands and enjoy a special user's privilege with respect to them is surely part of one's general knowledge about people. Also omitted is any recognition of one's intuition that, when x hands z to y, not only does x use his hand to deliver z, but y also uses his hand to receive it -- one would not ordinarily say, for example, &quot;I handed him his dinner&quot; if what one had done was to use one's hand to feed the food into his mouth. The characteristic region of interaction with the recipient is, in this case, his hand.</Paragraph> <Paragraph position="53"> Moreover, &quot;hand&quot; seems to implicate y's conscious acknowledgement that he has received z -- one would not ordinarily say &quot;I handed it to him&quot; if what one had done was to slip it surreptitiously into his coat pocket. Some of these features of &quot;hand&quot; could be introduced by defining GET in the third conjunct of (16) as something like: USE(y,hand,ACCEPT(y,z)), with an appropriate formula for ACCEPT. Also omitted are any explicit grounds for distinguishing handing from throwing -- something more would have to be said about the temporal shape of the transfer. No doubt there are still other omissions. Since the present discussion of &quot;hand&quot; is merely an expository device to motivate the introduction of certain very general semantic operators, however, the definition offered in (15) will be left incomplete.</Paragraph> <Paragraph position="54"> The general problem of completeness requires comment. How far one should go in adding such features to a lexical analysis is an important question of lexicology for which a principled answer could be most useful.
There is at present no way to refute the claim that, after all general components of meaning have been specified as fully as possible, there will always be a residuum of meaning unique to each particular lexical item.</Paragraph> <Section position="1" start_page="32" end_page="33" type="sub_section"> <SectionTitle> Some Theoretical Alternatives </SectionTitle> <Paragraph position="0"> Beginning with the English verb &quot;hand&quot;, several paths were followed in search of its lexical primitives. What turned up were such things as the symbols of first-order predicate calculus, the concept of state, the generic concept of possession, the modal operators R_t and POSSIBLE, the psychological operator INTEND, and the spatial operators INCL and REGION. This is not an exhaustive list, but it is illustrative. In each case, a level of general knowledge was reached that went beyond the usual bounds of a lexicon. In terms of these primitive operators it was then possible to offer formulas for such general and important operators as HAPPEN, USE, ACT, DO, CAUSE, ALLOW, BEFORE, TRAVEL, and AT; these, in turn, made possible a first approximation of HAND.</Paragraph> <Paragraph position="1"> What is the status of these various operators? In what sense are any of them lexically &quot;primitive&quot;? Or, to ask a closely related question, what is the status of the = that occurs in so many of the derivations? Two kinds of answer can be suggested, one too strong and the other too weak.</Paragraph> <Paragraph position="2"> Somewhere between them seems the best place to search.</Paragraph> <Paragraph position="3"> COMPLETE DECOMPOSITION: Probably the strongest claim one could hope to make would resemble the fundamental theorem of arithmetic, e.g., any lexical item can be expressed as a unique (Cartesian?) product of prime lexical items. In this case, = would be a reflexive, symmetric, and transitive relation -- synonymy -- between lexical items.
Since one reason for playing the lexical decomposition game seems to be the hope of reducing lexical variety to a</Paragraph> <Paragraph position="5"> relatively small set of concepts, some workers might also insist that the number of lexical primes be finite. (Note, incidentally, that the number of different primes into which an integer can be factored has little to do with the &quot;complexity&quot; of that integer, except in very special tasks; that is to say, for most tasks, reaction times to an integer would not correlate with its number of prime factors. Presumably the same could be said of the lexical version of this hypothesis.) COMPLETE INDIVIDUALITY: Probably the weakest claim one would care to make is that each individual lexical item is a unique prime in its own right; just as one person cannot be decomposed into some combination of other persons, so no lexical item can be decomposed into others. Various shared properties might be used to partition the lexicon, and various relations might be found to hold between many pairs of lexical items, but such properties and relations could not be regarded as conceptual atoms from which lexical items are built or to which they can be reduced. In this case, = would not hold between lexical items, although it might be a convenient metalinguistic relation between various properties, relations, or other theoretical statements.</Paragraph> <Paragraph position="6"> (Note again that no correlations would be expected between reaction times and &quot;complexity&quot;, since all individual lexical items are, presumably, equally complex.)
Although some theorists might be interpreted as embracing one or the other of these alternatives, it seems more plausible to regard them as upper and lower bounds.</Paragraph> <Paragraph position="7"> Complete decomposition is implausible in view of the difficulty lexicographers have in providing complete definitions; there is usually some residuum of meaning, often but not necessarily affective, that vitiates the equivalence relation. Complete individuality is inadequate to explain the rich and relatively consistent patterns of properties and relations that have been described.</Paragraph> <Paragraph position="8"> So, one is led to speculate about intermediate alternatives. For example: Suppose there were many lexical items having the characteristic that, whenever they occurred in a simple declarative statement, that statement's verification required the execution of some particular cognitive (perceptual or memorial) test. That test would, of course, partition the lexicon into those items that needed it vs. those that did not, as the individuality hypothesis suggests. One might go further, however, and argue that the need to perform this test and its acceptable outcome must be indicated explicitly in the information associated with those lexical items and so, in a real sense, it can be said to be &quot;incorporated&quot; into their meanings (Gruber, 1965).
The goal of analysis would be to determine which items incorporated it, or in short, to decompose such items into that test plus anything else required for verification.</Paragraph> <Paragraph position="9"> This program falls short of complete decomposition in that: (1) it is a decomposition of words into cognitive entities, like tests, rather than into other words; (2) the method of incorporation is left unspecified, but would surely be more complex than taking a Cartesian product; and (3) there is no guarantee that decomposition will be complete without introducing more cognitive entities than there are lexical items to be defined, i.e., the problem of the residuum is unresolved. But a sort of limited decomposition would be possible.</Paragraph> </Section> </Section> </Paper>