<?xml version="1.0" standalone="yes"?> <Paper uid="A88-1032"> <Title>Localizing Expression of Ambiguity</Title> <Section position="3" start_page="0" end_page="236" type="metho"> <SectionTitle> 2 Range of Phenomena </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="235" type="sub_section"> <SectionTitle> 2.1 Attachment Possibilities </SectionTitle> <Paragraph position="0"> There are three representative classes of attachment ambiguities, and we have implemented our approach :o each of these. For each class, we give representative examples and show the relevant logical form fragments that encode the set of possible attachments.</Paragraph> <Paragraph position="1"> In the first class are those constituents that may attach to either nouns or verbs.</Paragraph> <Paragraph position="2"> (3) John saw the man with the telescope.</Paragraph> <Paragraph position="3"> The prepositional phrase (PP) &quot;with the telescope&quot; can be attached either to &quot;the man&quot; or to &quot;saw&quot;. If m stands for the man, t for the telescope, and e for the seeing event, the neutral logical form for the sentence includes ... A with(st, t) A \[st = m V y = e\] A ...</Paragraph> <Paragraph position="4"> That is, something St is with the telescope, and it is either the man or the seeing event.</Paragraph> <Paragraph position="5"> Gerund modifiers may also modify nouns and verbs, resuiting in ambiguities like that in the sentence I saw the Grand Canyon, flying to New York.</Paragraph> <Paragraph position="6"> Their treatment is identical to that of PPs. If g is the Grand Canyon, n is New York, and e is the seeing event, the neutral logical form will include ... A fist(st, n) ^ \[st = g V St = e\] A ...</Paragraph> <Paragraph position="7"> That is, something St is flying to New York, and it is either the Grand Canyon or the seeing event.</Paragraph> <Paragraph position="8"> In the second class are those constituents that can only attach to verbs, such as adverbials.</Paragraph> <Paragraph position="9"> George said Sara left his wife yesterday.</Paragraph> <Paragraph position="10"> Here &quot;yesterday&quot; can modify the saying or the leaving but not &quot;his wife&quot;. Suppose we take yesterday to be a predicate that applies to events and specifies something about their times of occurrence, and suppose el is the leaving event and e2 the saying event. Then the neutral logical form will include ... ^ Ue~terdast(st) ^ \[st = el v U = e2\] ^ ...</Paragraph> <Paragraph position="11"> That is, something y was yesterday and it is either the leaving event or the saying event.</Paragraph> <Paragraph position="12"> Related to this is the case of a relative clause where the preposed constituent is a PP, which could have been extracted from any of several embedded clauses. In That was the week during which George thought Sam told his wife he was leaving, the thinking, the telling, or the leaving could have been during the week. Let w be the week, el the thinking, e2 the telling, and es the leaving. Then the neutral logical form will include ... 
∧ during(y, w) ∧ [y = e1 ∨ y = e2 ∨ y = e3] ∧ ...</Paragraph> <Paragraph position="13"> That is, something y was during the week, and y is either the thinking, the telling, or the leaving.</Paragraph> <Paragraph position="14"> The third class contains those constituents that may only attach to nouns, e.g., relative clauses.</Paragraph> <Paragraph position="15"> This component recycles the oil that flows through the compressor that is still good.</Paragraph> <Paragraph position="16"> The second relative clause, &quot;that is still good&quot;, can attach to &quot;compressor&quot; or &quot;oil&quot;, but not to &quot;flows&quot; or &quot;recycles&quot;. Let o be the oil and c the compressor. Then, ignoring &quot;still&quot;, the neutral logical form will include ... ∧ good(y) ∧ [y = c ∨ y = o] ∧ ...</Paragraph> <Paragraph position="17"> That is, something y is still good, and y is either the compressor or the oil.</Paragraph> <Paragraph position="18"> Similar to this are the compound nominal ambiguities, as in He inspected the oil filter element.</Paragraph> <Paragraph position="19"> &quot;Oil&quot; could modify either &quot;filter&quot; or &quot;element&quot;. Let o be the oil, f the filter, e the element, and nn the implicit relation that is encoded by the nominal compound construction. Then the neutral logical form will include ... ∧ nn(f, e) ∧ nn(o, y) ∧ [y = f ∨ y = e] ∧ ... That is, there is some implicit relation nn between the filter and the element, and there is another implicit relation nn between the oil and something y, where y is either the filter or the element.</Paragraph> <Paragraph position="20"> Our treatment of all of these types of ambiguity has been implemented.</Paragraph> <Paragraph position="21"> In fact, the distinction we base the attachment possibilities on is not that between nouns and verbs, but that between event variables and entity variables in the logical form. This means that we would generate logical forms encoding the attachment of adverbials to event nominalizations in those cases where the event nouns are translated with event variables. Thus in I read about Judith's promotion last year.</Paragraph> <Paragraph position="22"> &quot;last year&quot; would be taken as modifying either the promotion or the reading, if &quot;promotion&quot; were represented by an event variable in the logical form.</Paragraph> </Section>
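To make the shape of these neutral logical forms concrete, the sketch below shows one way the disjunctive encoding could be held programmatically, with candidacy decided by whether a site is an event or an entity variable. This is our own illustration under simplifying assumptions, not the DIALOGIC or TACITUS code; the names Var, candidates, and the kind labels are invented for the example.

from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str
    kind: str                    # "event" or "entity"

def candidates(sites, attaches_to):
    """attaches_to: 'either' (PPs, gerunds), 'event' (adverbials), 'entity' (relative clauses)."""
    if attaches_to == "either":
        return list(sites)
    return [v for v in sites if v.kind == attaches_to]

# Sentence (3): "John saw the man with the telescope."
e = Var("e", "event")            # the seeing event
m = Var("m", "entity")           # the man
t = Var("t", "entity")           # the telescope, the second argument of with(y, t)

# Neutral fragment: ... and with(y, t) and [y = m or y = e] and ...
pp_sites = candidates([m, e], "either")        # a PP may attach to nouns or verbs

# "George said Sara left his wife yesterday": adverbials attach only to events.
e1 = Var("e1", "event")          # the leaving
e2 = Var("e2", "event")          # the saying
w = Var("w", "entity")           # his wife
adv_sites = candidates([e1, e2, w], "event")   # [e1, e2]; "his wife" is excluded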
<Section position="2" start_page="235" end_page="236" type="sub_section"> <SectionTitle> 2.2 Single or Multiple Parse Trees </SectionTitle> <Paragraph position="0"> In addition to classifying attachment phenomena in terms of which kind of constituent something may attach to, there is another dimension along which we need to classify the phenomena: does the DIALOGIC parser produce all possible parses, or only one? For some regular structural ambiguities, such as compound nominals and the &quot;during which&quot; examples, only a single parse is produced. In this case it is straightforward to produce from the parse a neutral representation encoding all the possibilities. In the other cases, however, such as (nonpreposed) PPs, adverbials, and relative clauses, DIALOGIC produces an exhaustive (and sometimes exhausting) list of the different possible structures. This distinction is an artifact of our working in the DIALOGIC system. It would be preferable if there were only one tree constructed which was somehow neutral with respect to attachment. However, the DIALOGIC grammar is large and complex, and it would have been difficult to implement such an approach. Thus, in these cases, one of the parses, the one corresponding to right association [Kimball, 1973], is selected, and the neutral representation is generated from that. This makes it necessary to suppress redundant readings, as described below. (In fact, limited heuristics for suppressing multiple parse trees have recently been implemented in DIALOGIC.)</Paragraph> </Section> <Section position="3" start_page="236" end_page="236" type="sub_section"> <SectionTitle> 2.3 Thematic Role Ambiguities </SectionTitle> <Paragraph position="0"> Neutral representations are constructed for one other kind of ambiguity in the TACITUS system--ambiguities in the thematic role or case of the arguments. In the sentence It broke the window.</Paragraph> <Paragraph position="1"> we don't know whether &quot;it&quot; is the agent or the instrument. Suppose the predicate break takes three arguments, an agent, a patient, and an instrument, and suppose x is whatever is referred to by &quot;it&quot; and w is the window. Then the neutral logical form will include ... ∧ break(y1, w, y2) ∧ [y1 = x ∨ y2 = x] ∧ ...</Paragraph> <Paragraph position="2"> That is, something y1 breaks the window with something else y2, and either y1 or y2 is whatever is referred to by &quot;it&quot;.1</Paragraph> <Paragraph position="3"> 1The treatment of thematic role ambiguities has been implemented by Paul Martin as part of the interface between DIALOGIC and the pragmatic processes of TACITUS that translates the logical forms of the sentences into a canonical representation.</Paragraph> </Section>
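A minimal, self-contained sketch of how this thematic-role disjunction can be recorded and later expanded into its two readings. Again, this is illustrative Python of ours (the names lf, role_ambiguity, and readings are invented), not the interface Martin implemented.

# "It broke the window": the LF commits to break(y1, w, y2) and records only
# that x (whatever "it" refers to) fills one of the two open role slots.

x, w = "x", "w"                          # "it" and the window
lf = [("break", "y1", w, "y2")]          # agent slot y1, patient w, instrument slot y2
role_ambiguity = {x: ["y1", "y2"]}       # x is either the agent or the instrument

def readings(props, ambiguity):
    """Expand a single recorded ambiguity into the readings it stands for."""
    out = []
    for var, slots in ambiguity.items():
        for slot in slots:
            out.append([tuple(var if arg == slot else arg for arg in p) for p in props])
    return out

for r in readings(lf, role_ambiguity):
    print(r)
# [('break', 'x', 'w', 'y2')]   "it" as agent
# [('break', 'y1', 'w', 'x')]   "it" as instrument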
<Section position="4" start_page="236" end_page="236" type="sub_section"> <SectionTitle> 2.4 Ambiguities Not Handled </SectionTitle> <Paragraph position="0"> There are other types of structural ambiguity about which we have little to say. In They will win one day in Hawaii, one of the obvious readings is that &quot;one day in Hawaii&quot; is an adverbial phrase. However, another perfectly reasonable reading is that &quot;one day in Hawaii&quot; is the direct object of the verb &quot;win&quot;. This is due to the verb having more than one subcategorization frame that could be filled by the surrounding constituents. It is the existence of this kind of ambiguity that led to the approach of not having DIALOGIC try to build a single neutral representation in all cases. A neutral representation for such sentences, though possible, would be very complicated.</Paragraph> <Paragraph position="1"> Similarly, we do not attempt to produce neutral representations for fortuitous or unsystematic ambiguities such as those exhibited in sentences like They are flying planes.</Paragraph> <Paragraph position="2"> Time flies like an arrow.</Paragraph> <Paragraph position="3"> Becky saw her duck.</Paragraph> </Section> <Section position="5" start_page="236" end_page="236" type="sub_section"> <SectionTitle> 2.5 Resolving Ambiguities </SectionTitle> <Paragraph position="0"> It is beyond the scope of this paper to describe the pragmatics processing that is intended to resolve the ambiguities (see Hobbs and Martin, 1987). Nevertheless, we discuss one nontrivial example, just to give the reader a feel for the kind of processing it is. Consider the sentence We retained the filter element for future analysis.</Paragraph> <Paragraph position="1"> Let r be the retaining event, f the filter element, and a the analysis. Then the logical form for the sentence will include ... ∧ for(y, a) ∧ [y = f ∨ y = r] ∧ ...</Paragraph> <Paragraph position="2"> The predicate for, let us say, requires the relation enable(y, a) to obtain between its arguments. That is, if y is for a, then either y or something coercible from y must somehow enable a or something coercible from a. The TACITUS knowledge base contains axioms encoding the fact that having something is a prerequisite for analyzing it and the fact that a retaining is a having. y can thus be equal to r, which is consistent with the constraints on y.</Paragraph> <Paragraph position="3"> On the other hand, any inference that the filter element enables the analysis will be much less direct, and consequently will not be chosen.</Paragraph> </Section> </Section> <Section position="4" start_page="236" end_page="237" type="metho"> <SectionTitle> 3 The Algorithm </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="236" end_page="237" type="sub_section"> <SectionTitle> 3.1 Finding Attachment Sites </SectionTitle> <Paragraph position="0"> The logical forms (LFs) that are produced from each of the parse trees are given to an attachment-finding program which adds, or makes explicit, information about possible attachment sites. Where this makes some LFs redundant, as in the prepositional phrase case, the redundant LFs are then eliminated.</Paragraph> <Paragraph position="1"> For instance, for the sentence in (4), (4) John saw the man in the park with the telescope.</Paragraph> <Paragraph position="2"> DIALOGIC produces five parse trees, and five corresponding logical forms. When the attachment-finding routine is run on an LF, it annotates the LF with information about a set of variables that might be the subject (i.e., the attachment site) of each PP.</Paragraph> <Paragraph position="3"> The example below shows the LFs for one of the five readings before and after the attachment-finding routine is run on it. They are somewhat simplified for the purposes of exposition. In this notation, a proposition is a predicate followed by one or more arguments. An argument is a variable or a complex term. A complex term is a variable followed by a &quot;such that&quot; symbol &quot;|&quot;, followed by a conjunction of one or more propositions.2 Complex terms are enclosed in square brackets for readability. Events are represented by event variables, as in [Hobbs, 1985], so that see'(e1, x1, x2) means e1 is a seeing event by x1 of x2.</Paragraph> <Paragraph position="4"> 2This notation can be translated into a Russellian notation, with the consequent loss of information about grammatical subordination, by repeated application of the transformation p(x | Q) ⇒ p(x) ∧ Q.</Paragraph> <Paragraph position="5"> One of sentence (4)'s LFs before attachment-finding is past([e1 | see'(e1, [x1 | John(x1)], [x2 | man(x2) ∧ in(x2, [x3 | park(x3) ∧ with(x3, [x4 | telescope(x4)])])])])</Paragraph> <Paragraph position="6"> The same LF after attachment-finding is past([e1 | see'(e1, [x1 | John(x1)], [x2 | man(x2)]) ∧ in([y1 | y1 = x2 ∨ y1 = e1], [x3 | park(x3)]) ∧ with([y2 | y2 = x3 ∨ y2 = x2 ∨ y2 = e1], [x4 | telescope(x4)])])</Paragraph> <Paragraph position="7"> A paraphrase of the latter LF in English would be something like this: There is an event e1 that happened in the past; it is a seeing event by x1 who is John, of x2 who is the man; something y1 is in the park, and that something is either the man or the seeing event; something y2 is with a telescope, and that something is the park, the man, or the seeing event.</Paragraph> <Paragraph position="8"> The procedure for finding possible attachment sites in order to modify a logical form is as follows. The program recursively descends an LF, and keeps lists of the event and entity variables that initiate complex terms. Event variables associated with tenses are omitted. When the program arrives at some part of the LF that can have multiple attachment sites, it replaces the explicit argument by an existentially quantified variable y, determines whether it can be an event variable, an entity variable, or either, and then encodes the list of possibilities for what y could equal.</Paragraph> </Section>
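The sketch below illustrates this descent in a simplified, flattened form. It is our own reconstruction under stated assumptions (a fixed set of floating modifiers, candidate lists taken from all variables seen so far), not the actual TACITUS/DIALOGIC routine.

FLOATING = {"in", "with", "yesterday"}     # assumed: predicates whose subject may reattach

def find_attachments(props):
    seen = []            # event/entity variables encountered so far, in order
    out = []             # annotated propositions
    candidates = {}      # y variable -> variables it could equal
    n = 0
    for pred, first, *rest in props:
        if pred in FLOATING:
            n += 1
            y = "y%d" % n
            candidates[y] = list(seen)      # the real program further restricts this list,
            out.append((pred, y, *rest))    # e.g. omitting tense event variables
        else:
            out.append((pred, first, *rest))
        for v in (first, *rest):
            if v not in seen:
                seen.append(v)              # record variables as they are encountered
    return out, candidates

# Sentence (4), flattened: "John saw the man in the park with the telescope."
lf = [("see'", "e1", "x1", "x2"), ("John", "x1"), ("man", "x2"),
      ("in", "x2", "x3"), ("park", "x3"), ("with", "x3", "x4"), ("telescope", "x4")]
annotated, cands = find_attachments(lf)
# cands["y1"] holds the variables seen before "in"; cands["y2"] those seen before "with"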
<Section position="2" start_page="237" end_page="237" type="sub_section"> <SectionTitle> 3.2 Eliminating Redundant Logical Forms </SectionTitle> <Paragraph position="0"> In those cases where more than one parse tree, and hence more than one logical form, is produced by DIALOGIC, it is necessary to eliminate redundant readings. In order to do this, once the attachment possibilities are registered, the LFs are flattened (thus losing temporarily the grammatical subordination information), and some simplifying preprocessing is done. Each of the flattened LFs is compared with the others. Any LF that is subsumed by another is discarded as redundant. One LF subsumes another if the two LFs are the same except that the first has a list of possible attachment sites that includes the corresponding list in the second. For example, one LF for sentence (3) says that &quot;with the telescope&quot; can modify either &quot;saw&quot; or &quot;the man&quot;, and one says that it modifies &quot;saw&quot;. The first LF subsumes the second, and the second is discarded and not compared with any other LFs. Thus, although the LFs are compared pairwise, if all of the ambiguity is due to only one attachment indeterminacy, each LF is looked at only once.</Paragraph> <Paragraph position="1"> Frequently, only some of the alternatives may be thrown out. For Andy said he lost yesterday, after attachment-finding one logical form allows &quot;yesterday&quot; to be attached to either the saying or the losing, while another attaches it only to the saying. The second is subsumed by the first, and thus discarded. However, there is a third reading in which &quot;yesterday&quot; is the direct object of &quot;lost&quot;, and this neither subsumes nor is subsumed by the others and is retained.</Paragraph> </Section> </Section>
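A compact sketch of this subsumption test, written against the flattened (propositions, candidate lists) form used in the earlier sketch. It is our own illustration, with invented function names, and it glosses over the simplifying preprocessing the text mentions.

def subsumes(lf_a, lf_b):
    """lf_a subsumes lf_b if they share the same propositions and every candidate
    list in lf_b is contained in the corresponding candidate list of lf_a."""
    props_a, cands_a = lf_a
    props_b, cands_b = lf_b
    if props_a != props_b or cands_a.keys() != cands_b.keys():
        return False
    return all(set(cands_b[y]) <= set(cands_a[y]) for y in cands_a)

def drop_redundant(lfs):
    kept = []
    for lf in lfs:
        if any(subsumes(k, lf) for k in kept):
            continue                                     # subsumed: discard, never revisit
        kept = [k for k in kept if not subsumes(lf, k)]  # lf may subsume earlier survivors
        kept.append(lf)
    return kept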
<Section position="5" start_page="237" end_page="238" type="metho"> <SectionTitle> 4 Lost Information </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="237" end_page="238" type="sub_section"> <SectionTitle> 4.1 Crossing Dependencies </SectionTitle> <Paragraph position="0"> Our attachment-finding routine constructs a logical form that describes all of the standard readings of a sentence, but it also describes some nonstandard readings, namely those corresponding to parse trees with crossing branches, or crossing dependencies. An example would be a reading of (4) in which the seeing was in the park and the man was with the telescope.</Paragraph> <Paragraph position="1"> For small numbers of possible attachment sites, this is an acceptable result. If a sentence is two-ways ambiguous (due just to attachment), we get no wrong readings. If it is five-ways ambiguous on the standard analysis, we get six readings. However, in a sentence with a sequence of four PPs, the standard analysis (and the DIALOGIC parser) get 42 readings, whereas our single disjunctive LF stands for 120 different readings.</Paragraph> <Paragraph position="2"> Two things can be said about what to do in these cases where the two approaches diverge widely. We could argue that sentences with such crossing dependencies do exist in English. There are some plausible-sounding examples.</Paragraph> <Paragraph position="3"> Specify the length, in bytes, of the word.</Paragraph> <Paragraph position="4"> Kate saw a man on Sunday with a wooden leg.</Paragraph> <Paragraph position="5"> In the first, the phrase &quot;in bytes&quot; modifies &quot;specify&quot;, and &quot;of the word&quot; modifies &quot;the length&quot;. In the second, &quot;on Sunday&quot; modifies &quot;saw&quot; and &quot;with a wooden leg&quot; modifies &quot;a man&quot;. Stucky [1987] argues that such examples are acceptable and quite frequent.</Paragraph> <Paragraph position="6"> On the other hand, if one feels that these putative examples of crossing dependencies can be explained away and should be ruled out, there is a way to do it within our framework. One can encode in the LFs a crossing-dependencies constraint, and consult that constraint when doing the pragmatic processing.</Paragraph> <Paragraph position="7"> To handle the crossing-dependencies constraint (which we have not yet implemented), the program would need to keep the list of the logical variables it constructs. This list would contain three kinds of variables: event variables, entity variables, and the special variables (the y's in the LFs above) representing attachment ambiguities. The list would keep track of the order in which variables were encountered in descending the LF. A separate list of just the special y variables also needs to be kept. The strategy would be that in trying to resolve referents, whenever one tries to instantiate a y variable to something, the other y variables need to be checked, in accordance with the following constraint: There cannot be y1, y2 in the list of y's such that B(y1) < B(y2) < y1 < y2, where B(yi) is the proposed variable to which yi will be bound or with which it will be coreferential, and the < operator means &quot;precedes in the list of variables&quot;. This constraint handles a single phrase that has attachment ambiguities. It also works in the case where there is a string of PPs in the subject NP, and then a string of PPs in the object NP, as in The man with the telescope in the park lounged on the bank of a river in the sun.</Paragraph> <Paragraph position="8"> With the appropriate crossing-dependency constraints, the logical form for this would be3</Paragraph> <Paragraph position="10"> 3We are assuming &quot;with the telescope&quot; and &quot;in the park&quot; can modify the lounging, which they certainly can if we place commas before and after them.</Paragraph> </Section>
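The constraint lends itself to a direct check. The following is a hypothetical sketch of how the proposed (and, as noted in the text, unimplemented) test could be consulted during reference resolution; the variable ordering and the helper name are assumptions of ours.

def crosses(order, bindings):
    """order: all variables in the order encountered while descending the LF;
    bindings: proposed antecedent B(y) for each y variable bound so far."""
    pos = {v: i for i, v in enumerate(order)}
    ys = [y for y in order if y in bindings]
    for i, y1 in enumerate(ys):
        for y2 in ys[i + 1:]:                  # y1 precedes y2 in the variable list
            if pos[bindings[y1]] < pos[bindings[y2]] < pos[y1] < pos[y2]:
                return True                    # B(y1) < B(y2) < y1 < y2: a crossing
    return False

# Sentence (4): e1 (seeing), x1 (John), x2 (man), y1 (in ...), x3 (park), y2 (with ...), x4 (telescope)
order = ["e1", "x1", "x2", "y1", "x3", "y2", "x4"]
print(crosses(order, {"y1": "e1", "y2": "x2"}))   # True: seeing in the park, man with the telescope
print(crosses(order, {"y1": "x2", "y2": "e1"}))   # False: man in the park, seeing with the telescope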
<Section position="2" start_page="238" end_page="238" type="sub_section"> <SectionTitle> 4.2 Noncoreference Constraints </SectionTitle> <Paragraph position="0"> One kind of information that is provided by the DIALOGIC system is information about coreference and non-coreference insofar as it can be determined from syntactic structure. Thus, the logical form for John saw him.</Paragraph> <Paragraph position="1"> includes the information that &quot;John&quot; and &quot;him&quot; cannot be coreferential. This interacts with our localization of attachment ambiguity. Consider the sentence John returned Bill's gift to him.</Paragraph> <Paragraph position="2"> If we attach &quot;to him&quot; to &quot;gift&quot;, &quot;him&quot; can be coreferential with &quot;John&quot; but it cannot be coreferential with &quot;Bill&quot;. If we attach it to &quot;returned&quot;, &quot;him&quot; can be coreferential with &quot;Bill&quot; but not with &quot;John&quot;. It is therefore not enough to say that the &quot;subject&quot; of &quot;to&quot; is either the gift or the returning. Each alternative carries its own noncoreference constraints with it. We do not have an elegant solution to this problem. We mention it because, to our knowledge, this interaction of noncoreference constraints and PP attachment has not been noticed by other researchers taking similar approaches.</Paragraph> </Section> </Section> <Section position="6" start_page="238" end_page="239" type="metho"> <SectionTitle> 5 A Note on Literal Meaning </SectionTitle> <Paragraph position="0"> There is an objection one could make to our whole approach. If our logical forms are taken to be a representation of the &quot;literal meaning&quot; of the sentence, then we would seem to be making the claim that the literal meaning of sentence (2) is &quot;Using a telescope, John saw a man, or John saw a man who had a telescope,&quot; whereas the real situation is that either the literal meaning is &quot;Using a telescope, John saw a man,&quot; or the literal meaning is &quot;John saw a man who had a telescope.&quot; The disjunction occurs in the metalanguage, whereas we may seem to be claiming it is in the language.</Paragraph> <Paragraph position="1"> This objection rests on a misunderstanding: the logical form is not intended to represent &quot;literal meaning&quot;. There is no general agreement on precisely what constitutes &quot;literal meaning&quot;, or even whether it is a coherent notion. In any case, few would argue that the meaning of a sentence could be determined on the basis of syntactic information alone. The logical forms produced by the DIALOGIC system are simply intended to encode all of the information that syntactic processing can extract about the sentence. Sometimes the best we can come up with in this phase of the processing is disjunctive information about attachment sites, and that is what the LF records.</Paragraph> </Section> </Paper>