<?xml version="1.0" standalone="yes"?> <Paper uid="P97-1012"> <Title>Expectations in Incremental Discourse Processing</Title> <Section position="4" start_page="88" end_page="90" type="metho"> <SectionTitle> 2 Expectations in Corpora </SectionTitle> <Paragraph position="0"> The examples given in the Introduction were all &quot;minimal pairs&quot; created to illustrate the relevant phenomenon as succinctly as possible. Empirical questions thus include: (1) the range of lexico-syntactic constructions that raise expectations with the specific properties mentioned above; (2) the frequency of expectation-raising constructions in text; (3) the frequency with which expectations are satisfied immediately, as opposed to being delayed by material that elaborates the unit raising the expectation; (4) the frequency of embedded expectations; and (5) features that provide evidence for an expectation being satisfied.</Paragraph> <Paragraph position="1"> While we do not have answers to all these questions, a very preliminary analysis of the Brown Corpus, a corpus of approximately 1600 email messages, and a short Romanian text by T. Vianu (approx.</Paragraph> <Paragraph position="2"> 5000 words) has yielded some interesting results.</Paragraph> <Paragraph position="3"> First, reviewing the 270 constructions that Knott has identified as potential cue phrases in the Brown Corpus 1, one finds 15 adverbial phrases (such as &quot;initially&quot;, &quot;at first&quot;, &quot;to start with&quot;, etc.) whose presence in a clause would lead to an expectation being raised. All left-extraposed clauses in English raise expectations (as in Example 4) so all the subordinate conjunctions in Knott's list would be included as well. 
Outside of cue phrases, we have identified imperative forms of &quot;suppose&quot; and &quot;consider&quot; as raising expectations, but currently lack a more systematic procedure for identifying expectation-raising constructions in text than hand-combing text for them.</Paragraph> <Paragraph position="4"> With respect to how often expectation-raising constructions appear in text, we have Brown Corpus data on two specific types - imperative &quot;suppose&quot; and adverbial &quot;on the one hand&quot; - as well as a detailed analysis of the Romanian text by Vianu mentioned earlier.</Paragraph> <Paragraph position="5"> There are approximately 54K sentences in the Brown Corpus. Of these, 37 contain imperative &quot;suppose&quot; or &quot;let us suppose&quot;. Twelve of these correspond to &quot;what if&quot; questions or negotiation moves which do not raise expectations: Suppose - just suppose this guy was really what he said he was! A retired professional killer If he was just a nut, no harm was done. But if he was the real thing, he could do something about Lolly. (cl23) Alec leaned on the desk, holding the clerk's eyes with his. &quot;Suppose you tell me the real reason&quot;, he drawled. &quot;There might be a story in it&quot;. (cl21) The remaining 25 sentences constitute only about 0.05% of the Brown Corpus. Of these, 22 have their expectations satisfied immediately (88%) - for example, Suppose John Jones, who, for 1960, filed on the basis of a calendar year, died June 20, 1961. His return for the period January 1 to June 20, 1961, is due April 16, 1962.</Paragraph> <Paragraph position="6"> One is followed by a single sentence elaborating the original supposition (also flagged by &quot;suppose&quot;): &quot;Suppose it was not us that killed these aliens. Suppose it is something right on the planet, native to it. I just hope it doesn't work on Earthmen too. These critters went real sudden&quot;. 
(cmO~) while the remaining two contain multi-sentence elaborations of the original supposition. None of the examples in the Brown Corpus contains an embedded expectation.</Paragraph> <Paragraph position="7"> The adverbial &quot;on the one hand&quot; is used to pose a contrast either phrasally: Both plans also prohibited common directors, officers, or employees between Du Pont, Christiana, and Delaware, on the one hand, and General Motors on the other. (ch16) You couldn't on the one hand decry the arts and at the same time practice them, could you? (ck08) or clausally. Only the clausal uses are of interest from the point of view of discourse expectations.</Paragraph> <Paragraph position="8"> The Brown Corpus contains only 7 examples of adverbial &quot;on the one hand&quot;. In three cases, the expectation is satisfied immediately by a clause cued by &quot;but&quot; or &quot;or&quot; - e.g.</Paragraph> <Paragraph position="9"> On the one hand, the Public Health Service declared as recently as October 26 that present radiation levels resulting from the Soviet shots &quot;do not warrant undue public concern&quot; or any action to limit the intake of radioactive substances by individuals or large population groups anywhere in the Aj. But the PHS conceded that the new radioactive particles &quot;will add to the risk of genetic effects in succeeding generations, and possibly to the risk of health damage to some people in the United States&quot;. (cb21) In the remaining four cases, satisfaction of the expectation (the &quot;target&quot; contrast item) is delayed by 2-3 sentences elaborating the &quot;source&quot; contrast item - e.g.</Paragraph> <Paragraph position="10"> Brooklyn College students have an ambivalent attitude toward their school. On the one hand, there is a sense of not having moved beyond the ambiance of their high school. This is particularly acute for those who attended Midwood High School directly across the street from Brooklyn College. 
They have a sense of marginality at being denied that special badge of status, the out-of-town school. At the same time, there is a good deal of self-congratulation at attending a good college ... (cf25) In these four cases, the target contrast item is cued by &quot;on the other hand&quot; three times and by &quot;at the same time&quot; in the case given above. Again, none of the examples contains an embedded expectation.</Paragraph> <Paragraph position="11"> (The much smaller email corpus contained six examples of clausal &quot;on the one hand&quot;, with the target contrast cued by &quot;on the other hand&quot;, &quot;on the other&quot; or &quot;at the other extreme&quot;. In one case, there was no explicit target contrast and the expectation raised by &quot;on the one hand&quot; was never satisfied. We will continue to monitor for such examples.) Before concluding with a close analysis of the Romanian text, we should note that in both the Brown Corpus and the email corpus, clausal adverbial &quot;on the other hand&quot; occurs more frequently without an expectation-raising &quot;on the one hand&quot; than it does with one. (Our attention was called to this by a frequency analysis of potential cue phrase instances in the Brown Corpus compiled for us by Alistair Knott and Andrei Mikheev, HCRC, University of Edinburgh.) We found 53 instances of clausal &quot;on the other hand&quot; occurring without an explicit source contrast cued earlier. Although one can only speculate now on the reason for this phenomenon, it does make a difference to incremental analysis, as we try to show in Section 3.3.</Paragraph> <Paragraph position="12"> The Romanian text that has been closely analysed for explicit expectation-raising constructions is T. Vianu's Aesthetics. It contains 5160 words and 382 discourse units (primarily clauses). 
Counting preposed gerunds as raising expectations, together with the constructions noted previously, we identified 39 expectation-raising discourse units (10.2%). In 11 of these cases, 1-16 discourse units intervened before the raised expectation was satisfied. One example follows: Dar deși trebuie să-l parcurgem în întregime, pentru a orienta cercetarea este nevoie să încercăm încă de pe acum o precizare a obiectului lui.</Paragraph> <Paragraph position="13"> (But although we must cover it entirely, in order to guide the research we need to try already an explanation of its subject matter.)</Paragraph> </Section> <Section position="5" start_page="90" end_page="93" type="metho"> <SectionTitle> 3 A Grammar for Discourse </SectionTitle> <Paragraph position="0"> The intuitive appeal of Tree-adjoining Grammar (TAG) (Joshi, 1987) for discourse processing (Gardent, 1997; Polanyi and van den Berg, 1996; Schilder, 1997; van den Berg, 1996; Webber, 1991) follows from the fact that TAG's adjoining operation allows one to directly analyse the current discourse unit as a sister to previous discourse material that it stands in a particular relation to. The new intuition presented here - that expectations convey a dependency between the current discourse unit and future discourse material, a dependency that can be &quot;stretched&quot; long-distance by intervening material - more fully exploits TAG's ability to express dependencies. 
By expressing, in an elementary TAG tree, a dependency between the current discourse unit and future discourse material, and by using substitution (Schabes, 1990) when the expected material is found, our TAG-based approach to discourse processing allows expectations to be both raised and resolved.</Paragraph> <Section position="1" start_page="90" end_page="90" type="sub_section"> <SectionTitle> 3.1 Categories and Operations </SectionTitle> <Paragraph position="0"> The categories of our TAG-based approach consist of nodes and binary trees. We follow (Gardent, 1997) in associating nodes with feature structures that may hold various sorts of information, including information about the semantic interpretations projected through the nodes, constraints on the specific operations a node may participate in, etc. A non-terminal node represents a discourse relation holding between its two daughter nodes. A terminal node can be either non-empty (Figure 1a), corresponding to a basic discourse unit (usually a clause), or empty.</Paragraph> <Paragraph position="1"> A node is &quot;empty&quot; only in not having an associated discourse unit or relation: it can still have an associated feature structure. Empty nodes play a role in adjoining and substitution, as explained below, and hence in building the derived binary tree that represents the structure of the discourse.</Paragraph> <Paragraph position="2"> Adjoining adds to the discourse structure an auxiliary tree consisting of a root labelled with a discourse relation, an empty foot node (labelled *), and at least one non-empty node (Figures 1c and 1d). In our approach, the foot node of an auxiliary tree must be its leftmost terminal because all adjoining operations take place on a suitably defined right frontier (i.e., the path from the root of a tree to its rightmost leaf node) - such that all newly introduced material lies to the right of the adjunction site. (This is discussed in Section 3.2 in more detail.) 
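As a minimal sketch of the right-frontier restriction just described, the following Python fragment models a binary discourse tree and permits adjoining only at nodes on the right frontier. The names Node, right_frontier, adjoin and leaves are illustrative assumptions rather than anything defined in this paper, and empty nodes, substitution sites and feature structures are deliberately left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical node type: a leaf holds a discourse unit in `label`;
    # an internal node holds a discourse relation and two daughters.
    label: str
    left: Optional['Node'] = None
    right: Optional['Node'] = None


def right_frontier(root: Node) -> List[Node]:
    # Path from the root down to the rightmost leaf.
    path = [root]
    while path[-1].right is not None:
        path.append(path[-1].right)
    return path


def adjoin(root: Node, site: Node, relation: str, unit: str) -> Node:
    # Adjoin an auxiliary tree (relation root, foot node, new unit) at
    # `site`. The old subtree at `site` fills the leftmost (foot) position,
    # so the new unit ends up to the right of everything already analysed.
    frontier = right_frontier(root)
    for depth, node in enumerate(frontier):
        if node is site:
            aux = Node(relation, left=site, right=Node(unit))
            if depth == 0:
                return aux                      # adjoining at the root
            frontier[depth - 1].right = aux     # re-attach under the parent
            return root
    raise ValueError('adjunction site is not on the right frontier')


def leaves(node: Node) -> List[str]:
    # Left-to-right reading of the terminal frontier.
    if node.left is None and node.right is None:
        return [node.label]
    return leaves(node.left) + leaves(node.right)


tree = Node('R1', Node('u1'), Node('u2'))
tree = adjoin(tree, tree.right, 'R2', 'u3')   # midway down the frontier
tree = adjoin(tree, tree, 'R3', 'u4')         # at the root
print(leaves(tree))                           # ['u1', 'u2', 'u3', 'u4']
```

In this sketch the terminal frontier always reads in the order in which the units were added, whichever right-frontier node is chosen as the adjunction site, whereas a site off the frontier is rejected.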
Adjoining corresponds to identifying a discourse relation between the new material and material in the previous discourse that is still open for elaboration.</Paragraph> <Paragraph position="3"> Figure 2(a) illustrates adjoining midway down the RF of tree a, while Figure 2(b) illustrates adjoining at the root of a's RF. Figure 2(c) shows adjoining at the &quot;degenerate&quot; case of a tree that consists only of its root. Figure 2(d) will be explained shortly.</Paragraph> <Paragraph position="4"> Substitution unifies the root of a substitution structure with an empty node in the discourse tree that serves as a substitution site. We currently use two kinds of substitution structures: non-empty nodes (Figure 1a) and elementary trees with substitution sites (Figure 1b). The latter are one way by which a substitution site may be introduced into a tree. As will be argued shortly, substitution sites can only appear on the right of an elementary tree, although any number of them may appear there (Figure 1b). Figure 2(e) illustrates substitution of a non-empty node at ↓, and Figure 2(f) illustrates substitution of an elementary tree with its own substitution site at ↓1. Since in a clause with two discourse markers (as in Example 3b) one may look backwards (&quot;for example&quot;) while the other looks forwards (&quot;suppose&quot;), we also need a way of introducing expectations in the context of adjoining. This we do by allowing an auxiliary tree to contain substitution sites (Figure 1d) which, as above, can only appear on its right. 2 Another term we use for auxiliary trees is adjoining structures.</Paragraph> </Section> <Section position="2" start_page="90" end_page="90" type="sub_section"> <SectionTitle> 3.2 Constraints </SectionTitle> <Paragraph position="0"> Earlier we noted that in a discourse structure with no substitution sites, adjoining is limited to the right frontier (RF). 
This is true of all existing TAG-based approaches to discourse processing (Gardent, 1997; Hinrichs and Polanyi, 1986; Polanyi and van den Berg, 1996; Schilder, 1997; Webber, 1991), whose structures correspond to trees that lack substitution sites. One reason for this RF restriction is to maintain a strict correspondence between a left-to-right reading of the terminal nodes of a discourse structure and the text it analyses - i.e., Principle of Sequentiality: A left-to-right reading of the terminal frontier of the tree associated with a discourse must correspond to the span of text it analyses in that same left-to-right order.</Paragraph> <Paragraph position="1"> Formal proof that this principle leads to the restriction of adjoining to the right frontier is given in (Cristea and Webber, June 1997).</Paragraph> <Paragraph position="2"> The Principle of Sequentiality leads to additional constraints on where adjoining and substitution can occur in trees with substitution sites. Consider the tree in Figure 3(i), which has two such sites, and an adjoining operation on the right frontier at node Rj or above. Figure 3(ii) shows that this would introduce a non-empty node (uk) above and to the right of the substitution sites. This would mean that later substitution at either of them would lead to a violation of the Principle of Sequentiality, since the newly substituted node uk+1 would then appear to the left of uk in the terminal frontier, but to the right of it in the original discourse. (Footnote 2: We currently have no linguistic evidence for the structure labelled ~ in Figure 1d, but are open to its possibility.)</Paragraph> <Paragraph position="3"> Adjoining at any node above Rj+2 - the left sister of the most deeply embedded substitution site - leads to the same problem (Figure 3iii). Thus in a tree with substitution sites, adjoining must be limited to nodes on the path from the left sister of the most embedded site to that sister's rightmost descendant. 
But this is just a right frontier (RF) rooted at that left sister. Thus, adjoining is always limited to an RF: the presence of a substitution site just changes what node that RF is rooted at. We call an RF rooted at the left sister of the most embedded substitution site the inner right frontier or &quot;inner_RF&quot;. (In Figure 3(i), the inner_RF is indicated by a dashed arrow.) In contrast, we sometimes call the RF of a tree without substitution sites the outer right frontier or &quot;outer_RF&quot;. Figure 2(d) illustrates adjoining on the inner_RF of a, a tree with a substitution site labelled ↓.</Paragraph> <Paragraph position="4"> Another consequence of the Principle of Sequentiality is that the only node at which substitution is allowed in a tree with substitution sites is the most embedded one. Any other substitution would violate the principle. (Formal proofs of these claims are given in (Cristea and Webber, June 1997).)</Paragraph> </Section> <Section position="3" start_page="90" end_page="93" type="sub_section"> <SectionTitle> 3.3 Examples </SectionTitle> <Paragraph position="0"> Because we have not yet implemented a parser that embodies the ideas presented so far, we give here an idealized analysis of Examples 2 and 3, to show how an ideal incremental monotonic algorithm that admitted expectations would work.</Paragraph> <Paragraph position="1"> Figure 4A illustrates the incremental analysis of Example 2. Figure 4A(i) shows the elementary tree corresponding to sentence 2a (&quot;On the one hand ...&quot;): the interpretation of &quot;John is very generous&quot; corresponds to the left daughter labelled &quot;a&quot;. The adverbial &quot;On the one hand&quot; is taken as signalling a coherence relation of Contrast with something expected later in the discourse.</Paragraph> <Paragraph position="2"> In sentence 2b (&quot;On the other hand, suppose ...&quot;), the adverbial &quot;On the other hand&quot; signals the expected contrast item. 
Because it is already expected, the adverbial does not lead to the creation of a separate elementary tree (but see the next example). The imperative verb &quot;suppose&quot;, however, signals a coherence relation of antecedent/consequent (A/C) with a consequence expected later in the discourse. The elementary tree corresponding to &quot;suppose ...&quot; is shown in Figure 4A(ii), with the interpretation of &quot;you need money&quot; corresponding to the left daughter labelled &quot;b&quot;. Figure 4A(iii) shows this elementary tree substituted at ↓1, satisfying that expectation. Figure 4A(iv) shows the interpretation of sentence 2c (&quot;You'd see he's very difficult to find&quot;) substituted at ↓2, satisfying that remaining expectation.</Paragraph> <Paragraph position="3"> Before moving on to Example 3, notice that if Sentence 2a were not explicitly cued with &quot;On the one hand&quot;, the analysis would proceed somewhat differently. Example 5 a. John is very generous.</Paragraph> <Paragraph position="4"> b. On the other hand, suppose you needed money.</Paragraph> <Paragraph position="5"> c. You'd see that he's very difficult to find.</Paragraph> <Paragraph position="6"> Here, the interpretation of sentence 5(a) would correspond to the degenerate case of a tree consisting of a single non-empty node shown in Figure 4B(i). The contrast introduced by &quot;On the other hand&quot; in sentence 5(b) leads to the auxiliary tree shown in Figure 4B(ii), where T stands for the elementary tree corresponding to the interpretation of &quot;suppose...&quot;. The entire structure associated with sentence 5(b) is shown in Figure 4B(iii). This is adjoined to the single node tree in Figure 4B(i), yielding the tree shown in Figure 4B(iv). The analysis then continues exactly as in that of Example 2 above.</Paragraph> <Paragraph position="7"> Moving on to Example 3, Figure 4C(i) shows the same elementary tree as in Figure 4A(i) corresponding to clause 3a. 
Next, Figure 4C(ii) shows the auxiliary tree with substitution site ↓2 corresponding to clause 3b being adjoined as a sister to the interpretation of clause 3a, as evidence for the claim made there. The right daughter of the node labelled &quot;Evidence&quot; is, as in Example 2b, an elementary tree expecting the consequence of the supposition &quot;you need money&quot;. Figure 4C(iii) shows the interpretation of clause 3c substituted at ↓2, satisfying that expectation. Finally, Figure 4C(iv) shows the interpretation of clause 3d substituted at ↓1, satisfying the remaining expectation.</Paragraph> </Section> </Section> <Section position="6" start_page="93" end_page="94" type="metho"> <SectionTitle> 4 Sources of Uncertainty </SectionTitle> <Paragraph position="0"> The idealized analysis presented above could lead to a simple deterministic incremental algorithm, if there were no uncertainty due to local or global ambiguity. But there is. We can identify three separate sources of uncertainty that would affect incremental processing according to the grammar just presented: * the identity of the discourse relation that is meant to hold between two discourse units; * the operation (adjoining or substitution) to be used in adding one discourse unit onto another; * if that operation is adjoining, the site in the target unit at which the operation should take place - that is, the other argument to the discourse relation associated with the root of the auxiliary tree.</Paragraph> <Paragraph position="1"> It may not be obvious that there could be uncertainty as to whether the current discourse unit satisfies an expectation and therefore substitutes into the discourse structure, or elaborates something in the previous discourse, and therefore adjoins into it. 3 But the evidence clarifying this local ambiguity may not be available until later in the discourse. 
In the following variation of Example 4, the fact that clause (b) participates in elaborating the interpretation of clause (a) rather than in satisfying the expectation it raises (which it does in Example 4) may not be unambiguously clear until the discourse marker &quot;for example&quot; in clause (c) is processed.</Paragraph> <Paragraph position="2"> Example 6 a. Because John is such a generous man b. whenever he is asked for money, c. he will give whatever he has, for example d. he deserves the &quot;Citizen of the Year&quot; award. The other point is that, even if a forward-looking cue phrase signals only a substitution structure as in Figure 4A(i) and 4A(ii), if there are no pending substitution sites such as ↓1 in 4A(i) against which to unify such a structure, then the substitution structure must be coerced to an auxiliary tree as in Figure 1d (with some as yet unspecified cohesion relation) in order to adjoin it somewhere in the current discourse structure.</Paragraph> </Section> </Paper>