<?xml version="1.0" standalone="yes"?> <Paper uid="J97-3006"> <Title>Current Theories of Centering for Pronoun Interpretation: A Critical Evaluation</Title> <Section position="3" start_page="0" end_page="469" type="metho"> <SectionTitle> 2. Overview of Centering </SectionTitle> <Paragraph position="0"> Centering theory is motivated by two related facts about language that are not explained by purely content-based models of reference and coherence (cf. Hobbs [1979]): (1) the coherence of a discourse depends not only on semantic content but also on the type of referring expressions used, and (2) garden path effects exist, in which pronouns appear to be resolved before adequate semantic information has become available: "Pronouns and definite descriptions are not equivalent with respect to their effect on coherence. We conjecture that this is so because they engender different inferences on the part of a hearer or reader. In the most pronounced cases, the wrong choice will mislead a hearer and force backtracking to a correct interpretation." (Grosz, Joshi, and Weinstein 1995, p. 207) [Footnote *: Artificial Intelligence Center, 333 Ravenswood Avenue, Menlo Park, CA 94025. E-mail: kehler@ai.sri.com] [Footnote 1: A draft of GJW, which revised and expanded ideas presented in Grosz, Joshi, and Weinstein (1983), was circulated as far back as 1986. Therefore some of the works described here as extending the work contained therein predate the published version.] [© 1997 Association for Computational Linguistics. Computational Linguistics, Volume 23, Number 3.]</Paragraph> <Paragraph position="1"> GJW exemplify the first of these motivations with passages (1) and (2). Passage (1) is presumed to be in a longer segment that is currently centered on John.</Paragraph> <Paragraph position="2"> (1) a. He has been acting quite odd. (He=John) b. He called up Mike yesterday.</Paragraph> <Paragraph position="3"> c. 
John wanted to meet him quite urgently.</Paragraph> <Paragraph position="4"> The third sentence in this passage is quite odd, presumably because the more central element (John) is not referred to with a pronoun whereas the less central element (Mike) is. This passage can be compared to the similar passage in (2).</Paragraph> <Paragraph position="5"> (2) a. He has been acting quite odd. (He=John) b. He called up Mike yesterday.</Paragraph> <Paragraph position="6"> c. He wanted to meet him quite urgently.</Paragraph> <Paragraph position="7"> Although the propositional content expressed by these two passages is the same (the only difference being the expression used to refer to John in the subject of the third sentence), passage (2) is not jarring in the way that (1) is.</Paragraph> <Paragraph position="8"> GJW exemplify the second of these motivations with passage (3).</Paragraph> <Paragraph position="9"> (3) a. Terry really goofs sometimes.</Paragraph> <Paragraph position="10"> b. Yesterday was a beautiful day and he was excited about trying out his new sailboat.</Paragraph> <Paragraph position="11"> c. He wanted Tony to join him on a sailing expedition.</Paragraph> <Paragraph position="12"> d. He called him at 6AM.</Paragraph> <Paragraph position="13"> e. He was sick and furious at being woken up so early.</Paragraph> <Paragraph position="14"> Sentence (3e) causes the hearer to be misled: whereas common sense considerations indicate that the intended referent for He is Tony, hearers tend to initially assign Terry as its referent. Such examples suggest that more is involved in pronoun interpretation than simply reasoning about semantic plausibility. In particular, they suggest that hearers assign referents to pronouns before interpreting the remainder of the sentence. Details of Centering. 
In GJW's centering theory, each utterance Un in a discourse has exactly one backward-looking center (denoted Cb(Un)) and a partially ordered set of forward-looking centers (denoted Cf(Un)). Roughly speaking, Cf(Un) contains all entities referred to in Un; among these is Cb(Un). Following Brennan, Friedman, and Pollard (1987), we refer to the highest-ranked forward-looking center as Cp(Un). 2 [Footnote 2: The issues pertaining to how the ordering of entities in Cf(Un) is determined have not been completely resolved. For the examples discussed in this paper, we can use the hierarchy of grammatical relations given by Brennan, Friedman, and Pollard (1987), in which the grammatical subject is ranked above all other grammatical relations (object, object2, and so forth).] [Running head: Kehler, Centering for Pronoun Interpretation]</Paragraph> <Paragraph position="15"> Cb(Un+1) is by definition the most highly ranked element of Cf(Un) realized in Un+1. Three intersentential relationships between a pair of utterances Un and Un+1 are defined: Center Continuation, in which Cb(Un+1) = Cb(Un) and this entity is the most highly ranked element of Cf(Un+1); Center Retaining, in which Cb(Un+1) = Cb(Un) but this entity is not the most highly ranked element of Cf(Un+1); and Center Shifting, in which Cb(Un+1) is not equal to Cb(Un).</Paragraph> <Paragraph position="17"> The following rules are proposed in GJW: Rule 1 If any element of Cf(Un) is realized by a pronoun in utterance Un+1, then Cb(Un+1) must be realized as a pronoun also.</Paragraph> <Paragraph position="18"> Rule 2 Sequences of continuations are preferred over sequences of retaining; and sequences of retaining are to be preferred over sequences of shifting. The use of Rule 1 is illustrated by the oddness of passage (1) as compared to passage (2), because in (1c) the Cb (John) is not pronominalized whereas a non-Cb (Mike) is. The examples GJW give to illustrate Rule 2 are shown in passages (4) and (5).</Paragraph> <Paragraph position="19"> (4) a. John went to his favorite music store to buy a piano. b. He had frequented the store for many years.</Paragraph> <Paragraph position="20"> c. He was excited that he could finally buy a piano.</Paragraph> <Paragraph position="21"> d. He arrived just as the store was closing for the day. 
(5) a. John went to his favorite music store to buy a piano. b. It was a store John had frequented for many years.</Paragraph> <Paragraph position="22"> c. He was excited that he could finally buy a piano.</Paragraph> <Paragraph position="23"> d. It was closing just as John arrived.</Paragraph> <Paragraph position="24"> Like passages (1) and (2), passages (4) and (5) express the same propositional content, yet they are not equally coherent. Whereas passage (4) consists of a sequence of Continue relations centered on John, passage (5) is marked by movements between Continuing and Retaining, which gives the effect that the passage flips back and forth between being about John and being about his favorite music store. Rule 1 is presented as a constraint on center realization, and Rule 2 as a constraint on center movement. As formulated, the predictions these rules make about the preferred referents of pronouns are fairly limited. 3 For instance, Rule 1 makes no predictions about the preferred referents of the pronouns in sentence (3d), nor does it predict the garden path effect in sentence (3e); in each case the rule is satisfied assuming either possible assignment of referents to the pronouns. 4</Paragraph> <Paragraph position="25"> [Footnote 3: GJW do not make any specific proposals for using Rules 1 and 2 for pronoun interpretation. In Section 3, we discuss a particular utilization of these rules for pronoun interpretation proposed by Brennan, Friedman, and Pollard (1987). An apparently popular misconception attributes this utilization to GJW; however, neither the draft nor the final version of GJW puts forth such a proposal. See also GJW (1995, p. 215, footnote 16).]</Paragraph> <Paragraph position="27"> Table 1: Transitions in the BFP algorithm. Continue: Cb(Un+1) = Cb(Un) and Cb(Un+1) = Cp(Un+1). Retain: Cb(Un+1) = Cb(Un) and Cb(Un+1) is not Cp(Un+1). Smooth-Shift: Cb(Un+1) is not Cb(Un) and Cb(Un+1) = Cp(Un+1). Rough-Shift: Cb(Un+1) is not Cb(Un) and Cb(Un+1) is not Cp(Un+1).</Paragraph> </Section> <Section position="4" start_page="469" end_page="473" type="metho"> <SectionTitle> 3. 
The BFP Algorithm </SectionTitle> <Paragraph position="0"> Brennan, Friedman, and Pollard (1987, henceforth BFP) describe an algorithm for pronoun interpretation based on centering principles, which is also utilized in Walker, Iida, and Cote (1994, henceforth WIC). In addition to Rule 1, BFP utilize Rule 2 in making predictions for pronominal reference. They augment the transition hierarchy by replacing the Shift transition with two transitions, termed Smooth-Shift and Rough-Shift, which are differentiated on the basis of whether or not Cb(Un+1) is also Cp(Un+1). 5</Paragraph> <Paragraph position="2"> They redefine Rule 2 as follows: Rule 2 Transition states are ordered. CONTINUE is preferred to RETAIN is preferred to SMOOTH-SHIFT is preferred to ROUGH-SHIFT.</Paragraph> <Paragraph position="3"> The resulting transition definitions are summarized in Table 1. Given these definitions, their algorithm (as described in WIC) is defined as follows: The pronominal referents that get assigned are those which yield the most preferred relation in Rule 2, assuming Rule 1 and other coreference constraints (gender, number, syntactic, semantic type of predicate arguments) are not violated.</Paragraph> <Paragraph position="4"> [Footnote 4: A case in which Rule 1 does make a prediction is given in example (i); assigning Sam as the referent of he causes a violation whereas assigning John does not. (i) a. John introduced Bill to Sam. b. He seemed to like Bill. I thank an anonymous reviewer for bringing this example to my attention.]</Paragraph> <Paragraph position="5"> [Footnote 5: The terms Smooth-Shift and Rough-Shift were introduced in WIC.]</Paragraph> <Paragraph position="6"> This strategy correctly predicts that He and him in sentence (3d) refer to Terry and Tony respectively, since this assignment results in a Continue relation whereas the Tony/Terry assignment results in a less-preferred Retain relation. 
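The selection strategy just described can be illustrated with a small Python sketch. It is only a rough rendering under stated assumptions, not BFP's or WIC's implementation: the function names (backward_center, transition, rank_assignments) are hypothetical, Cf(U3c) is reduced to the two human referents, pronouns are taken in grammatical-role order, and Rule 1 and the other coreference constraints are assumed to hold rather than checked.

```python
from itertools import permutations

# Rule 2 ordering as restated by BFP: earlier in the list is preferred.
PREFERENCE = ["Continue", "Retain", "Smooth-Shift", "Rough-Shift"]


def backward_center(prev_cf, current_cf):
    """Return the highest-ranked element of Cf(Un) that is realized in Un+1."""
    for entity in prev_cf:
        if entity in current_cf:
            return entity
    return None


def transition(prev_cb, cb, cp):
    """Classify a transition from Cb(Un), Cb(Un+1), and Cp(Un+1)."""
    same_cb = cb == prev_cb
    cb_is_cp = cb == cp
    if same_cb:
        return "Continue" if cb_is_cp else "Retain"
    return "Smooth-Shift" if cb_is_cp else "Rough-Shift"


def rank_assignments(prev_cf, prev_cb, candidates, n_pronouns):
    """Score every assignment of distinct candidates to the pronouns.

    Assumption: pronouns are listed subject first, so an assignment also
    serves as the ranked Cf of the new utterance.
    """
    scored = []
    for assignment in permutations(candidates, n_pronouns):
        current_cf = list(assignment)
        cb = backward_center(prev_cf, current_cf)
        label = transition(prev_cb, cb, current_cf[0])
        scored.append((PREFERENCE.index(label), assignment, label))
    return sorted(scored)


# Sentence (3d), "He called him at 6AM", following (3c).
# Simplified state after (3c): Cf = [Terry, Tony], Cb = Terry.
for _, assignment, label in rank_assignments(
    ["Terry", "Tony"], "Terry", ["Terry", "Tony"], 2
):
    print(assignment, label)
```

Under these assumptions the Terry/Tony assignment is classified as a Continue and the Tony/Terry assignment as a Retain, so the first is ranked higher, in line with the prediction stated above.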
Their rules also account for the oddness of sentence (3e), since assigning he to Tony results in a Smooth-Shift whereas assigning he to Terry results in a Continue. Therefore, the algorithm makes the correct predictions regarding example (3), one of the central motivating examples of centering theory.</Paragraph> <Paragraph position="7"> Problems with the BFP Algorithm. The fact that the BFP algorithm predicts the garden path effect exhibited by sentence (3e) is particularly indicative that it embodies the motivations for centering theory. As we noted in Section 2, such effects distinguish centering-based approaches from purely content-based models of reference and coherence (Hobbs 1979, inter alia). As Brennan (1995) explains: "While knowledge-based theories often succeed in resolving referring expressions in this manner [=using semantic information and world knowledge, without taking advantage of the kinds of syntactic constraints that centering uses], they do not model human discourse processing. An entirely knowledge-based algorithm would not reproduce an addressee's immediate tendency to interpret a pronoun as cospecifying the backward center, even when this results in an implausible interpretation." (Brennan 1995, p. 145) However, other examples demonstrate that the BFP algorithm also cannot model an addressee's immediate tendency to interpret a pronoun, and therefore cannot properly account for the pronoun interpretation preferences that result from such tendencies. To illustrate, we consider a modification to passage (3), shown in passage (6), with three possible follow-ons (6e1-e3).</Paragraph> <Paragraph position="8"> (6) a. Terry really gets angry sometimes.</Paragraph> <Paragraph position="9"> b. Yesterday was a beautiful day and he was excited about trying out his new sailboat.</Paragraph> <Paragraph position="10"> c. He wanted Tony to join him on a sailing expedition, and left him a message on his answering machine. [Cb=Cp=Terry] d. 
Tony called him at 6AM the next morning. [Cb=Terry, Cp=Tony] e1. He was furious for being woken up so early.</Paragraph> <Paragraph position="11"> e2. He was furious with him for being woken up so early.</Paragraph> <Paragraph position="12"> e3. He was furious with Tony for being woken up so early.</Paragraph> <Paragraph position="13"> Sentence (6d) constitutes a Retain, in which Cp(U6d) is Tony and Cb(U6d) is Terry. Retains often result in an ambiguity based on whether a subsequent subject pronoun refers to Cb(Un) (resulting in a Continue) or to Cp(Un) (resulting in a Smooth-Shift). While the subject pronouns in follow-ons (6e1-e3) may all display this ambiguity to a certain degree, the preferences associated with them appear to be consistent among the three variants. 6 That is, the initial preference for the subject pronominal He in sentence (6e1) does not appear to change with the addition of with him in variant (6e2) and with Tony in variant (6e3). This accords with the observation that hearers have an immediate tendency to resolve subject pronouns based on the existing discourse state, before the entire sentence is interpreted. [Footnote 6: The author and several informants prefer the subject pronoun to refer to Tony initially, causing a garden path effect in each case. Aside from this, there may be a subtle processing difference between these sentences.]</Paragraph> <Paragraph position="14"> Within the BFP algorithm, however, the ways in which these follow-ons are analyzed differ radically, as summarized in Table 2. In follow-on (6e1), assigning He=Terry results in a Continue whereas assigning He=Tony results in a Smooth-Shift, and so Terry is preferred. In follow-on (6e2), assigning He=Terry results in a Rough-Shift whereas assigning He=Tony again results in a Smooth-Shift, and so Tony is preferred. 
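The divergent analyses of (6e1) and (6e2) can be reproduced with a short Python sketch. It is illustrative only: the helper names are hypothetical, Cf(U6d) is reduced to the two human referents ranked by grammatical role (subject first), and Rule 1 and the agreement constraints are not modeled.

```python
def backward_center(prev_cf, current_cf):
    """Return the highest-ranked element of Cf(Un) that is realized in Un+1."""
    for entity in prev_cf:
        if entity in current_cf:
            return entity
    return None


def transition(prev_cb, cb, cp):
    """Classify a transition from Cb(Un), Cb(Un+1), and Cp(Un+1)."""
    if cb == prev_cb:
        return "Continue" if cb == cp else "Retain"
    return "Smooth-Shift" if cb == cp else "Rough-Shift"


# Assumed state after (6d), "Tony called him at 6AM the next morning."
CF_6D = ["Tony", "Terry"]  # Cp(U6d) = Tony
CB_6D = "Terry"


def analyze(current_cf):
    """Return (Cb, transition) for a follow-on whose ranked Cf is given."""
    cb = backward_center(CF_6D, current_cf)
    return cb, transition(CB_6D, cb, current_cf[0])


# Each reading is the ranked Cf of the follow-on under one assignment.
readings = {
    "(6e1) He=Terry": ["Terry"],
    "(6e1) He=Tony": ["Tony"],
    "(6e2) He=Terry, him=Tony": ["Terry", "Tony"],
    "(6e2) He=Tony, him=Terry": ["Tony", "Terry"],
}

for name, cf in readings.items():
    cb, label = analyze(cf)
    print(f"{name}: Cb={cb}, {label}")
```

In this sketch the only change between the He=Terry readings of (6e1) and (6e2) is that Tony is added to the Cf list, yet the computed Cb moves from Terry to Tony and the transition drops from Continue to Rough-Shift.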
This difference is attributable solely to the fact that the pronoun him occurs in (6e2): because there are two non-coreferring pronouns in (6e2), one must refer to Tony, and because Tony is Cp(U6d), by definition Tony is Cb(U6e2) instead of Terry. Finally, in sentence (6e3), the assignment of He=Terry results in a Rule 1 violation (the Cb Tony is not pronominalized whereas Terry is), putting it in the company of highly awkward examples such as passage (1). If we ignore this violation, the resulting transition is again a Rough-Shift, the lowest-ranked relation. (The assignment of He=Tony is ruled out by a syntactic constraint violation.) These varied results are inconsistent with the aforementioned facts concerning these passages in both empirical and theoretical respects. Empirically, the results are counter to the more consistent preferences associated with the subject pronouns in each case. Theoretically, such consistency is just what one would expect given a hearer's immediate tendency to resolve subject pronouns based on the existing discourse state. In either regard, it is unclear why the inclusion of the phrases with him in variant (6e2) and with Tony in variant (6e3) should lead to such varied predictions for the subject pronoun. In fact, the example illustrates a general property of the BFP algorithm: that the preferred assignment for a pronoun in such examples, even in subject position, cannot be determined until the entire sentence has been processed. This property results from the fact that determining the transition type between a pair of utterances Un and Un+1 requires the identification of Cb(Un+1), and a noun phrase (pronominal or not) can occur at any point in the utterance that will alter the assignment of Cb(Un+1). 
This is what occurs in the analysis of passage (6): whereas the Cb of sentence (6e1) is Terry assuming He refers to Terry, the occurrence of him later in the sentence in (6e2) and similarly Tony in (6e3) causes the Cb to be Tony, thus changing the bindings that constitute the various transition possibilities, and in this case, the predicted preferred referents. To be clear, this is not an issue regarding either the efficiency or the cognitive reality of BFP's particular algorithm; in fact neither BFP nor WIC makes any claims to these effects. The problem lies more generally in their proposal to utilize Rule 2 along with the definition of Cb(Un+1) to interpret pronouns: any algorithm incorporating this proposal will have to process an entire sentence before determining the preferred referents of pronouns; no reordering of processing within the BFP algorithm can alter this fact. The need to process an entire sentence to recover pronoun assignments, however, is one that GJW and Brennan (1995) argue against in motivating centering over purely content-based models of reference and coherence. That is, this very property renders such an approach incapable of modeling the preferences associated with an addressee's immediate tendency to interpret pronouns, as example (6) demonstrates. 7 [Footnote 6, continued: The possible processing difference between these sentences is that any garden path in sentence (6e3) may be resolved earlier than in (6e1) and (6e2), specifically, at the point at which Tony is reached. This is a result of the fact that syntactic constraints on coreference can be used to eliminate the possibility of He referring to Tony at that time, whereas in the other cases it is semantic information that comes later in the sentence that eliminates Tony as a referent.] Preferences and Other Intersentential Relationships. The motivations for centering cited by GJW and Brennan (1995) reflect the intuition that salience plays a central role in pronoun interpretation. 
What remains at issue is the manner in which salience is utilized by the pronoun interpreter. In the previous section we argued that BFP's use of Rule 2 along with the transition definitions and definition of Cb does not provide the correct utilization. In fact, the only aspects of Un and Un+1 utilized by the BFP algorithm are the identities of Cb(Un), Cp(Un), Cb(Un+1), and Cp(Un+1), as well as the types of expressions used to refer to them. Here, we argue that this is also insufficient. There is a well-known contrast between passages that are coherent by virtue of being a narration, as is the case for sentence (7c) and follow-on (7d), versus those coherent by virtue of parallelism, as is the case for sentence (7c) and follow-on (7d'). (7) a. The three candidates had a debate today.</Paragraph> <Paragraph position="15"> b. Bob Dole began by bashing Bill Clinton.</Paragraph> <Paragraph position="16"> c. He criticized him on his opposition to tobacco.</Paragraph> <Paragraph position="17"> d. Then Ross Perot reminded him that most Americans are also anti-tobacco.</Paragraph> <Paragraph position="18"> d'. Then Ross Perot slammed him on his tax policies.</Paragraph> <Paragraph position="19"> The preferred referent for the pronoun in example (7d) is Bob Dole, whereas the preferred referent for the pronoun in example (7d') is Bill Clinton. However, each passage shares sentences (7a-c), and therefore Cp(U7c) and Cb(U7c) are the same for each follow-on. Furthermore, each follow-on contains a new subject (Ross Perot, who will be the new Cp) and an object pronoun (the referent of which will be the new Cb). Therefore, because the relevant Cb and Cp relations are the same, a BFP-style approach cannot distinguish between these cases. 8 These examples show that pronominal reference preferences are affected by additional types of intersentential relationships that may be identifiable at the time a pronoun is encountered; proposals along these lines include preference-ranking schemes (e.g., Kameyama [1996]) and systems in which salience and the process of determining coherence relations interact (e.g., Kehler [1995]). [Footnote 7: In order to model this tendency in the BFP algorithm, one might consider a strategy in which provisional referents are assigned to pronouns while proceeding left-to-right in the current utterance. Under such a strategy one could assume that Cb(Un+1) is computed incrementally using the assumption that no additional elements will appear in Un+1 that are more highly ranked in Cf(Un). Then, garden paths would be predicted when this assumption does not hold and the assignment of Cb(Un+1) must be changed, in addition to those caused by semantic influences such as in sentence (3e). Again, however, this strategy would treat follow-ons (6e1) and (6e2) quite differently. This strategy would predict no garden path effect for follow-on (6e1), since it assigns Terry as the referent of he and sticks with it. On the other hand, (6e2) should be much worse because two garden paths would be predicted: one for changing Cb(Un+1) from Terry to Tony when the pronoun him is processed, and another for the semantic information subsequently preferring Terry. This difference does not appear to be reflected in the actual judgements for these two examples (in both cases we find a similar garden path effect), although experimental evidence would be required to confirm these judgements.]</Paragraph> </Section> <Section position="5" start_page="473" end_page="473" type="metho"> <SectionTitle> 4. Conclusions </SectionTitle> <Paragraph position="0"> The pronoun resolution preferences that result from an addressee's immediate tendency to interpret a pronoun motivate pursuing a centering-based approach. 
However, certain examples demonstrate that BFP's utilization of the centering rules does not model this tendency, which in turn limits the ability of their algorithm to account for the data. Furthermore, we have presented data showing that, beyond the salience factors utilized by BFP, additional types of intersentential relationships must be taken into account.</Paragraph> </Section> <Section position="6" start_page="473" end_page="473" type="metho"> <SectionTitle> Acknowledgments </SectionTitle> <Paragraph position="0"> The author thanks Barbara Grosz, David Israel, Megumi Kameyama, Christine Nakatani, Gregory Ward, and four anonymous reviewers for helpful comments and discussions. This research was</Paragraph> </Section> </Paper>