
<?xml version="1.0" standalone="yes"?>
<Paper uid="J99-3001">
  <Title>Functional Centering: Grounding Referential Coherence in Information Structure</Title>
  <Section position="5" start_page="313" end_page="314" type="metho">
    <SectionTitle>
3. The Centering Model
</SectionTitle>
    <Paragraph position="0"> The centering model (Grosz, Joshi, and Weinstein 1983, 1995) is intended to describe the relationship between local coherence and the use of referring expressions. The model requires two constructs, a single backward-looking center and a list of forward-looking centers, as well as a few rules and constraints that govern the interpretation of centers. It is assumed that discourses are composed of constituent segments (Grosz and Sidner 1986), each of which consists of a sequence of utterances. Each utterance Ui in a given discourse segment DS is assigned a list of forward-looking centers, Cf(DS, Ui), and a unique backward-looking center, Cb(DS, Ui). The forward-looking centers of Ui depend only on the discourse entities that constitute the ith utterance; previous utterances provide no constraints on Cf(DS, Ui). A ranking imposed on the elements of the Cf reflects the assumption that the most highly ranked element of Cf(DS, Ui), the preferred center Cp(DS, Ui), will most likely be the Cb(DS, Ui+l). The most highly ranked element of Cf(DS, Ui) that is finally realized in Ui+l (i.e., is associated with an expression that has a valid interpretation in the underlying semantic representation) is the actual Cb(DS, Ui+I). Since in this paper we will not discuss the topics of global coherence and discourse macro segmentation (for recent treatments of these issues, see Hahn and Strube \[1997\] and Walker \[1998\]), we assume a priori that any centering data structure is assigned an utterance in a given discourse segment and simplify the notation of centers to Cb(Ui) and Cf(Ui).</Paragraph>
    <Paragraph position="1"> Grosz, Joshi, and Weinstein (1995) state that the items in the Cf list have to be ranked according to a number of factors including grammatical role, text position, and lexical semantics. As far as their discussion of concrete English discourse phenomena is concerned, they nevertheless restrict their ranking criteria to those solely based on grammatical roles, which we repeat in Table 1.</Paragraph>
    <Paragraph position="2"> The centering model, in addition, defines transition relations across pairs of adjacent utterances (Table 2). These transitions differ from each other according to whether backward-looking centers of successive utterances are identical or not, and, if they are identical, whether they match the most highly ranked element of the current forward-looking center list, the Cp(Ui), or not.</Paragraph>
    <Paragraph position="3"> Grosz, Joshi, and Weinstein (1995) also define two rules on center movement and realization:  Sequences of continuation are to be preferred over sequences of retaining; and sequences of retaining are to be preferred over sequences of shifting.</Paragraph>
    <Paragraph position="4"> Rule 1 states that no element in an utterance can be realized by a pronoun unless the backward-looking center is realized by a pronoun, too. This rule is intended to capture one function of the use of pronominal anaphors--a pronoun in the Cb signals to the hearer that the speaker is continuing to refer to the same discourse. Rule 2 should reflect the intuition that a pair of utterances that have the same theme is more coherent than another pair of utterances with more than one theme. The theory claims, above all, that to the extent that a discourse adheres to these rules and constraints, its local coherence will increase and the inference load placed upon the hearer will decrease.</Paragraph>
    <Paragraph position="5"> The basic unit for which the centering data structures are generated is the utterance U. Since Grosz, Joshi, and Weinstein (1995) and Brennan, Friedman, and Pollard (1987) do not give a reasonable definition of utterance, we follow Kameyama's (1998) method for dividing a sentence into several center-updating units (Figure 1). Her intrasentential centering mechanisms operate at the clause level. While tensed clauses are defined as utterances on their own, untensed clauses are processed with the main clause so that the Cf list of the main clause contains the elements of the untensed embedded clause. Kameyama further distinguishes, for tensed clauses, between sequential and hierarchical centering. Except for direct and reported speech (embedded and inaccessible to the superordinate level), nonreport complements, and relative clauses (both embedded but accessible to the superordinate level; less salient than the higher levels), all other types of tensed clauses build a chain of utterances at the same level.</Paragraph>
    <Section position="1" start_page="314" end_page="314" type="sub_section">
      <SectionTitle>
3.1 A Centering Algorithm for Anaphora Resolution
</SectionTitle>
      <Paragraph position="0"> Though the centering model was not originally intended to be used as a blueprint for anaphora resolution, 5 several applications tackling this problem have made use of</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="314" end_page="317" type="metho">
    <SectionTitle>
5 Aravind Joshi, personal communication.
</SectionTitle>
    <Paragraph position="0"> 1. If a pronoun in Ui is encountered, test the elements of the Cf(Ui-1) in the given order until an element under scrutiny satisfies all the required morphosyntactic, binding, and sortal criteria. This element is chosen as the antecedent of the pronoun.</Paragraph>
    <Paragraph position="1"> 2. When utterance Ui is completely read, compute Cb(Ui) and generate Cf(Ui); rank the  elements according to agreed-upon preference criteria (such as the ones from Table 1). the model, nevertheless. One interpretation is due to Brennan, Friedman, and Pollard (1987) who utilize Rule 2 for computing preferences for antecedents of pronouns (see Section 3.2). In this section, we will specify a simple algorithm that uses the Cf list directly for providing preferences for the antecedents of pronouns.</Paragraph>
    <Paragraph position="2"> The algorithm (which we will refer to as the basic algorithm; Table 3) consists of two steps, which are triggered independently.</Paragraph>
    <Paragraph position="3"> We may illustrate this algorithm by referring to the text fragment in example (2): 6 Example 2 a. The sentry was not dead.</Paragraph>
    <Paragraph position="4"> b. He was, in fact, showing signs of reviving ...</Paragraph>
    <Paragraph position="5"> c. He was partially uniformed in a cavalry tunic.</Paragraph>
    <Paragraph position="6"> d. Mike stripped this from him and donned it.</Paragraph>
    <Paragraph position="7"> e. He tied and gagged the man ....</Paragraph>
    <Paragraph position="8"> Table 4 gives the centering analysis for this text fragment using the algorithm from Table 3. 7 Since (2a) is the first sentence in this fragment, it has no Cb. In (2b) ~ and in (2c) the discourse entity SENTRY is referred to by the personal pronoun he. Since we assume a Cf ranking by grammatical roles in this example, SENTRY is ranked highest in these sentences (the pronoun always appears in subject position). In (2d), the discourse entity MIKE is introduced by a proper name in subject position. The pronoun him is resolved to the most highly ranked element of Cf(2c), namely SENTRY. Since Mike occupies the subject position, it is ranked higher in the Cf(2d) than SENTRY. Therefore the pronoun he in (2e) can be resolved correctly to MIKE.</Paragraph>
    <Paragraph position="9"> This example not only illustrates anaphora resolution using the basic algorithm from Table 3 but also incorporates the application of Rule 1 of the centering model. (2d) contains the pronoun him, which is the Cb of this utterance. In (2e), the Cb is also realized as a pronoun while SENTRY is realized by the definite noun phrase the man, which is allowed by Rule 1.</Paragraph>
    <Section position="1" start_page="315" end_page="317" type="sub_section">
      <SectionTitle>
3.2 The BFP Algorithm
</SectionTitle>
      <Paragraph position="0"> The centering algorithm described by Brennan, Friedman, and Pollard (1987, henceforth BFP algorithm) interprets the centering model in a certain way and applies it to the resolution of pronouns. The most obvious difference between Grosz, Joshi, and 6 With slight simplifications taken from the Brown Corpus cn03. 7 In the subsequent tables illustrating centering data, discourse entities, a notion at the representational level, are denoted by SMALLCAPS and appear on the left side of the colon, while the corresponding surface expressions, at the level of linguistic data, appear on the right side of the colon.</Paragraph>
      <Paragraph position="1">  the basic centering algorithm.</Paragraph>
      <Paragraph position="2"> (2a) The sentry was not dead.</Paragraph>
      <Paragraph position="3"> Cb: -Cf: \[SENTRY: sentry\] (2b) He was, in fact, showing signs of reviving ...</Paragraph>
      <Paragraph position="4"> Cb: SENTRY: he Cf: \[SENTRY: he, SIGNS: signs\] (2c) He was partially uniformed in a cavalry tunic.</Paragraph>
      <Paragraph position="5"> Cb: SENTRY: he Cf: \[SENTRY: he, TUNIC: tunic\] (2d) Mike stripped this from him and donned it.</Paragraph>
      <Paragraph position="6"> Cb: SENTRY: him Cf: \[MIKE: Mike, TUNIC: this, it, SENTRY: him\] (2e) He tied and gagged the man ....</Paragraph>
      <Paragraph position="7"> Cb: MIKE: he  Cf: \[MIKE: he, SENTRY: the man\] Table 5 Transition types according to BFP.</Paragraph>
      <Paragraph position="9"> Weinstein (1983, 1995) and Brennan, Friedman, and Pollard (1987) is that the latter use two SHIFT transitions instead of only one: SMOOTH-SHIFT 8 requires the Cb(Ui) to equal Cp(Ui), while ROUGH-SHIFT requires inequality (Table 5). Brennan, Friedman, and Pollard (1987) also allow the Cb(Ui_I) to remain undefined.</Paragraph>
      <Paragraph position="10"> Brennan, Friedman, and Pollard (1987) extend the ordering constraints in Cf in the following way: &amp;quot;We rank the items in Cf by obliqueness of grammatical relations of the subcategorized functions of the main verb: that is, first the subject, object, and object2, followed by other subcategorized functions, and finally, adjuncts.&amp;quot; (p. 156). In order to apply the centering model to pronoun resolution, they use Rule 2 in making predictions for pronominal reference and redefine the rules as follows (quoting Walker, Iida, and Cote \[1994\]): Rule 1' If some element of Cf(Ui-1) is realized as a pronoun in Ui, then so is Cb(Ui).</Paragraph>
      <Paragraph position="11"> 8 Brennan, Friedman, and Pollard (1987) call these transitions SHIFTING and SHIFTING-1. The more figurative names were introduced by Walker, Iida, and Cote (1994).</Paragraph>
      <Paragraph position="12">  Computational Linguistics Volume 25, Number 3 Table 6 BFP-algorithm.</Paragraph>
      <Paragraph position="13">  1. Generate possible Cb-Cf combinations. In this step, all (plausible and implausible) assignments of pronouns to elements of the previous Cf are computed. 2. Filter by constraints, e.g., contra-indexing, sortal predicates, centering rules and constraints. This way, possible antecedents are filtered out because of morphosyntactic, binding, and semantic criteria. Also the realization of noun phrases in the current utterance (e.g., realization as a pronoun vs. realization as a definite noun phrase or proper name) comes into play.</Paragraph>
      <Paragraph position="14"> 3. Rank by transition orderings. This is the step, where the pragmatic constraints of centering apply. Basically, CONTINUE transitions are preferred, i.e., the antecedent of a pronoun is more likely to turn up as the Cb of the previous utterance than any other element of the Cf. In certain configurations, the algorithm includes a preference for  parallelism in linguistic constructions.</Paragraph>
      <Paragraph position="15"> Rule 2 ~ Transition states are ordered. CONTINUE is preferred to RETAIN is preferred to SMOOTH-SHIFT is preferred to ROUGH-SHIFT.</Paragraph>
      <Paragraph position="16"> Their algorithm (Table 6) consists of three basic steps (as described by Walker, Iida, and Cote \[1994\]). 9 In order to illustrate this algorithm, we use example (2) from above and supply the corresponding Cb/Cf data in Table 7. Let us focus on the interpretation of utterance (2e) where the centering data diverges when one compares the basic and the BFP algorithms. After step 2 (filtering), the algorithm has produced two readings, which are rated by the corresponding transitions in step 3. Since SMOOTH-SHIFT is preferred over ROUGH-SHIFT, the pronoun he is resolved to MIKE, the highest-ranked element of Cf(2d). Also, Rule 1 would be violated in the rejected reading.</Paragraph>
    </Section>
  </Section>
  <Section position="7" start_page="317" end_page="324" type="metho">
    <SectionTitle>
4. Principles of Functional Centering
</SectionTitle>
    <Paragraph position="0"> The crucial point underlying functional centering is to relate the ranking of the forward-looking centers and the information structure of the corresponding utterances. Hence, a proper correspondence relation between the basic centering data structures and the relevant functional notions has to be established and formally rephrased in terms of the centering model. In this section, we first discuss two studies in which the information structure of utterances is already integrated into the centering model (Rambow 1993; Hoffman 1996, 1998). Using these proposals as a point of departure, we shall develop our own proposal--functional centering (Strube and Hahn 1996).</Paragraph>
    <Section position="1" start_page="317" end_page="319" type="sub_section">
      <SectionTitle>
4.1 Integrating Information Structure and Centering
</SectionTitle>
      <Paragraph position="0"> As far as the centering model is concerned, the first account involving information structure criteria was given by Kameyama (1986) and further refined by Walker, Iida, and Cote (1994) in their study on the use of zero pronouns and topic mark- null 9 Walker, Iida, and Cote (1994) note that it is possible to improve the computational efficiency of the algorithm by interleaving generating, filtering, and ranking steps; cf. the version of the algorithm described by Walker (1998).</Paragraph>
      <Paragraph position="1">  Centering analysis for the text fragment in example (2) according to the BFP algorithm.</Paragraph>
      <Paragraph position="2"> (2a) The sentry was not dead.</Paragraph>
      <Paragraph position="3"> Cb: -Cf: \[SENTRY: sentry\] (2b) He was, in fact, showing signs of reviving ...</Paragraph>
      <Paragraph position="4"> Cb: SENTRY: he CONTINUE Cf: \[SENTRY: he, SIGNS: signs\] (2c) He was partially uniformed in a cavalry tunic.</Paragraph>
      <Paragraph position="5"> Cb: SENTRY: he CONTINUE Cf: \[SENTRY: he, TUNIC: tunic\] (2d) Mike stripped this from him and donned it.</Paragraph>
      <Paragraph position="6"> Cb: SENTRY: him RETAIN Cf: \[MIKE: Mike, TUNIC: this, it, SENTRY: him\] (2e) He tied and gagged the man ....</Paragraph>
      <Paragraph position="7"> Cb: MIKE: he SMOOTH-SHIFT Cf: \[MIKE: he, SENTRY: the man\] Cb: ...... +-vJ.Jr,.r~. th6 iii~iz KGUGH-SHIFT LOm,~N J-l~t,l. ItC/~ l &amp;VllhfJ. ~ll.G lltl,C/ll,\] ers in Japanese. This led them to augment the grammatical ranking conditions for the  forward-looking centers by additional functional notions. A deeper consideration of information structure principles and their relation to the centering model has been proposed in two studies concerned with the analysis of German and Turkish discourse. Rambow (1993) was the first to apply the centering methodology to German, aiming at the description of information structure aspects underlying scrambling and topicalization. As a side effect, he used centering to define the utterance's theme and rheme in the sense of the functional sentence perspective (FSP) (Firbas 1974). Viewed from this perspective, the theme/rheme-hierarchy of utterance Ui is determined by the Cf(Ui_l). Elements of Ui that are contained in Cf(Ui-1) are less rhematic than those not contained in Cf(Ui-1). He then concludes that the Cb(Ui) must be the theme of the current utterance. Rambow does not exploit the information structure of utterances to determine the Cf ranking but formulates it on the basis of linear textual precedence among the relevant discourse entities. In order to analyze Turkish texts, Hoffman (1996, 1998) distinguishes between the information structure of utterances and centering, since both constructs are assigned different functions for text understanding. A hearer exploits the information structure of an utterance to update his discourse model, and he applies the centering constraints in order to connect the current utterance to the previous discourse. Hoffman describes the information structure of an utterance in terms of topic (theme) and comment (rheme). The comment is split again into focus and (back)ground (see also Vallduvi \[1990\] and Vallduvf and Engdahl \[1996\]). Based on previous work about Turkish, Hoffman argues that, in this language, the sentence-initial position corresponds to the topic, the position that immediately precedes the verb yields the focus, and the remainder of the sentence is to be considered the (back)ground. Furthermore, Hoffman relates this notion of information structure of utterances to centering, claiming that the topic corresponds to the Cb in most cases--with the exception of segment-initial utterances, which do not have a Cb. Hoffman does not say anything about the relation between information structure and the ranking of the  Computational Linguistics Volume 25, Number 3 Cf list. In her approach, this ranking is achieved by thematic roles (see also Turan \[1998\]).</Paragraph>
      <Paragraph position="8"> Both Rambow (1983) as well as Hoffman (1996, 1998) argue for a correlation between the information structure of utterances and centering. Both of them find a correspondence between the Cb and the theme or the topic of an utterance. They refrain, however, from establishing a strong link between the information structure and centering as we suggest in our model, one that mirrors the influence of information structure in the way the forward-looking centers are actually ranked.</Paragraph>
    </Section>
    <Section position="2" start_page="319" end_page="319" type="sub_section">
      <SectionTitle>
4.2 Functional Centering
</SectionTitle>
      <Paragraph position="0"> Grosz, Joshi, and Weinstein (1995) admit that several factors may have an influence on the ranking of the Cf but limit their exposition to the exploitation of grammatical roles only. We diverge from this proposal and claim that, at least for languages with relatively free word order (such as German), the functional information structure of the utterance is crucial for the ranking of discourse entities in the Cf list. Originally, in Strube and Hahn (1996), we defined the Cf ranking criteria in terms of contextboundedness. In this paper, we redefine the functional Cf ranking criteria by making reference to Prince's work on the assumed familiarity of discourse entities (Prince 1981) and information status (Prince 1992). The term context-bound in Strube and Hahn (1996) corresponds to the term evoked used by Prince. Ideg We briefly list the major claims of our approach to centering. In the following sections, we elaborate on these claims, in particular the ranking of the forward-looking centers.</Paragraph>
      <Paragraph position="1"> * The elements of the Cf list are ordered according to their information status. Hearer-old discourse entities are ranked higher than hearer-new discourse entities. The order of the elements of the Cf list for Ui provides the preference for the interpretation of anaphoric expressions in Ui+l.</Paragraph>
      <Paragraph position="2"> * The first element of the Cf(Ui), the preferred center, Cp(Ui), is the discourse entity the utterance Ui is &amp;quot;about.&amp;quot; In other words, the Cp is the center of attention.</Paragraph>
      <Paragraph position="3"> In contrast to the BFP algorithm, the model of functional centering requires neither a backward-looking center, nor transitions, nor transition ranking criteria for anaphora resolution. For text interpretation, at least, functional centering also makes no commitments to further constraints and rules.</Paragraph>
    </Section>
    <Section position="3" start_page="319" end_page="323" type="sub_section">
      <SectionTitle>
4.3 Cf Ranking Criteria in Functional Centering
</SectionTitle>
      <Paragraph position="0"> In this section, we introduce the functional Cf ranking criteria. We first describe a basic version, which is valid for a wide range of text genres in which pronominal reference is the predominant text phenomenon. This is the type of discourse to which centering was mainly applied in previous approaches (see, for example, Walker's \[1989\] or Di Eugenio's \[1998\] test sets). We then describe the extended version of the functional Cf ranking constraints. The two versions differ with respect to the incorporation of (a subset of) inferables in the second version and, hence, with respect to the requirements 10 In Strube and Hahn (1996), we assumed that the information status of a discourse entity has the main impact on its salience. In particular, evoked discourse entities were ranked higher in the Cf list than brand-new discourse entities (using Prince's terminology). We also restricted the category of the most salient discourse entities to evoked (i.e., context-bound) discourse entities. In this article, we extend this category to hearer-old discourse entities, which includes, besides evoked discourse entities, unused ones (again, referring to Prince's terminology).</Paragraph>
      <Paragraph position="1">  Information status and familiarity (basic version).</Paragraph>
      <Paragraph position="2"> relating to the availability of world knowledge, which is needed to properly account for inferables. The extended version assumes a detailed treatment of a particular sub-set of inferables, so-called functional anaphora (in Hahn, Markert, and Strube \[1996\], functional anaphora are referred to as textual ellipses). We claim that the extended version of ranking constraints is necessary to analyze texts from certain genres, e.g., texts from technical or medical domains. In these areas, pronouns are used rather infrequently, while functional anaphors are the major text phenomena to achieve local coherence.</Paragraph>
      <Paragraph position="3">  on a single set of elements, e.g., grammatical relations (as in Table 1). We use a layered representation for our criteria. For the basic Cf ranking criteria, we distinguish between two different sets of expressions, hearer-old discourse entities in Ui (OLD) and hearer-new discourse entities in Ui (NEW). These sets can be further split into the elements of Prince's (1981, 245) familiarity scale. The set of hearer-old discourse entities (OLD) consists of evoked (E) and unused (U) discourse entities, while the set of hearer-new discourse entities (NEW) consists of brand-new (BN) discourse entities. For the basic Cf ranking criteria, it is sufficient to assign inferable (I), containing inferable (IC), and anchored brand-new (BN A) discourse entities to the set of hearer-new discourse entities (NEW). n See Figure 2 for an illustration of Prince's familiarity scale and its relation to the two sets. Note that the elements of each set are indistinguishable with respect to their information status. Evoked and unused discourse entities, for example, have the same information status because they belong to the set of hearer-old discourse entities. So the basic Cf ranking in Figure 2 boils down to the preference of OLD discourse entities over NEW ones.</Paragraph>
      <Paragraph position="4"> For an operationalization of Prince's terms, we state that evoked discourse entities are simply cospecifying (resolved anaphoric) expressions, i.e., pronominal and nominal anaphora, relative pronouns, previously mentioned proper names, etc. Unused discourse entities are proper names and titles. In texts, brand-new proper names are usually accompanied by a relative clause or an appositive that relates them to the hearer's knowledge. The corresponding discourse entity is evoked only after this elaboration. Whenever these linguistic devices are missing, we treat proper names as unused. 12 In the following, we give some examples of evoked, unused, and brand-new 11 Quoting Prince (1992, 305): &amp;quot;Inferrables are like Hearer-new (and, therefore, Discourse-new) entities in that the hearer is not expected to already have in his/her head the entity in question.&amp;quot; 12 For examples of brand-new proper names and how they are introduced, see, for example, the beginning of articles in the &amp;quot;obituaries&amp;quot; section of the New York Times.</Paragraph>
      <Paragraph position="5">  Computational Linguistics Volume 25, Number 3 discourse entities, though in naturally occurring texts these phenomena rarely show up unadulterated. 13 The remaining categories will be explained subsequently.</Paragraph>
      <Paragraph position="6"> Example 3 a. He lived his final nine years in one of \[two rent-subsidized buildings\]BN constructed especially for elderly survivors.</Paragraph>
      <Paragraph position="7"> b. When the \[buildings\]E opened - one in 1964, one in 1970 - there were waiting lists.</Paragraph>
      <Paragraph position="8"> c. Once, \[they\]E held 333 survivors.</Paragraph>
      <Paragraph position="9"> In example (3a), buildings is introduced as a discourse-new discourse entity, which is brand-new (BN). In (3b), the definite NP the buildings cospecifies the discourse entity from (3a). Hence, buildings in (3b) is evoked (E), just as is they in (3c).</Paragraph>
      <Paragraph position="10"> Certain proper names are assumed to be known by any hearer. Therefore, these proper names need no further explanation. Winnie Madikizela Mandela in example (4) is unused (U), i.e., it is discourse-new but hearer-old. Other proper names have to be introduced because they are discourse-new and hearer-new. In example (5), Marianne Kador is introduced by means of a lengthy appositive that relates the brand-new proper name to the knowledge of the hearer. In particular, the noun phrase the apartment buildings is discourse-old (see example (3)).</Paragraph>
      <Paragraph position="11"> Example 4 \[A defiant Winnie Madikizela Mandela\]u testified for more than 10 hours today, dismissing all evidence that ...</Paragraph>
      <Paragraph position="12"> Example 5 &amp;quot;He was an undervalued person all his life,&amp;quot; said Marianne Kador, a social worker for Selfhelp Community Services, which operates the apartment buildings in Queens.</Paragraph>
      <Paragraph position="13"> In Table 8, we define various sets, which are used for the specification of the Cf ranking criteria in Table 9. We distinguish between two different sets of discourse entities, hearer-old discourse entities (OLD) and hearer-new discourse entities (NEW). For any two discourse entities (x, posx) and (y, posy), with x and y denoting the linguistic surface expression of those entities as they occur in the discourse, and pOSx and posy indicating their respective text position, pOSx ~ posy, in Table 9 we define the basic ordering constraints on elements in the forward-looking centers Cf(Ui). For any utterance Ui, the ordering of discourse entities in the Cf(Ui) that can be derived from the above definitions and the ordering constraints (1) to (3) are denoted by the relation II ..~ H.</Paragraph>
      <Paragraph position="14"> Ordering constraint (1) characterizes the basic relation for the overall ranking of the elements in the Cf. Accordingly, any hearer-old expression in utterance Ui is given the highest preference as a potential antecedent for an anaphoric expression in Ui+l. Any 13 Examples (3) and (5)-(8) are from the New York Times, Dec. 11, 1997. (&amp;quot;Remembering one who remembered. Eugen Zuckermann, survivor, kept the ghosts of the holocaust alive,&amp;quot; by Barry Bearak.) Example (4) is from the New York Times, Dec. 1, 1997. (&amp;quot;Winnie Mandela is defiant, calling accusations 'lunacy',&amp;quot; by Suzanne Daley.) We split complex sentences into the units specified by Kameyama (1998) following the categorization in Figure 1.  Strube and Hahn Functional Centering Table 8 Sets of discourse entities for the basic Cf ranking.</Paragraph>
      <Paragraph position="15"> DE : the set of discourse entities in Ui</Paragraph>
      <Paragraph position="17"> Basic functional ranking constraints on the Cf list.</Paragraph>
      <Paragraph position="18">  1. If x E OLD and y E NEW, then x-~ y. 2. If x, y E OLD or x, y E NEW, then x -~ y, if posx &lt; posy 3. If (1) or (2) do not apply, then x and y are unordered with respect to the Cf-ranking.  hearer-new expression is ranked below hearer-old expressions. Ordering constraint (2) captures the ordering for the sets OLD or NEW when they contain elements of the same type. In this case, the elements of each set are ranked according to their text position.</Paragraph>
      <Paragraph position="19">  with a high proportion of pronouns and nominal anaphora (e.g., literary texts, newspaper articles about persons), it is necessary to refine the ranking criteria in order to deal with expository texts, e.g., test reports, discharge summaries. These texts usually contain few pronouns and are characterized by a large number of inferrables, which are often the major glue in achieving local coherence. In order to accommodate the centering model to texts from these genres, we distinguish a third set of expressions; mediated discourse entities in Ui (MED). On Prince's (1981) familiarity scale, the set of hearer-old discourse entities (OLD) remains the same as before, i.e., it consists of evoked (E) and unused (U) discourse entities, while the set of hearer-new discourse entities (NEW) now consists only of brand-new (BN) discourse entities. Inferable (I), containing inferable (IC), and anchored brand-new (BN A) discourse entities, which make up the set of mediated discourse entities, have a status between hearer-old and hearer-new discourse entities. 14 See Figure 3 for Prince's familiarity scale and its relation to the three sets. Again, the elements of this set are indistinguishable with respect to their information status--for instance, inferable and anchored brand-new discourse entities have the same information status because they belong to the set of mediated discourse entities. Hence, the extended Cf ranking, depicted in Figure 3, will prefer OLD discourse entities over MEDiated ones, and MEDiated ones will be preferred over NEW ones.</Paragraph>
      <Paragraph position="20"> We assume that the difference between containing inferables and anchored brand-new discourse entities is negligible. (It was not well defined in Prince \[1981\] and in 14 Again, quoting Prince (1992, 305-306): &amp;quot;Inferrables are thus like Hearer-old entities in that they rely on certain assumptions about what the hearer does know, e.g. that buildings typically have doors \[...\], and they are like Discourse-old entities in that they rely on there being already in the discourse-model some entity to trigger the inference \[...\].&amp;quot;</Paragraph>
    </Section>
    <Section position="4" start_page="323" end_page="323" type="sub_section">
      <Paragraph position="0"> Figure 3 Information status and familiarity (refined version). Volume 25, Number 3 Prince \[1992\] she abandoned the second term.) Therefore, we conflate them into the category of anchored brand-new discourse entities. These discourse entities require that the anchor modifies a brand-new head and that the anchor is either an evoked or an unused discourse entity. In the following, we give examples of inferrables and anchored brand-new discourse entities.</Paragraph>
      <Paragraph position="1"> Example 6 a. By his teen-age years, the distorted mentality of anti-Semitism was in full warp.</Paragraph>
      <Paragraph position="2"> b. \[Thefamily\]i was expelled to Hungary in 1939 ...</Paragraph>
      <Paragraph position="3"> In example 6 the relation between the definite NP the family and the context has to be inferred, therefore the family belongs to the category inferable (I). It is marked by definiteness but it is not anaphoric since there is no anaphoric antecedent. Though inferables are often marked by definiteness, it is possible that they are indefinite, like an uncle in example (7b).</Paragraph>
    </Section>
    <Section position="5" start_page="323" end_page="324" type="sub_section">
      <SectionTitle>
Example 7
</SectionTitle>
      <Paragraph position="0"> a. He shared this bounty with his father b. but \[a sickly uncle\]1 was left to remain hungry. Anchored brand-new (BN A) discourse entities as in example (8) are heads of phrases whose modifiers relate (anchor) them to the context.</Paragraph>
      <Paragraph position="1"> Example 8 a. He had already lost too many companions.</Paragraph>
      <Paragraph position="2"> b. \[\[HiS\]E fianc~e\]BNA had died in a car wreck.</Paragraph>
      <Paragraph position="3"> With respect to inferables, there exist only a few computational treatments, all of which are limited in scope. We here restrict inferables to the particular subset defined by Hahn, Markert, and Strube (1996), which we call functional anaphora (FA). In the following, we will limit our discussion of inferables to those which figure as functional anaphors. In Table 10, we define the sets needed for the specification of the extended Cf ranking criteria in Table 11. We distinguish between three different sets of discourse entities; hearer-old discourse entities (OLD), mediated discourse entities (MED), and hearer-new discourse entities (NEW). Note that the antecedent of a functional anaphor (the inferred discourse entity) is included in the set of hearer-old discourse entities.</Paragraph>
      <Paragraph position="4">  the set of evoked discourse entities in Ui the set of unused discourse entities in Ui the set of antecedents of functional anaphors in Ui the set of functional anaphors in Ui the set of anchored brand-new discourse entities in Ui</Paragraph>
      <Paragraph position="6"> 2. If x, y C OLD, or x, y C MED, or x, y C NEW, then x -&lt; y, if pOSx &lt; posy 3. If (1) or (2) do not apply, then x and y are unordered with respect to the Cf-ranking.</Paragraph>
      <Paragraph position="7">  For any two discourse entities (x, pOSx) and (y, po@), with x and y denoting the linguistic surface expression of those entities as they occur in the discourse, and pOSx and posy indicating their respective text position, pOSx =fi posy, in Table 11 we define the extended functional ordering constraints on elements in the forward-looking centers Cf(Ui). In the following, for any utterance Ui, the ordering of discourse entities in the Cf(Ui) that can be derived from the above definitions and the ordering constraints (1) to (3) are denoted by the relation &amp;quot;-&lt;&amp;quot;.</Paragraph>
      <Paragraph position="8"> Ordering constraint (1) characterizes the basic relation for the overall ranking of the elements in the Cf. Accordingly, any hearer-old expression in utterance Ui is given the highest preference as a potential antecedent for an anaphoric or functional anaphoric expression in Ui+l. Any mediated expression is ranked just below hearer-old expressions. Any hearer-new expression is ranked lowest. Ordering constraint (2) fixes the ordering when the sets OLD, MED, or NEW contain elements of the same type. In these cases, the elements of each set are ranked according to their text position. In Table 12 we show the analysis of text fragment (2) using the basic algorithm see Table 3) with the basic functional Cf ranking constraints (see Table 9). The fragment starts with the evoked discourse entity SENTRY in (2a) (the definiteness of the NP indicates that it was already mentioned earlier in the text). The pronouns he in (2b) and (2c) are evoked, while signs and tunic are brand-new. We assume Mike in (2d) to be evoked, too (MIKE is the main character of that story). MIKE is the leftmost evoked discourse entity in (2d), hence ranked highest in the Cf(2d) and the most preferred antecedent for the pronoun he in (2e).</Paragraph>
    </Section>
  </Section>
  <Section position="8" start_page="324" end_page="335" type="metho">
    <SectionTitle>
5. Evaluation
</SectionTitle>
    <Paragraph position="0"> In this section, we discuss two evaluation experiments on naturally occurring data.</Paragraph>
    <Paragraph position="1"> We first compare the success rate of the functional centering algorithm with that of the BFP algorithm. This evaluation uses the basic Cf ranking constraints from Table 9.</Paragraph>
    <Paragraph position="2">  Cf: \[MIKEE: Mike, TUNICE: this, it, SENTRYE: him\] He tied and gagged the man,...</Paragraph>
    <Paragraph position="3"> Cb: MIKEE: he Cf: \[MIKEE: he, SENTRY:E the man\] We then introduce a new cost-based evaluation method, which we use for comparing the extended Cf ranking constraints from Table 11 with several other approaches.</Paragraph>
    <Section position="1" start_page="325" end_page="329" type="sub_section">
      <SectionTitle>
5.1 Success Rate Evaluation
</SectionTitle>
      <Paragraph position="0"> gorithm from Table 3 operating with the basic functional Cf ranking constraints from Table 9) with the BFP algorithm, we analyzed a sample of English and German texts.</Paragraph>
      <Paragraph position="1"> The test set (Table 13) consisted of the beginnings of three short stories by Ernest Hemingway, 15 three articles from the New York Times (NYT), 16 the first three chapters of a novel by Uwe JohnsonS the first two chapters of a short story by Heiner Mfiller, TM and seven articles from the Frankfurter Allgemeine Zeitung (FAZ). 19 15 Hemingway, Ernest. 1987. The Complete Short Stories of Ernest Hemingway. Scribner, New York. (&amp;quot;An African story,&amp;quot; pages 545-554; &amp;quot;Soldier's home,&amp;quot; pages 111-116; &amp;quot;Up in Michigan,&amp;quot; pages 59~62.) 16 (i) New York Times, Dec. 7, 1997. (&amp;quot;Shot in head, suspect goes free, then to college,&amp;quot; by Jane Fritsch, pages A45-48.) (ii) New York Times, Dec. 1, 1997. (&amp;quot;Winnie Mandela is defiant, calling accusations 'lunacy',&amp;quot; by Suzanne Daley, pages A1-12.) (iii) New York Times, Dec. 11, 1997. (&amp;quot;Remembering one who remembered. Eugen Zuckermann, survivor, kept the ghosts of the holocaust alive,&amp;quot; by Barry Bearak, pages B1-8.) 17 Johnson, Uwe. 1965. Zwei Ansichten. Suhrkamp Verlag, Frankfurt am Main.</Paragraph>
      <Paragraph position="2"> 18 Müller, Heiner. 1974. Geschichten aus der Produktion 2. Rotbuch Verlag, Berlin. ("Liebesgeschichte," pages 57-62.) 19 (i) FAZ, Aug. 28, 1997. ("Die gute Nachricht ist: Wir können gewinnen. New Yorks früherer Polizeipräsident in Berlin," by Konrad Schuller.) (ii) FAZ, Nov. 3, 1997. ("Bürgermeister Giuliani steht vor einer fast sicheren Wiederwahl," by Verena Leucken.) (iii) FAZ, Sept. 9, 1997. ("Wir haben viel voneinander lernen können," by Claus Peter Müller.) (iv) FAZ, Sept. 10, 1997. ("Die Mutter der Meinungsforschung im Streit. Ist Elisabeth Noelle-Neumann eine unverbesserliche Deutsche?" by Kurt Reumann.) (v) FAZ, Aug. 4, 1997. ("Der zarte Riese. Geisterhaftes Klanglicht und ein Zug ins Weite: Zum Tode von Swjatoslaw Richter," by Gerhard R. Koch.) (vi) FAZ, Sept. 2, 1997. ("Glaubwürdiger als der Königssohn. Der Oppositionspolitiker Sam Rainsy kämpft für das bessere Kambodscha," by Erhard Haubold.) (vii) FAZ, Sept. 3, 1997. ("Bald das Ende des Vorsitzenden Wagner? Wechsel an der Spitze der CDU-Fraktion," by Peter Jochen Winters.)</Paragraph>
      <Paragraph position="3"> The evaluation was carried out manually by the authors, supported by a small-scale discourse annotation tool. We used the following guidelines for our evaluation: We did not assume any world knowledge as part of the anaphora resolution process. Only agreement criteria and sortal constraints were applied. We did not account for false positives and error chains, but marked the latter (see Walker 1989). We use Kameyama's (1998) specifications for dealing with complex sentences (for a description, see Section 3). Following Walker (1989), a discourse segment is defined as a paragraph unless its first sentence has a pronoun in subject position or a pronoun whose syntactic features do not match the syntactic features of any of the preceding sentence-internal noun phrases. Also, at the beginning of a segment, anaphora resolution is preferentially performed within the same utterance. According to the preference for intersentential candidates in the original centering model, we defined the following anaphora resolution strategy (which is not the best solution for the anaphora resolution problem either, but is sufficient for the purposes of the evaluation):</Paragraph>
      <Paragraph position="5"> 1. Test elements of Cf(Ui-1)--according to the BFP algorithm, or the functional centering (henceforth abbreviated as FunC) algorithm.</Paragraph>
      <Paragraph position="6"> 2. Test elements of Ui, which precede the pronoun, left-to-right.</Paragraph>
      <Paragraph position="7"> 3. Test elements of Cf(Ui-2), Cf(Ui-3), ... in the given order.</Paragraph>
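      <Paragraph> As a sketch, the three test steps above translate into one ordered search; the arguments cf_history (the lists Cf(U_i-1), Cf(U_i-2), ..., most recent first) and intrasentential (the entities of U_i preceding the pronoun) are assumptions of this illustration.
def find_antecedent(pronoun, cf_history, intrasentential, compatible):
    """Search order used in the evaluation."""
    # Step 1: elements of Cf(U_i-1), in the given order.
    # Step 2: elements of U_i preceding the pronoun, left-to-right.
    # Step 3: elements of Cf(U_i-2), Cf(U_i-3), ..., in the given order.
    for candidates in cf_history[:1] + [list(intrasentential)] + cf_history[1:]:
        for entity in candidates:
            if compatible(pronoun, entity):
                return entity
    return None
      </Paragraph>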
      <Paragraph position="8"> Since clauses are short in general, step 2 of the algorithm only rarely applies. 5.1.3 Results. The results of our evaluation are given in Table 14. The first row gives the number of third person pronouns and possessive pronouns in the data. The upper part of the table shows the results for the BFP algorithm, the lower part those for the FunC algorithm. Overall, the data are consistently in favor of the FunC algorithm, though no significance judgments can be made (the data were not drawn as a random sample). The overall error rate of each approach is given in the rows labeled "wrong". We also tried to determine the major sources of errors (see the nonbold sections in Table 14), and were able to distinguish three different types. One class of errors relates to the algorithm's strategy. In the case of the BFP algorithm, the corresponding row also contains the number of ambiguous cases generated by this algorithm (we counted ambiguities as errors, since FunC produced only one reading in these cases). A second class of errors results from error chains, mainly caused by the strategy of each approach or by ambiguities in the BFP algorithm. A third error class is caused by the intersentential specifications, e.g., the correct antecedent is not accessible because it is realized in an embedded clause (reported speech). Finally, other errors were mainly caused by split antecedents (plural pronouns referring to a couple of antecedents in the singular), reference to events (or propositions), and cataphora.</Paragraph>
      <Paragraph position="9"> While the number of errors caused by the intersentential specifications for complex sentences and by other reasons is almost identical (the small difference can be explained by false positives), there is a remarkable difference between the algorithms with respect to strategic errors and error chains. Strategic errors occur whenever the preference given by the algorithm under consideration leads to an error. Most of the strategic errors implied by the FunC algorithm also show up as errors for the BFP algorithm. We interpret this finding as an indication that these errors are caused by a lack of semantic or world knowledge. The remaining errors of the BFP algorithm are caused by the strictly local definition of its criteria and because the BFP algorithm cannot deal with some particular configurations, leading to ambiguities. The FunC algorithm has fewer error chains not only because it yields fewer strategic errors, but also because it is more robust with respect to real texts. An utterance Ui, for instance, which intervenes between Ui-1 and Ui+1 without any relation to Ui-1 does not affect the preference decisions in Ui+2 for FunC, although it does affect them for the BFP algorithm, since the latter cannot assign the Cb(Ui+1). Also, error chains are sometimes shorter in the FunC analyses.</Paragraph>
      <Paragraph position="10"> Example (9) illustrates how the local restrictions as defined by the original centering model and the BFP algorithm result in errors and lead to rather lengthy error chains (see Table 15 for the corresponding centering analysis). The discourse entity SENTENCE, which is cospecified by the pronoun er, 'it'masc, in (9b), is the Cb(9b). Therefore, it is the most preferred antecedent for the pronoun ihn in (9c), which causes a strategic error. This error, in turn, is the reason for a consequent error in (9d), because there are no semantic cues that enforce the correct interpretation, i.e., the coreferentiality between ihn and Giuliani. The possible interruption of the error chain, indicated by the alternative interpretation in (9c), is ruled out, however, by the preference for  BFP results for example (9).</Paragraph>
      <Paragraph position="11"> (9a) Cb: -Cf: \[SENTENCE: Satz, dem, der, der, RUTH: Ruth Messinger, ihr, DEBATES: Fernsehdebatten, RACE: Biirgermeisterwahlkampf, NEW YORK: New York, RECOLLECTION: Erinnerung\] (9b) Cb: SENTENCE: er CONTINUE Cf: \[SENTENCE: er, VICTORY: Wahlsieg, GIULIANI: Rudolph Giuliani\] (9c) Cb: SENTENCE: ihn RETAIN Cf: \[NEWSPAPERS: Zeitungen, SENTENCE: ihn, NEW YORK: Stadt\] Cb ............... : ~ ut~. ~,~,~ ~OUGH-SH/FT Cf r~T ........... -7_:~ ........ ,-~ ......... :1.._ ~,T .... ~r .... O,- JL1 (9d) Cb: RETAIN Cf: SENTENCE: ihm \[UNIONS: Gewerkschaften, SENTENCE: ihm\] \[The sentence\]smuabSjCect, with which Ruth Messinger - one of the TV debates - opened, - will - the only one - be, - which - of her - in memory - null remains.</Paragraph>
      <Paragraph position="12"> The sentence, with which Ruth Messinger opened one of the TV debates, will be the only one, which will be recollected of her.</Paragraph>
      <Paragraph position="13"> b. Am nahezu sicheren Wahlsieg des Amtsinhabers Rudolph Giuliani am Dienstag wird er nichts ~indern.</Paragraph>
      <Paragraph position="14"> \[Of the almost certain - victory in the election - of \[the officeholder Rudolph Giuliani\]masc\]a'd~SCct - on Tuesday - will - \[it\]smu~Cct - nothing alter. null Of the officeholder Rudolph Giuliani's almost certain victory in the election on Tuesday, it will alter nothing.</Paragraph>
      <Paragraph position="15"> c. Alle Zeitungen der Stadt unterstiitzen ihn.</Paragraph>
      <Paragraph position="16"> \[All - newspapers of the city\]subject - support - \[him\]dmiraeScCt_object . He is supported by all newspapers of the city.</Paragraph>
      <Paragraph position="17"> d. Die Gewerkschaften stehen hinter ihm.</Paragraph>
      <Paragraph position="18"> \[The unions\]subject - stand behind - \[him\]imnadSiCrect_object .</Paragraph>
      <Paragraph position="19"> He is backed up by the unions.</Paragraph>
      <Paragraph position="20"> The nonlocal definition of hearer-old discourse entities enables the FunC algorithm to compute the correct antecedent for the pronoun ihn in (9c) preventing it from running into an error chain (see Table 16 for the functional centering data). GIULIANI, who was mentioned earlier in the text, is the leftmost evoked discourse entity in (9b) and therefore the most preferred antecedent for the pronoun in (9c), though there is a pronoun of the same gender in (9b).</Paragraph>
      <Paragraph position="21"> We encountered problems with Kameyama's (1998) specifications for complex sentences. The differences between clauses that are accessible from a higher syntactic level and clauses that are not could not be verified by our analyses. Also, her approach is sometimes too coarse-grained (i.e., there are still antecedents within one utterance), and sometimes too fine-grained. 2deg 20 An alternative to Kameyama's intrasentential centering, which overcomes these problems and leads to</Paragraph>
    </Section>
    <Section position="2" start_page="329" end_page="331" type="sub_section">
      <SectionTitle>
5.2 Cost-based Evaluation
</SectionTitle>
      <Paragraph position="0"> ferent text genres: 15 product reviews from the information technology (IT) domain, one article from the German news magazine Der Spiegel, and the first two chapters of a short story by the German writer Heiner Mtillerf I Table 17 summarizes the total number of (pro)nominal anaphors, functional anaphors, utterances and words in the test set.</Paragraph>
      <Paragraph position="1">  pared three approaches to the ranking of the Cf: a model whose ordering principles are based on grammatical role indicators only (see Table 1); an &amp;quot;intermediate&amp;quot; model, which can be considered a &amp;quot;naive&amp;quot; approach to free-word-order languages; and the functional model based on the information structure constraints stated in Table 11. For reasons discussed below, slightly modified versions of the naive and the grammatical approaches will also be considered. They are characterized by the additional constraint that antecedents of functional anaphors are ranked higher than the functional anaphors themselves. As in Section 5.1, the evaluation was carried out manually by the authors. Since most of the anaphors in these texts are nominal anaphors, the resolution of which is much more restricted than that of pronominal anaphors, the success rate for the whole anaphora resolution process is not distinctive enough for a proper evaluation of the functional constraints. The reason for this lies in the fact that nominal anaphors are far more constrained by conceptual criteria than pronominal ones. Thus, the chance of properly resolving a nominal anaphor, even when ranked at a lower position in the center lists, is greater than for pronominal anaphors. By shifting our evaluation criteria away from resolution success data to structural conditions reflecting the proper ordering of center lists (in particular, we focus on the most highly ranked item of the forward-looking centers), these criteria are intended to compensate for the a significant improvement in the results, is proposed in Strube (1998).</Paragraph>
      <Paragraph position="2"> 21 Mtiller, Heiner. 1974. Geschichten aus der Produktion 2. Rotbuch Verlag, Berlin. (&amp;quot;Liebesgeschichte,&amp;quot; pages 57~2.)  of centering transitions between the utterances in the three test sets. The first column contains those generated by the naive approach (such a proposal was made by Gordon, Grosz, and Gilliom \[1993\] as well as by Rambow \[1993\], who, nevertheless, restricts it to the German middlefield). We simply ranked the elements of C/according to their text position. While it is usually assumed that the functional anaphor (FA) is ranked above its antecedent (FA ante) (Grosz, Joshi, and Weinstein 1995, 217), we assume the opposite. The second column contains the results of this modification with respect to the naive approach. In the third column of Table 18, we give the numbers of transitions generated by the grammatical constraints (Table 1) stated by Grosz, Joshi, and Weinstein (1995, 214, 217). The fourth column supplies the results of the same modification as was used for the naive approach, namely, antecedents of functional anaphors are ranked higher than the corresponding anaphoric expressions. The fifth column shows the results generated by the functional constraints from Table 11.</Paragraph>
      <Paragraph position="3">  a preference order among transition types--CONTINUE ranks above RETAIN and RETAIN ranks above SHIFT. This preference order reflects the presumed inference load put on the hearer to coherently decode a discourse. Since the functional approach generates more CONTINUE transitions (see Table 18), we interpret this as preliminary evidence that this approach provides for a more efficient processing than its competitors. In particular, the observation of a predominance of CONTINUEs holds irrespective of the various text genres we considered for functional centering and, to a lesser degree, for the modified grammatical ranking constraints.</Paragraph>
    </Section>
    <Section position="3" start_page="331" end_page="331" type="sub_section">
      <SectionTitle>
5.2.5 Method (Costs of Transition Types)
</SectionTitle>
      <Paragraph position="0"> not seem to be entirely convincing. Counting single occurrences of transition types, in general, does not reveal the entire validity of the center lists. Considering adjacent transition pairs as an indicator of validity should give a more reliable picture, since depending on the text genre considered (e.g., technical vs. news magazine vs.</Paragraph>
      <Paragraph position="1"> literary texts), certain sequences of transition types may be entirely plausible though they include transitions which, when viewed in isolation, seem to imply considerable inferencing load (Table 18). For instance, a CONTINUE transition that follows a CONTINUE transition is a sequence that requires the lowest processing costs. But a CONTINUE transition that follows a RETAIN transition implies higher processing costs than a SMOOTH-SHIFT transition following a RETAIN transition. This is due to the fact that a RETAIN transition ideally predicts a SMOOTH-SHIFT in the following utterance.</Paragraph>
      <Paragraph position="2"> Hence, we claim that no one particular centering transition should be preferred over another. Instead, we advocate the idea that certain centering transition pairs are to be preferred over others. Following this line of argumentation, we propose here to classify all occurrences of centering transition pairs with respect to the &amp;quot;costs&amp;quot; they imply. The cost-based evaluation of different Cf orderings refers to evaluation criteria that form an intrinsic part of the centering model.</Paragraph>
      <Paragraph position="3"> Transition pairs hold for three immediately successive utterances. We distinguish between two types of transition pairs, cheap ones and expensive ones.</Paragraph>
      <Paragraph position="4"> * A transition pair is cheap if the backward-looking center of the current utterance is correctly predicted by the preferred center of the immediately preceding utterance, i.e., Cb(Ui) = Cp(Ui_l).</Paragraph>
      <Paragraph position="5"> * A transition pair is expensive if the backward-looking center of the current utterance is not correctly predicted by the preferred center of the immediately preceding utterance, i.e., Cb(Ui) # G(Ui_I).</Paragraph>
      <Paragraph position="6"> In particular, chains of the RETAIN transition in passages where the Cb does not change (passages with constant theme) show that the grammatical ordering constraints for the forward-looking centers are not appropriate.</Paragraph>
      <Paragraph position="7">  generated by the different approaches are shown in Table 19. In general, the functional approach reveals the best results, while the naive and the grammatical approaches work reasonably well for the literary text, but exhibit a remarkably poorer performance for the texts from the IT domain and, to a lesser degree, from the news magazine. The results for the latter approaches improve only slightly with the modification of ranking the antecedent of an functional anaphor (FA ante) above the functional anaphor itself (FA). In any case, they do not compare to the results of the functional approach.</Paragraph>
    </Section>
    <Section position="4" start_page="331" end_page="332" type="sub_section">
      <SectionTitle>
5.3 Extension of the Centering Transitions
</SectionTitle>
      <Paragraph position="0"> Our use of the centering transitions led us to the conclusion that CONTINUE and SMOOTH-SHIFT are not completely specified by Grosz, Joshi, and Weinstein (1995) and Brennan, Friedman, and Pollard (1987). According to Brennan, Friedman, and Pollard's definition, it is possible that a transition is labeled SMOOTH-SHIFT even if Cp(Ui) Cp(Ui-1). Such a SHIFT is less smooth, because it contradicts the intuition that a SMOOTH-SHIFT fulfills what a RETAIN predicted. The same applies to a CONTINUE with this characteristic. Hence, we propose to extend the set of transitions as shown in Ta-</Paragraph>
      <Paragraph position="2"> Costs for transition pairs.</Paragraph>
      <Paragraph position="3"> CONT. EXP-CONT. RET. SMOOTH-S. EXP-SMOOTH-S. ROUGH-S. - cheap - exp. - - -CONT. cheap - cheap exp. - exp.</Paragraph>
      <Paragraph position="4"> EXP-CONT. exp. - exp. exp. - exp.</Paragraph>
      <Paragraph position="5"> RET. exp. exp. exp. cheap exp. exp.</Paragraph>
      <Paragraph position="6"> SMOOTH-S. cheap exp. exp. exp. exp. exp. EXP-SMOOTH-S. exp. exp. exp. exp. exp. exp. ROUGH-S. exp. exp. exp. cheap exp. exp. ble 20. The definitions of CONTINUE and SMOOTH-SHIFT are extended by the condition that Cp(Ui) = Cp(Ui-1), while EXP-CONTINUE and EXP-SMOOTH-SHIFT (expensive CONTINUE and expensive SMOOTH-SHIFT) require the opposite. RETAIN and ROUGH-SHIFT fulfill Cp(Ui) =fi Cp(Ui-1) without further extensions. Table 21 contains a complete overview of the transition pairs. Only those whose second transition fulfills the criterion Cp(Ui) = Cp(Ui-1) are labeled as &amp;quot;cheap.&amp;quot;</Paragraph>
    </Section>
    <Section position="5" start_page="332" end_page="333" type="sub_section">
      <SectionTitle>
5.4 Redefinition of Rule 2
</SectionTitle>
      <Paragraph position="0"> Grosz, Joshi, and Weinstein (1995) define Rule 2 of the centering model on the basis of sequences of transitions. Sequences of CONTINUE transitions are preferred over  Computational Linguistics Volume 25, Number 3 sequences of RETAIN transitions, which are preferred over sequences of SHIFT transitions. Brennan, Friedman, and Pollard (1987) utilize this rule for anaphora resolution but restrict it to single transitions. Based on the preceding discussion of cheap and expensive transition pairs, we propose to redefine Rule 2 in terms of the costs of transition types. 22 Rule 2 then reads as follows: Rule 2&amp;quot; Cheap transition pairs are preferred over expensive ones.</Paragraph>
      <Paragraph position="1"> We believe that this definition of Rule 2 allows for a far better assessment of referential coherence in discourse than a definition in terms of sequences of transitions. For anaphora resolution, we interpret Rule 2&amp;quot; such that the preference for antecedents of anaphors in Ui can be derived directly from the Cf(Ui-1). The higher a discourse entity is ranked in the Cf, the more likely it is the antecedent of a pronoun. We see the redefinition of Rule 2 as the theoretical basis for a centering algorithm for pronoun resolution that simply uses the Cf as a preference ranking device like the basic centering algorithm shown in Table 3. In this algorithm, the metaphor of costs translates into the number of elements of the Cf that have to be tested until the correct antecedent is found. If the Cp of the previous utterance is the correct one, then the costs are indeed very low.</Paragraph>
    </Section>
    <Section position="6" start_page="333" end_page="335" type="sub_section">
      <SectionTitle>
5.5 Does Functional Centering Provide a More Satisfactory Explanation of the Data?
</SectionTitle>
      <Paragraph position="0"> We were also interested in finding out whether the functional criteria we propose might explain the linguistic data in a more satisfactory way than the grammaticalrole-based criteria discussed so far. So, we screened sample data from the literature, which were already annotated by centering analyses (for English, we considered all examples discussed in Grosz, Joshi, and Weinstein \[1995\] and Brennan, Friedman, and Pollard \[1987\]). We achieved consistent results for the grammatical and the functional approach for all the examples contained in Grosz, Joshi, and Weinstein (1995) but found diverging analyses for some examples discussed by Brennan, Friedman, and Pollard (1987). While the RETAIN-SHIFT combination in examples (10c) and (10d') (slightly modified from Brennan, Friedman, and Pollard \[1987, 157\]) did not indicate a difference between the approaches, for the RETAIN-CONTINUE combination in examples (10c) and (10d), the two approaches led to different results (see Table 22 for the BFP algorithm and Table 23 for the FunC algorithm).</Paragraph>
      <Paragraph position="1"> Example 10 a. Brennan drives an Alfa Romeo.</Paragraph>
      <Paragraph position="2"> b. She drives too fast.</Paragraph>
      <Paragraph position="3"> c. Friedman races her on weekends.</Paragraph>
      <Paragraph position="4"> d. She often wins.</Paragraph>
      <Paragraph position="5"> d'. She often beats her.</Paragraph>
      <Paragraph position="6"> Within the functional approach, the proper name Friedman is unused and, therefore, the leftmost hearer-old discourse entity of (10c). Hence, FRIEDMAN is the most preferred antecedent for the pronoun she in (10d) and (10d').</Paragraph>
      <Paragraph position="7"> 22 See Di Eugenio (1998) for a discussion regarding certain pairs of transitions and their relation to zero vs. strong pronouns.</Paragraph>
      <Paragraph position="8">  BFP interpretation for example (10)--The &amp;quot;Friedman&amp;quot; scenario. (10a) Cb: -Cf: \[BRENNAN: Brennan, ALFA ROMEO: Alfa Romeo\] (10b) Cb: \[BRENNAN: she\] CONTINUE Cf: \[BRENNAN: she\] (10c) Cb: \[BRENNAN: her\] RETAIN Cf: \[FRIEDMAN: Friedman, BRENNAN: her\] (10d) Cb: \[BRENNAN: she\] CONTINUE Cf: \[BRENNAN: she\] Cb: \[FRIEDMAN: slie\] SMOOTH-SHIFT Cf: \[FaiE~l&amp;quot;~ia~. she\] (10d') Cb: \[FRIEDMAN: she\] SMOOTH-SHIFT Cf: \[FRIEDMAN: she, BRENNAN: her\] Cb: \[FaIE~ZvlaN: her\]  \[BRENNANu: Brennan, ALFA ROMEOBN: Alfa Romeo\] \[BRENNANE: she\] \[FRIEDMANu: Friedman, BRENNANE: her\] \[FRIEDMANE : she\] \[FRIEDMANE: she, BRENNANE: her\] But is subjecthood really the decisive factor? When we replace Friedman with a hearer-new discourse entity, e.g., a professional driver, as in (10c#), 23 then the procedures generate inconsistent results, again. In the BFP algorithm, the ranking of the Cf list depends only on grammatical roles. Hence, DRIVER is ranked higher than BRENNAN in the Cf(lOc'). In (10d), the pronoun she is resolved to BRENNAN because of the preference for CONTINUE over SMOOTH-SHIFT. In (10d~), she is resolved to DRIVER because SMOOTH-SHIFT is preferred over ROUGH-SHIFT (see Table 24).</Paragraph>
      <Paragraph position="9"> 10cq A professional driver races her on weekends.</Paragraph>
      <Paragraph position="10"> Within the functional approach, the evoked phrase her in (10c ~) is ranked higher than the brand-new phrase a professional driver. Therefore, the preference changes between example (10c) and (10c'). In (10d) and (10d') the pronoun she is resolved to BRENNAN, the discourse entity denoted by her (see Table 25).</Paragraph>
      <Paragraph position="11"> We find the analyses of functional centering to match our intuitions about the underlying referential relations more closely than those that are computed by grammatically based centering approaches. Hence, in the light of this still preliminary evidence, we answer the question we posed at the beginning of this subsection in the affirmative--functional centering indeed explains the data in a more satisfying manner than other well-known centering principles.</Paragraph>
      <Paragraph position="12"> 23 We owe this variant to Andrew Kehler. This example may misdirect readers because the phrase a professional driver might be assigned the &amp;quot;default&amp;quot; gender masculine. Anyway, this example--like the original example--seems not to be felicitous English and has only illustrative character.  BFP interpretation for Example (10)--The &amp;quot;driver&amp;quot; scenario. (10a) Cb: -Cf: \[BRENNAN: Brennan, ALFA ROMEO: Alfa Romeo\] (10b) Cb: \[BRENNAN: she\] Cf: \[BRENNAN: she\] CONTINUE (10c') CD: \[BRENNAN: her\] Cf: \[DRIVER: driver, BRENNAN: her\] RETAIN (10d) Cb: \[BRENNAN: she\] Cf: \[BRENNAN: she\] CONTINUE Cb: \[D F~7C/F,R. d~e\] Cf: \[D~ivF~a. she\] ........ - .....</Paragraph>
      <Paragraph position="13"> (lOd') Cb: \[DRIVER: she\] SMOOTH-SHIFT Cf: \[DRIVER: she, BRENNAN: her\] Cb: \[DKIVEF~. \]igr\] R,OUGH-SHIFT r~ ......... Cf: \[ ........... ~,,~,- .......... ,-,,~, v ~. he;\]  \[BRENNANu: Brennan, ALFA ROMEOBN: Alfa Romeo\] \[BRENNANE: she\] \[BRENNANE: her, DRIVERBN: driver\] \[BRENNANE: she\] \[BRENNANE: she, DRIVERE: her\]</Paragraph>
    </Section>
    <Section position="7" start_page="335" end_page="335" type="sub_section">
      <SectionTitle>
5.6 Summary of Evaluation
</SectionTitle>
      <Paragraph position="0"> To summarize the results of our empirical evaluation, we claim, first, that our proposal based on functional criteria leads to substantially improved and--with respect to the inference load placed on the text understander, whether human or machine--more plausible results for languages with free word order than the structural constraints given by Grosz, Joshi, and Weinstein (1995) and those underlying the naive approach.</Paragraph>
      <Paragraph position="1"> We base these observations on an evaluation study that considers transition pairs in terms of the inference load specific pairs imply. Second, we have gathered preliminary evidence, still far from conclusive, that the functional constraints on centering seem to explain linguistic data more satisfactorily than the common grammar-oriented constraints. Hence, we hypothesize that these functional constraints might constitute a general framework for treating free- and fixed-word-order languages by the same methodology. This claim, without doubt, has to be further substantiated by additional cross-linguistic empirical studies.</Paragraph>
      <Paragraph position="2"> The cost-based evaluation we focused on in this section refers to evaluation criteria that form an intrinsic part of the centering model. As a consequence, we have redefined Rule 2 of the Centering Constraints (Grosz, Joshi, and Weinstein 1995, 215) appropriately. We replaced the characterization of a preference for sequences of CONTINUE over sequences of RETAIN and, similarly, sequences of RETAIN over sequences of SHIFT by one in which cheap transitions are to be preferred over expensive ones.</Paragraph>
    </Section>
  </Section>
  <Section position="9" start_page="335" end_page="337" type="metho">
    <SectionTitle>
6. Comparison with Related Approaches
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="335" end_page="336" type="sub_section">
      <SectionTitle>
6.1 Focus-based Approaches
</SectionTitle>
      <Paragraph position="0"> Approaches to anaphora resolution based on focus devices partly use the information status of discourse entities to determine the current discourse focus. However, a  Strube and Hahn Functional Centering common area of criticism of these approaches is the diversity of data structures they require. These data structures are likely to hide the underlying linguistic regularities, because they promote the mix of preference and data structure considerations in the focusing algorithms. As an example, Sidner (1983, 292ff.) distinguishes between an Actor Focus and a Discourse Focus, as well as corresponding lists, viz. Potential Actor Focus List and Potential Discourse Focus List. Suri and McCoy (1994) in their RAFT/RAPR approach use grammatical roles for ordering the focus lists and make a distinction between Sub-ject Focus, Current Focus, and corresponding lists. Both focusing algorithms prefer an element that represents the Focus to the elements in the list when the anaphoric expression under consideration is not the agent (for Sidner) or the subject (for Suri and McCoy). Relating these approaches to our proposal, they already exhibit a weak preference for a single hearer-old (more precisely, evoked) discourse element. Dahl and Ball (1990), describing the anaphora resolution module of the PUNDIT system, improve the focusing mechanism by simplifying its underlying data structures. Thus, their proposal is more closely related to the centering model than any other focusing mechanism. Furthermore, if there is a pronoun in the sentence for which the Focus List is built, the corresponding evoked discourse entity is shifted to the front of the list. The following elements of the Focus List are ordered by grammatical roles again. Hence, their approach still relies upon grammatical information for the ordering of the centering list, while we use only the functional information structure as the guiding principle.</Paragraph>
    </Section>
    <Section position="2" start_page="336" end_page="337" type="sub_section">
      <SectionTitle>
6.2 Heuristics
</SectionTitle>
      <Paragraph position="0"> Given its embedding in a cognitive theory of inference loads imposed on the hearer and, even more importantly, its fundamental role in a more comprehensive theory of discourse understanding based on linguistic, attentional, and intentional layers, the centering model can be considered the first principled attempt to deal with preference orders for plausible antecedent selection for anaphors. Its predecessors were entirely heuristic approaches to anaphora resolution. These were concerned with various criteria--beyond strictly grammatical constraints such as agreement--for the optimization of the referent selection process based on preferential choices. An elaborate description of several of these preference criteria is supplied by Carbonell and Brown (1988) who discuss, among others, heuristics involving case role filling, semantic and pragmatic alignment, syntactic parallelism, syntactic topicalization, and intersentential recency. Given such a wealth of criteria one may either try to order them a priori in terms of importance or--as was proposed by the majority of researchers in this field-define several scoring functions that compute flexible orderings on the fly. These combine the variety of available evidence, each one usually annotated by a specific weight factor, and, finally, map the weights to a single salience score (Rich and LuperFoy 1988; Haji~ovG KuboG and Kubo~ 1992; Lappin and Leass 1994) These heuristics helped to improve the performance of discourse-understanding systems through significant reductions of the available search-space for antecedents.</Paragraph>
      <Paragraph position="1"> Their major drawback is that they require a great deal of skilled hand-crafting that, unfortunately, usually does not scale in broader application domains. Hence, proposals were made to replace these high-level &amp;quot;symbolic&amp;quot; categories by statistically interpreted occurrence patterns derived from large text corpora (Dagan and Itai 1990). Preferences then reflect patterns of statistically significant lexical usage rather than introspective abstractions of linguistic patterns such as syntactic parallelism or pragmatic alignment.</Paragraph>
      <Paragraph position="2"> Among the heuristic approaches to anaphora resolution, those which consider the identification of heuristics a machine learning (ML) problem are particularly interesting, since their heuristics dynamically adapt to the textual data. Furthermore, ML procedures operate on incomplete parses (hence, they accept noisy data), which dis- null Computational Linguistics Volume 25, Number 3 tinguishes them from the requirements of perfect information and high data fidelity imposed by almost any other anaphora resolution scheme. Connolly, Burger, and Day (1994) treat anaphora resolution as an ML classification problem and compare seven classifier approaches with the solution quality of a naive hand-crafted algorithm whose heuristics incorporate the well-known agreement and recency indicators. Aone and Bennett (1996) outline an approach where they consider more than 60 features automatically obtained from the machinery of the host natural language processing system the learner is embedded in. The features under consideration include lexical ones like categories, syntactic ones like grammatical roles, semantic ones like semantic classes, and text positional ones, e.g., the distance between anaphor and antecedent. These features are packed in feature vectors--for each pair of an anaphor and its possible antecedent--and used to train a decision tree, employing Quinlan's C4.5 algorithm (Aone and Bennett 1996), or a whole battery of alternative classifiers in which hybrid variants yield the highest scores (Connolly, Burger, and Day 1994). Though still not fully worked out, it is interesting to note that in both studies ML-derived heuristics tend to outperform those that were carefully developed by human experts (similar results are reported by Cardie \[1992\] with respect to learning resolution heuristics for relative pronouns pertaining to a case-based learning procedure). This indicates, at least, that heuristically based methods using simple combinations of features benefit from being exposed to and having to adapt to training data. ML-based mechanisms might constitute an interesting perspective for the further tuning of ordering criteria for the forward-looking centers.</Paragraph>
      <Paragraph position="3"> These mixed heuristic approaches, using multidimensional metrics for ranking antecedent candidates, diverge from the assumption that underlies the centering model that a single type of criterion--the attentional state and its representation in terms of the backward- and forward-looking centers--is crucial for referent selection. By incorporating functional considerations in terms of the information structure of utterances into the centering model we actually enrich the types of knowledge that go into centered anaphora resolution decisions, i.e., we extend the &amp;quot;dimensionality&amp;quot; of the centering model, too. But unlike the numerical scoring approaches, our combination remains at the symbolic computation level, preserves the modularity of criteria, and, in particular, is linguistically justified. Although functional centering is not a complete theory of preferential anaphora resolution, one should clearly stress the different goals behind heuristics-based systems, such as the ones just discussed, and the model of centering. Heuristic approaches combine introspectively acquired descriptive evidence and attempt to optimize reference resolution performance by proper evidence &amp;quot;engineering&amp;quot;. This is often done in an admittedly ad hoc way, requiring tricky retuning when new evidence is added (Rich and LuperFoy 1988). On the other hand, many of these systems work in a real-world environment (Rich and LuperFoy 1988; Lappin and Leass 1994; Kennedy and Boguraev 1996) in which noisy data and incomplete, sometimes even faulty, analysis results have to be accounted for. The centering model differs from these considerations in that it aims at unfolding a unified theory of discourse coherence at the linguistic, attentional, and intentional level (Grosz and Sidner 1986); hence, the search for a more principled, theory-based solution, but also the need for (almost) perfect linguistic analyses in terms of parsing and semantic interpretation.</Paragraph>
    </Section>
  </Section>
</Paper>