<?xml version="1.0" standalone="yes"?>
<Paper uid="P96-1057">
  <Title>Processing Complex Sentences in the Centering Framework</Title>
  <Section position="4" start_page="0" end_page="378" type="metho">
    <SectionTitle>
2 Constraints on Sentential Anaphora
</SectionTitle>
    <Paragraph position="0"> Our studies on German texts have revealed that the functional information structure of the sentence, considered in terms of the context-boundedness of discourse elements, is the major determinant for the ranking of the forward-looking centers (Cf(Un)) (Strube &amp; Hahn, 1996). Hence, context-bound discourse elements are generally ranked higher in the Cf than any non-anaphoric element. The functional information structure has an impact not only on the resolution of inter-sentential anaphora, but also on the resolution of intra-sentential anaphora. Hence, the most preferred antecedent of an intra-sentential anaphor is a phrase which is also anaphoric. Consider sentences (1) and (2) and the corresponding centering data in Table 1 (Cb: backward-looking center; the first element of the pairs denotes the discourse entity, the second element the surface). In sentence (1), a nominal anaphor occurs, der T3100SX (a particular notebook). In sentence (2), another nominal anaphor appears, der Rechner (the computer), which is resolved to T3100SX from the previous sentence. In the matrix clause, the pronoun er (it) co-specifies the already resolved anaphor der Rechner in the subordinate clause.</Paragraph>
    <Paragraph position="1"> (1) Ist der Resume-Modus aktiviert, schaltet sich der T3100SX selbständig ab.</Paragraph>
    <Paragraph position="2"> (If the resume mode is active, - switches - itself - the T3100SX - automatically - off.) (2) Bei späterem Einschalten des Rechners arbeitet er sofort an der alten Stelle weiter.</Paragraph>
    <Paragraph position="3"> (The - later - turning on - of the computer - it - resumes working - at exactly the same place.) Table 1: (1) Cb: T3100SX: T3100SX; Cf: [T3100SX: T3100SX]. (2) Cb: T3100SX: er; Cf: [T3100SX: er, TURN-ON: Einschalten, PLACE: Stelle]. This example illustrates our hypothesis that intra-sentential anaphors preferably co-specify context-bound discourse elements. In order to empirically strengthen this argument, we have examined several texts of different types: 15 texts from the information technology (IT) domain, one text from the German news magazine Der Spiegel, and the first chapters of a short story by the German writer Heiner Müller (footnote 2) (cf. [...] occur, 58 of them (89.2%) have an antecedent which is a resolved anaphor, while only 32 of them (49.2%) have an antecedent which is the subject of the matrix clause (cf. Table 3). These data indicate that an approach based on grammatical roles (Suri &amp; McCoy, 1994) is inappropriate for the German language, while an approach based on the functional information structure seems preferable. In addition, we maintain that exchanging grammatical with functional criteria is also a reasonable strategy for fixed word order languages. Grammatical criteria can be rephrased in terms of functional criteria, simply because grammatical roles and the information structure patterns we defined, unless marked, coincide in these languages.</Paragraph>
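    <Paragraph> The ranking preference stated at the beginning of this section can be illustrated with a minimal Python sketch. It assumes a hypothetical Element record and a hypothetical rank_cf function, neither of which appears in the paper; ordering elements of the same class by surface position is likewise an assumption made for illustration. The input reproduces the matrix-clause elements of sentence (2).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    entity: str          # discourse entity, e.g. "T3100SX"
    surface: str         # surface form, e.g. "er"
    position: int        # left-to-right index in the utterance
    context_bound: bool  # True if the element is anaphoric


def rank_cf(elements):
    """Context-bound elements first; ties broken by surface position (assumed)."""
    return sorted(elements, key=lambda e: (not e.context_bound, e.position))


# Elements of sentence (2), in surface order.
sentence_2 = [
    Element("TURN-ON", "Einschalten", 0, False),
    Element("T3100SX", "er", 1, True),
    Element("PLACE", "Stelle", 2, False),
]

cf = rank_cf(sentence_2)
print([f"{e.entity}: {e.surface}" for e in cf])
# prints ['T3100SX: er', 'TURN-ON: Einschalten', 'PLACE: Stelle']
```

    Under these assumptions the sketch yields the same order as the Cf entry for sentence (2) in Table 1.</Paragraph>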
    <Paragraph position="4"> Since the strategy described above is valid only for complex sentences which consist of a matrix clause and one or more subordinate clauses, compound sentences which consist of main clauses must also be considered. Each of these sentences is processed by our algorithm in linear order, one clause at a time with the usual centering operations. Compound sentences which consist of multiple full clauses also have multiple Cb/Cf data.</Paragraph>
    <Paragraph position="5"> Now, we are able to define the expression utterance in a satisfactory manner: An utterance U is a simple sentence, a complex sentence, or each full clause of a compound sentence (footnote 3). The Cf of an utterance is computed only with respect to the matrix clause. Given these findings, complex sentences can be processed at three stages (2a-2c; transitions from one stage to the next occur only when a suitable antecedent has not been found at the previous stage): 1. For resolving an anaphor in the first clause of Un, propose the elements of Cf(Un-1) in the given order. [Footnote 2: Liebesgeschichte. In Heiner Müller, Geschichten aus der Produktion 2, Berlin: Rotbuch Verlag, pp. 57-63.] [Footnote 3: We do not consider dialogues with elliptical utterances.]</Paragraph>
    <Paragraph position="6"> 2. For resolving an anaphor in a subsequent clause of Un, (a) propose already context-bound elements of Un from left to right (footnote 4).</Paragraph>
    <Paragraph position="7"> (b) propose the elements of Cf(Un-1) in the given order.</Paragraph>
    <Paragraph position="8"> (c) propose all elements of Un not yet checked from left to right.</Paragraph>
    <Paragraph position="9"> 3. Compute the Cf(Un), considering only the elements of the matrix clause of Un.</Paragraph>
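    <Paragraph> Steps 1 and 2 of the procedure above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the names Element, candidates, resolve and is_suitable are hypothetical, the suitability test (here a toy gender check) stands in for whatever agreement and binding filters a real system would apply, and un_elements is assumed to hold the elements of Un that are available as candidates, in left-to-right order. Step 3, the computation of Cf(Un), is not covered.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional


@dataclass(frozen=True)
class Element:
    entity: str                  # discourse entity
    surface: str                 # surface form
    context_bound: bool = False  # True for an already resolved anaphor


def candidates(in_first_clause: bool,
               un_elements: List[Element],
               cf_prev: List[Element]) -> Iterator[Element]:
    """Yield antecedent candidates in the order of steps 1 and 2a-2c."""
    if in_first_clause:
        yield from cf_prev            # step 1: Cf(Un-1) in the given order
        return
    checked = []
    for e in un_elements:             # 2a: context-bound elements of Un
        if e.context_bound:
            checked.append(e)
            yield e
    yield from cf_prev                # 2b: Cf(Un-1) in the given order
    for e in un_elements:             # 2c: remaining elements of Un
        if e not in checked:
            yield e


def resolve(anaphor: str,
            in_first_clause: bool,
            un_elements: List[Element],
            cf_prev: List[Element],
            is_suitable: Callable[[str, Element], bool]) -> Optional[Element]:
    """Return the first suitable candidate, or None if no stage succeeds."""
    for cand in candidates(in_first_clause, un_elements, cf_prev):
        if is_suitable(anaphor, cand):
            return cand
    return None


# Sentence (2): the pronoun "er" occurs in the matrix clause.
cf_prev = [Element("T3100SX", "T3100SX", True)]
un_elements = [
    Element("TURN-ON", "Einschalten"),
    Element("T3100SX", "Rechners", True),
]
MASCULINE = {"T3100SX", "Rechners"}  # toy lexicon, assumed for illustration


def is_suitable(anaphor: str, cand: Element) -> bool:
    # Hypothetical filter: "er" needs a masculine antecedent.
    return anaphor == "er" and cand.surface in MASCULINE


antecedent = resolve("er", False, un_elements, cf_prev, is_suitable)
print(antecedent.surface)  # prints Rechners
```

    Because the candidates are generated lazily, a later stage is consulted only when the earlier one has produced no suitable antecedent. For sentence (2), stage 2a already proposes the resolved anaphor Rechners, so the pronoun er is bound without consulting Cf(Un-1).</Paragraph>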
  </Section>
  <Section position="5" start_page="378" end_page="379" type="metho">
    <SectionTitle>
3 Evaluation
</SectionTitle>
    <Paragraph position="0"> In order to evaluate the functional approach to the resolution of intra-sentential anaphora within the centering model, we compared it to the other approaches mentioned in Section 1, employing the test set referred to in Table 2. Note that we tried to eliminate error chaining and false positives (for some remarks on evaluating discourse processing algorithms, cf. Walker (1989); we consider her results as a starting point for our proposal).</Paragraph>
    <Paragraph position="1"> First, we examine the errors which all strategies have in common (for the success rate, cf. Table 4).</Paragraph>
    <Paragraph position="2"> 99 errors are caused by underspecification at different levels: e.g., prepositional anaphors (16), plural anaphors (8), anaphors which refer to a member of a set (14), sentence anaphors (21), and anaphors which refer to a global focus (12) are not yet included in the mechanism. In 9 cases, any strategy will choose the wrong antecedent.</Paragraph>
    <Paragraph position="3"> The most interesting cases are the ones for which the performance of the different strategies varies. The linear approach generates 40 additional errors in anaphora resolution, which are caused solely by the strategy of processing each clause of a sentence in order with the centering mechanism. The approach which prefers inter-sentential antecedents causes 60 additional errors. Note that this strategy performs remarkably well at first sight. For 44 of the errors it chooses an inter-sentential antecedent which is, on the surface, identical to the correct intra-sentential antecedent. We count these 44 resolutions as false positives, since the anaphor has been resolved to the wrong discourse entity. The approach which prefers intra-sentential antecedents causes 27 additional errors. These errors occur whenever an inter-sentential anaphor can be resolved with an incorrect intra-sentential antecedent.</Paragraph>
    <Paragraph position="4"> Footnote 4: We abstract here from the syntactic criteria for filtering out some elements of the current sentence by applying binding criteria (Strube &amp; Hahn, 1995). Syntactic constraints like control phenomena override the preferences given by the context.</Paragraph>
    <Paragraph position="5">  The functional approach causes only 3 additional errors. These errors occur whenever the antecedent of an intra-sentential anaphor is not bound by the context (which is possible but rare) and when the anaphor can be resolved at the text level.</Paragraph>
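    <Paragraph> The strategy-specific error counts reported above can be collected for a side-by-side comparison. The Python sketch below only tabulates the numbers given in the text; the dictionary keys and variable names are hypothetical labels, and the errors common to all strategies are left out.

```python
# Additional (strategy-specific) errors as reported in the text.
additional_errors = {
    "linear": 40,
    "prefer inter-sentential": 60,
    "prefer intra-sentential": 27,
    "functional": 3,
}

# Rank the strategies from fewest to most additional errors.
ranking = sorted(additional_errors.items(), key=lambda item: item[1])

for name, count in ranking:
    print(f"{name:26s}{count:3d}")
```

    The resulting order is functional, prefer intra-sentential, linear, prefer inter-sentential.</Paragraph>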
    <Paragraph position="6"> The results change slightly if semantic/conceptual constraints (type and further admissibility constraints) on anaphora are considered. 22 errors of the linear approach, 8 errors of the approach which prefers inter-sentential antecedents, and 12 errors of the approach which prefers intra-sentential antecedents can be avoided. Only 6 errors of the functional approach can be avoided by incorporating semantic criteria.</Paragraph>
    <Paragraph position="7"> This might constitute a cognitively valid argument for the functional approach: the better the strategy, the lower the influence of semantics or world knowledge on anaphora resolution.</Paragraph>
    <Paragraph position="8"> To summarize the results of our empirical evaluation, we claim that our proposal based on functional criteria leads to substantively better results for languages with free word order than the linear approach suggested by Grosz et al. (1995) and the two approaches which prefer inter-sentential or intra-sentential antecedents.</Paragraph>
  </Section>
  <Section position="6" start_page="379" end_page="379" type="metho">
    <SectionTitle>
4 Comparison to Related Work
</SectionTitle>
    <Paragraph position="0"> Crucial for the evaluation of the centering model (Grosz et al., 1995) and its applicability to naturally occurring discourse is the lack of a specification concerning how to handle complex sentences and intra-sentential anaphora. Grosz et al. suggest processing sentences linearly, one clause at a time.</Paragraph>
    <Paragraph position="1"> We have shown that such an approach is not appropriate for some types of complex sentences. Suri &amp; McCoy (1994) argue in the same manner, but we consider the functional approach for languages with free word order superior to their grammatical criteria, while, for languages with fixed word order, both approaches should give the same results. Hence, our approach seems to be more generally applicable. Other approaches which integrate the resolution of sentence- and text-level anaphora are based on salience metrics (Hajičová et al., 1992; Lappin &amp; Leass, 1994). We consider such metrics to be a method which detracts from the exact linguistic specifications as we propose them.</Paragraph>
    <Paragraph position="2"> At first sight, grammar theories like GB (Chomsky, 1981) or HPSG (Pollard &amp; Sag, 1994) are the best choice for resolving anaphora at the sentence level.</Paragraph>
    <Paragraph position="3"> But these grammar theories only give filters for excluding some elements from consideration. Neither theory gives any preference for a particular antecedent at the sentence level, nor does either consider text anaphora.</Paragraph>
  </Section>
</Paper>