<?xml version="1.0" standalone="yes"?>
<Paper uid="P04-1050">
  <Title>Evaluating Centering-based metrics of coherence for text structuring using a reliably annotated corpus</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
3 Centering-based metrics of coherence
</SectionTitle>
    <Paragraph position="0"> As said previously, we assume a text structuring system taking as input a set of utterances represented in terms of their CF lists. The system orders these utterances by applying a bias in favour of the best scoring ordering among the candidate solutions for the preferred output.[8] In this section, we discuss how the Centering concepts just described can be used to define metrics of coherence which might be useful for text structuring.</Paragraph>
    <Paragraph position="1"> [8] Additional assumptions for choosing between the orderings that are assigned the best score are presented in the next section.</Paragraph>
    <Paragraph position="2"> The simplest way to define a metric of coherence using notions from Centering is to classify each ordering of propositions according to the number of nocbs it contains, and pick the ordering with the fewest nocbs. We call this metric M.NOCB, following Karamanis and Manurung (2002). Because of its simplicity, M.NOCB serves as the baseline metric in our experiments. We consider three more metrics. M.CHEAP is biased in favour of the ordering with the fewest violations of cheapness. M.KP sums up the nocbs and the violations of cheapness, coherence and salience, preferring the ordering with the lowest total cost (Kibble and Power, 2000). Finally, M.BFP employs the preferences between standard transitions as expressed by Rule 2. More specifically, M.BFP selects the ordering with the highest number of continues. If several orderings have the most continues, the one with the most retains is favoured. The number of smooth-shifts is used only to distinguish between the orderings that score best for continues as well as for retains, etc.</Paragraph>
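    <Paragraph> The four metrics above can be sketched in Python as follows. This is only an illustrative sketch, not the implementation used in the experiments. It assumes each utterance is a non-empty tuple of entity names ranked by salience (its CF list), that a nocb adds no further violations, and that coherence holds trivially when the previous CB is undefined. The names cb, profile, METRICS and best_orderings are hypothetical, and the exhaustive search over permutations is only practical for very small inputs.

```python
from itertools import permutations


def cb(prev_cf, cur_cf):
    """Return the highest-ranked entity of prev_cf that also occurs in cur_cf, or None (a nocb)."""
    return next((entity for entity in prev_cf if entity in cur_cf), None)


def profile(ordering):
    """Count nocbs, principle violations and transitions over adjacent pairs of CF lists."""
    counts = dict.fromkeys(
        ["nocb", "cheap", "coh", "sal", "continue", "retain", "smooth", "rough"], 0
    )
    prev_cb = None
    for prev_cf, cur_cf in zip(ordering, ordering[1:]):
        cur_cb = cb(prev_cf, cur_cf)
        if cur_cb is None:
            # Assumption: a nocb is counted once and adds no other violations.
            counts["nocb"] += 1
            prev_cb = None
            continue
        cheap = cur_cb == prev_cf[0]  # cheapness: CB(Un) equals CP(Un-1)
        sal = cur_cb == cur_cf[0]  # salience: CB(Un) equals CP(Un)
        # Assumption: coherence holds trivially when the previous CB is undefined.
        coh = prev_cb is None or cur_cb == prev_cb
        counts["cheap"] += not cheap
        counts["coh"] += not coh
        counts["sal"] += not sal
        if coh:
            counts["continue" if sal else "retain"] += 1
        else:
            counts["smooth" if sal else "rough"] += 1
        prev_cb = cur_cb
    return counts


# Each metric maps a profile to a cost tuple; the lowest tuple wins.
# M.BFP negates its counts so that more continues, then more retains,
# then more smooth-shifts are preferred; further tie-breaking is left out.
METRICS = {
    "M.NOCB": lambda c: (c["nocb"],),
    "M.CHEAP": lambda c: (c["cheap"],),
    "M.KP": lambda c: (c["nocb"] + c["cheap"] + c["coh"] + c["sal"],),
    "M.BFP": lambda c: (-c["continue"], -c["retain"], -c["smooth"]),
}


def best_orderings(utterances, metric):
    """Return every ordering of the utterances that gets the best score under the metric."""
    key = METRICS[metric]
    scored = [(key(profile(p)), p) for p in permutations(utterances)]
    best = min(score for score, _ in scored)
    return [list(p) for score, p in scored if score == best]


# Hypothetical toy input: three utterances, each a CF list ranked by salience.
u1, u2, u3 = ("john", "store"), ("john", "milk"), ("milk", "fridge")
print(best_orderings([u1, u2, u3], "M.BFP"))
```

Representing every metric as a cost tuple keeps the selection step identical for all four, and a metric may return several orderings when the best score is tied.</Paragraph>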
    <Paragraph position="3"> In the next section, we present a general methodology for comparing these metrics, using the actual ordering of clauses in real texts of a corpus to identify the metric whose behavior most closely mimics the way these actual orderings were chosen. This methodology was implemented in a program called the System for Evaluating Entity Coherence (SEEC).</Paragraph>
  </Section>
</Paper>