<?xml version="1.0" standalone="yes"?>
<Paper uid="P05-1018">
  <Title>Modeling Local Coherence: An Entity-based Approach</Title>
  <Section position="2" start_page="0" end_page="141" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> A key requirement for any system that produces text is the coherence of its output. Not surprisingly, a variety of coherence theories have been developed over the years (e.g., Mann and Thompson, 1988; Grosz et al., 1995) and their principles have found application in many symbolic text generation systems (e.g., Scott and de Souza, 1990; Kibble and Power, 2004). The ability of these systems to generate high-quality text, almost indistinguishable from human writing, makes the incorporation of coherence theories in robust large-scale systems particularly appealing. The task is, however, challenging, considering that most previous efforts have relied on handcrafted rules, valid only for limited domains, with no guarantee of scalability or portability (Reiter and Dale, 2000). Furthermore, coherence constraints are often embedded in complex representations (e.g., Asher and Lascarides, 2003) which are hard to implement in a robust application.</Paragraph>
    <Paragraph position="1"> This paper focuses on local coherence, which captures text relatedness at the level of sentence-to-sentence transitions, and is essential for generating globally coherent text. The key premise of our work is that the distribution of entities in locally coherent texts exhibits certain regularities. This assumption is not arbitrary -- some of these regularities have been recognized in Centering Theory (Grosz et al., 1995) and other entity-based theories of discourse.</Paragraph>
    <Paragraph position="2"> The algorithm introduced in the paper automatically abstracts a text into a set of entity transition sequences, a representation that reflects distributional, syntactic, and referential information about discourse entities. We argue that this representation of discourse allows the system to learn the properties of locally coherent texts opportunistically from a given corpus, without recourse to manual annotation or a predefined knowledge base.</Paragraph>
    <Paragraph position="3"> We view coherence assessment as a ranking problem and present an efficiently learnable model that orders alternative renderings of the same information based on their degree of local coherence. Such a mechanism is particularly appropriate for generation and summarization systems as they can produce multiple text realizations of the same underlying content, either by varying parameter values or by relaxing constraints that control the generation process. A system equipped with a ranking mechanism could compare the quality of the candidate outputs, much in the same way speech recognizers employ language models at the sentence level.</Paragraph>
    <Paragraph position="4"> Our evaluation results demonstrate the effectiveness of our entity-based ranking model within the general framework of coherence assessment. First, we evaluate the utility of the model in a text ordering task where our algorithm has to select a maximally coherent sentence order from a set of candidate permutations. Second, we compare the rankings produced by the model against human coherence judgments elicited for automatically generated summaries. In both experiments, our method yields a significant improvement over a state-of-the-art coherence model based on Latent Semantic Analysis (Foltz et al., 1998).</Paragraph>
    <Paragraph position="5"> In the following section, we provide an overview of existing work on the automatic assessment of local coherence. Then, we introduce our entity-based representation, and describe our ranking model.</Paragraph>
    <Paragraph position="6"> Next, we present the experimental framework and data. Evaluation results conclude the paper.</Paragraph>
  </Section>
</Paper>