File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/06/w06-1612_abstr.xml

Size: 1,109 bytes

Last Modified: 2025-10-06 13:45:22

<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-1612">
  <Title>Learning Information Status of Discourse Entities</Title>
  <Section position="2" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> In this paper we address the issue of automatically assigning information status to discourse entities. Using an annotated corpus of conversational English and exploiting morpho-syntactic and lexical features, we train a decision tree to classify entities introduced by noun phrases as old, mediated, or new. We compare its performance with hand-crafted rules that are mainly based on morpho-syntactic features and closely relate to the guidelines that had been used for the manual annotation. The decision tree model achieves an overall accuracyof79.5%, significantlyoutperforming the hand-crafted algorithm (64.4%). We also experiment with binary classifications by collapsing in turn two of the three target classes into one and retraining the model. The highest accuracy achieved on binary classification is 93.1%.</Paragraph>
  </Section>
class="xml-element"></Paper>
Download Original XML