<?xml version="1.0" standalone="yes"?>
<Paper uid="N06-2003">
  <Title>Museli: A Multi-Source Evidence Integration Approach to Topic Segmentation of Spontaneous Dialogue</Title>
  <Section position="6" start_page="11" end_page="11" type="concl">
    <SectionTitle>
5 Current Directions
</SectionTitle>
    <Paragraph position="0"> In this paper we address the problem of automatic topic segmentation of spontaneous dialogue.</Paragraph>
    <Paragraph position="1"> We demonstrated with an empirical evaluation that state-of-the-art approaches fail on spontaneous dialogue because word-distribution patterns alone are insufficient evidence of topic shifts in dialogue.</Paragraph>
    <Paragraph position="2"> We have presented a supervised learning algorithm for topic segmentation of dialogue that combines linguistic features signaling a contribution's function with lexical cohesion. Our evaluation on two distinct dialogue corpora shows a significant improvement over state-of-the-art approaches.</Paragraph>
    <Paragraph position="3"> The disadvantage of our approach is that it requires hand-labeled training data. We are currently exploring ways of bootstrapping a model from a small amount of hand-labeled data in combination with lexical cohesion (tuned for high precision and consequently low recall) and some reliable discourse markers.</Paragraph>
  </Section>
</Paper>