<?xml version="1.0" standalone="yes"?> <Paper uid="N06-2003"> <Title>Museli: A Multi-Source Evidence Integration Approach to Topic Segmentation of Spontaneous Dialogue</Title> <Section position="6" start_page="11" end_page="11" type="concl"> <SectionTitle> 5 Current Directions </SectionTitle> <Paragraph position="0"> In this paper, we have addressed the problem of automatic topic segmentation of spontaneous dialogue.</Paragraph> <Paragraph position="1"> We have demonstrated with an empirical evaluation that state-of-the-art approaches fail on spontaneous dialogue because word-distribution patterns alone are insufficient evidence of topic shifts in dialogue.</Paragraph> <Paragraph position="2"> We have presented a supervised learning algorithm for topic segmentation of dialogue that combines lexical cohesion with linguistic features signaling a contribution's function. Our evaluation on two distinct dialogue corpora shows a significant improvement over state-of-the-art approaches.</Paragraph> <Paragraph position="3"> The disadvantage of our approach is that it requires hand-labeled training data. We are currently exploring ways of bootstrapping a model from a small amount of hand-labeled data in combination with lexical cohesion (tuned for high precision and consequently low recall) and some reliable discourse markers.</Paragraph> </Section> </Paper>