<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1110">
  <Title>Semantic Similarity Applied to Spoken Dialogue Summarization</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Research in automatic text summarization began in the late 1950s and has received renewed attention over the last decade. The maturity of this research area is indicated by recent large-scale evaluation efforts (Radev et al., 2003). In comparison, speech summarization is a rather new research area that emerged only a few years ago. However, the demand for speech summarization is growing because of the increasing availability of (digitally encoded) speech databases (e.g. spoken news, political speeches).</Paragraph>
    <Paragraph position="1"> Our research is concerned with the development of a system for automatically generating summaries of conversational speech. As a potential application we envision the automatic generation of meeting minutes. The approach to spoken dialogue summarization presented herein unifies corpus- and knowledge-based approaches to summarization.</Paragraph>
    <Paragraph position="2"> That is, we develop a shallow knowledge-based approach.</Paragraph>
    <Paragraph position="3"> Our system employs a set of semantic similarity metrics which utilize WordNet as a knowledge source. We claim that semantic similarity between a given utterance and the dialogue as a whole is an appropriate criterion for selecting the utterances which carry the essential content of the dialogue, i.e. the relevant utterances. In order to study the performance of the semantic similarity methods, we remove the noise introduced by the pre-processing modules by manually disambiguating the senses of nouns.</Paragraph>
    <Paragraph position="4"> In Section 2, we briefly review research on summarization and explain how spoken dialogue summarization differs from text summarization. Section 3 presents the semantic similarity metrics we use and describes how they are applied to the summarization problem.</Paragraph>
    <Paragraph position="5"> Section 4 provides information about the data used in our experiments, while Section 5 describes the experiments and the results together with their statistical significance.</Paragraph>
  </Section>
</Paper>