<?xml version="1.0" standalone="yes"?>
<Paper uid="P05-1037">
  <Title>Digesting Virtual &quot;Geek&quot; Culture: The Summarization of Technical Internet Relay Chats</Title>
  <Section position="3" start_page="298" end_page="298" type="intro">
    <SectionTitle>
2 Previous and Related Work
</SectionTitle>
    <Paragraph position="0"> There are at least two ways of organizing dialogue summaries: by dialogue structure and by topic.</Paragraph>
    <Paragraph position="1"> Newman and Blitzer (2002) describe methods for summarizing archived newsgroup conversations by clustering messages into subtopic groups and extracting the top-ranked sentences from each group, ranked by two intrinsic scores: position in the cluster and lexical centrality. Due to the technical nature of our working corpus, we had to handle intra-message topic shifts, in which the author of a message raises or responds to multiple issues in the same message. This requires our clustering component to operate on submessage units rather than on whole messages. Lam et al. (2002) employ an existing single-document summarizer, applying it to preprocessed email messages together with context information from previous emails in the thread.</Paragraph>
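As a rough illustration of the cluster-then-extract idea described above, the following minimal sketch ranks the sentences of one subtopic cluster by combining lexical centrality with position. It is not the method of Newman and Blitzer (2002): the function names, the bag-of-words cosine measure used for centrality, and the position_weight parameter are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_sentences(cluster, position_weight=0.3):
    """Rank the sentences of one subtopic cluster, best first.

    Score = (1 - w) * centrality + w * position, where centrality is the
    cosine similarity to the cluster centroid and position favours
    earlier sentences. The weighting scheme is a hypothetical choice.
    """
    bags = [Counter(tokenize(s)) for s in cluster]
    centroid = Counter()
    for bag in bags:
        centroid.update(bag)
    n = len(cluster)
    scored = []
    for i, (sentence, bag) in enumerate(zip(cluster, bags)):
        centrality = cosine(bag, centroid)
        position = 1.0 - i / n  # the first sentence gets 1.0
        score = (1 - position_weight) * centrality + position_weight * position
        scored.append((score, sentence))
    # Python's sort is stable, so ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sentence for _, sentence in scored]
```

A summarizer built this way would take the first one or two sentences returned for each cluster. To handle intra-message topic shifts, the clusters would be formed over submessage units instead of whole messages.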
    <Paragraph position="2"> Rambow et al. (2004) show that sentence extraction techniques are applicable to summarizing email threads, but only with added email-specific features. Wan and McKeown (2004) introduce a system that creates overview summaries for ongoing decision-making email exchanges by first detecting the issue being discussed and then extracting the response to the issue. Both systems use a corpus that, on average, contains 190 words and 3.25 messages per thread, so its threads are much shorter than those in our collection.</Paragraph>
    <Paragraph position="3"> Galley et al. (2004) describe a system that identifies agreement and disagreement occurring in human-to-human multi-party conversations. They utilize an important concept from conversational analysis, adjacent pairs (AP), which consist of initiating and responding utterances from different speakers. Identifying APs is also required by our research to find corresponding utterances from different chat participants.</Paragraph>
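To make the notion of an adjacent pair concrete, the sketch below links an initiating utterance to a responding utterance from a different speaker. It is only a naive heuristic, not the approach of Galley et al. (2004) or of this paper: the Utterance type, the find_adjacent_pairs function, the use of a trailing question mark to spot initiating utterances, and the window parameter are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str
    text: str


def find_adjacent_pairs(utterances, window=3):
    """Pair each question with the next utterance by a different speaker.

    Treating a trailing "?" as the mark of an initiating utterance and
    limiting the search to `window` following turns are both
    simplifying assumptions made for this sketch.
    """
    pairs = []
    for i, first in enumerate(utterances):
        if not first.text.rstrip().endswith("?"):
            continue  # not an initiating utterance under this heuristic
        for second in utterances[i + 1 : i + 1 + window]:
            if second.speaker != first.speaker:
                pairs.append((first, second))
                break  # keep only the nearest response
    return pairs
```

Each returned tuple holds the initiating utterance followed by its candidate response. A real system would need a learned model to decide which later utterance actually responds to which earlier one.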
    <Paragraph position="4"> In automatic summarization of spoken dialogues, Zechner (2001) presents an approach to obtain extractive summaries for multi-party dialogues in unrestricted domains by addressing intrinsic issues specific to speech transcripts. Automatic question detection is also deemed important in this work. A decision-tree classifier was trained on question-triggering words to detect questions among speech acts (sentences). A heuristic search procedure then finds the corresponding answers.</Paragraph>
    <Paragraph position="5"> Ries (2001) shows how to use keyword repetition, speaker initiative and speaking style to achieve topical segmentation of spontaneous dialogues.</Paragraph>
  </Section>
</Paper>