<?xml version="1.0" standalone="yes"?>
<Paper uid="W00-1438">
  <Title>An Efficient Text Summarizer Using Lexical Chains</Title>
  <Section position="2" start_page="0" end_page="268" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Automatic text summarization has long been viewed as a two-step process. First, an intermediate representation of the summary must be created. Second, a natural language representation of the summary must be generated using the intermediate representation(Sparek Jones, 1993). Much of the early research in automatic text summarization has involved generation of the intermediate representation. The natural language generation problem has only recently received substantial attention in the context of summarization.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
1.1 Motivation
</SectionTitle>
      <Paragraph position="0"> In order to consider methods for generating natural text summaries from large documents, several issues must be examined in detail. First, an analysis of the quality of the intermediate representation for use in generation must be examined. Second, a detailed examination of the processes which link the intermediate representation to a potential final summary must be undertaken.</Paragraph>
      <Paragraph position="1"> The system presented here provides a useful first step towards these ends. By developing a robust and efficient tool to generate these intermediate representations, we can both evaluate the representation ......... andcormider the difficult problem of generatiiig natural language texts from the representation.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="268" type="sub_section">
      <SectionTitle>
1.2 Background Research
</SectionTitle>
      <Paragraph position="0"> Much research has been conducted in the area of automatic text summarization. Specifically, research using lexical chains and related techniques has received much attention.</Paragraph>
      <Paragraph position="1"> Early methods using word frequency counts did not consider the relations between similar words.</Paragraph>
      <Paragraph position="2"> Finding the aboutness of a document requires finding these relations. How these relations occur within a document is referred to as cohesion (Halliday and Hasan, 1976). First introduced by Morris and Hirst (1991), lexical chains represent lexical cohesion among related terms within a corpus. These relations can be recognized by identifying arbitrary size sets of words which are semantically related (i.e., have a sense flow). These lexical chains provide an interesting method for summarization because their recognition is easy within the source text and vast knowledge sources are not required in order to con&gt; pure them.</Paragraph>
      <Paragraph position="3"> Later work using lexical chains was conducted by Hirst and St-Onge (1997) using lexical chains to correct malapropisms. They used WordNet, a lexical database which contains some semantic information (http://www.cs.princeton.edu/wn).</Paragraph>
      <Paragraph position="4"> Also using WordNet in their implenmntation.</Paragraph>
      <Paragraph position="5"> Barzilay and Elhadad (1997) dealt with some of tile limitations in Hirst and St-Onge's algorithm by examining every possible lexical chain which could be computed, not just those possible at a given point in the text. That is to say, while Hirst and St.Onge would compute the chain in which a word should be placed when a word was first encountered, Barzilay and Elhadad computed ever:,' possible chain a word could become a member of when the word was encountered, and later determined the best interpretation. null</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>
Download Original XML