<?xml version="1.0" standalone="yes"?> <Paper uid="P04-1049"> <Title>Paragraph-, word-, and coherence-based approaches to sentence ranking: A comparison of algorithm and human performance</Title> <Section position="3" start_page="0" end_page="2" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Automatic generation of text summaries is a natural language engineering application that has received considerable interest, particularly due to the ever-increasing volume of text information available through the internet. The task of a human generating a summary generally involves three subtasks (Brandow et al. (1995); Mitra et al. (1997)):</Paragraph> <Paragraph position="1"> (1) understanding a text; (2) ranking text pieces (sentences, paragraphs, phrases, etc.) for importance; (3) generating a new text (the summary). Like most approaches to summarization, we are concerned with the second subtask (e.g. Carlson et al. (2001); Goldstein et al. (1999); Gong & Liu (2001); Jing et al. (1998); Luhn (1958); Mitra et al. (1997); Sparck-Jones & Sakai (2001); Zechner (1996)). Furthermore, we are concerned with obtaining generic rather than query-relevant importance rankings (cf. Goldstein et al. (1999), Radev et al. (2002) for that distinction).</Paragraph> <Paragraph position="2"> We evaluated different approaches to sentence ranking against human sentence rankings. To obtain human sentence rankings, we asked people to read 15 texts from the Wall Street Journal on a wide variety of topics (e.g. economics, foreign and domestic affairs, political commentaries). For each sentence in a text, they rated how important that sentence is with respect to the content of the text, on an integer scale from 1 (not important) to 7 (very important). 
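Since each algorithm produces sentence scores that must be compared with such 1-to-7 human importance ratings, a rank correlation is a natural evaluation measure. The sketch below is our illustration only (not necessarily the exact measure used in this paper, and the data are invented); it computes Spearman's rank correlation using standard-library Python:

```python
# Illustrative sketch: comparing an algorithm's sentence scores
# with human importance ratings via Spearman's rank correlation.
# The ratings and scores below are invented example data.

def ranks(values):
    """Assign 1-based average ranks to values, averaging over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    rank = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of positions i..j, 1-based
        for k in range(i, j + 1):
            rank[order[k]] = avg
        i = j + 1
    return rank

def spearman(xs, ys):
    """Spearman's rho = Pearson correlation of the two rank vectors."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical data: mean human ratings and one algorithm's raw scores.
human = [6.2, 3.1, 5.0, 1.8, 4.4]
algo = [0.9, 0.3, 0.7, 0.1, 0.5]
print(round(spearman(human, algo), 3))  # → 1.0 (identical orderings)
```

Because Spearman's rho depends only on rank order, it is insensitive to the very different scales of the 1-to-7 human ratings and the algorithms' raw weights.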
The approaches we evaluated are a simple paragraph-based approach that serves as a baseline, two word-based algorithms, and two coherence-based approaches.</Paragraph> <Section position="1" start_page="0" end_page="1" type="sub_section"> <SectionTitle> 2.1 Paragraph-based approach </SectionTitle> <Paragraph position="0"> Sentences at the beginning of a paragraph are usually more important than sentences further down in a paragraph, due in part to the way people are instructed to write. Therefore, probably the simplest conceivable approach to sentence ranking is to choose the first sentence of each paragraph as important, and the other sentences as not important. We included this approach merely as a simple baseline.</Paragraph> <Paragraph position="1"> We did not use any machine learning techniques to boost the performance of the algorithms we tested. Performance of the algorithms tested here will therefore almost certainly be below the level that could be reached if we had augmented the algorithms with such techniques (e.g. Carlson et al. (2001)).</Paragraph> <Paragraph position="2"> However, we think that a comparison between 'bare-bones' algorithms is worthwhile because it allows us to see how performance differs due to different basic approaches to sentence ranking, and not due to potentially different effects of machine learning algorithms on those basic approaches. In future research we plan to address the impact of machine learning on the algorithms tested here.</Paragraph> </Section> <Section position="2" start_page="1" end_page="2" type="sub_section"> <SectionTitle> 2.2 Word-based approaches </SectionTitle> <Paragraph position="0"> Word-based approaches to summarization are based on the idea that discourse segments are important if they contain &quot;important&quot; words. Different approaches have different definitions of what an important word is. 
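Before turning to specific word-based measures, the paragraph-based baseline of Section 2.1 can be sketched in a few lines. The input format below (a document as a list of paragraphs, each a list of sentence strings) is our assumption for illustration:

```python
# Illustrative sketch of the paragraph-based baseline: the first
# sentence of every paragraph is marked important (1), all other
# sentences unimportant (0), in document order.

def paragraph_baseline(paragraphs):
    """Return a 0/1 importance score per sentence, flattened
    over all paragraphs in document order."""
    scores = []
    for para in paragraphs:
        for pos, _sentence in enumerate(para):
            scores.append(1 if pos == 0 else 0)
    return scores

# Hypothetical two-paragraph document.
doc = [
    ["Opening claim.", "Support one.", "Support two."],
    ["Second topic.", "Detail."],
]
print(paragraph_baseline(doc))  # → [1, 0, 0, 1, 0]
```

Comparing richer algorithms against a layout-only baseline like this shows how much of their performance comes from modeling content rather than position.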
For example, Luhn (1958), in a classic approach to summarization, argues that sentences are more important if they contain many significant words. Significant words are words that are not in some predefined stoplist of words with high overall corpus frequency.</Paragraph> <Paragraph position="1"> Once significant words are marked in a text, clusters of significant words are formed. A cluster has to start and end with a significant word, and fewer than n insignificant words must separate any two significant words (we chose n = 3, cf. Luhn (1958)). Then, the weight of each cluster is calculated by dividing the square of the number of significant words in the cluster by the total number of words in the cluster. Sentences can contain multiple clusters. The weight of a sentence is the sum of the weights of all clusters in that sentence; the higher the weight of a sentence, the higher its ranking.</Paragraph> <Paragraph position="2"> A more recent and frequently used word-based method for text piece ranking is tf.idf (e.g. Manning & Schuetze (2000); Salton & Buckley (1988); Sparck-Jones & Sakai (2001); Zechner (1996)). The tf.idf measure relates the frequency of words in a text piece, in the text, and in a collection of texts, respectively. The intuition behind tf.idf is to give more weight to sentences that contain terms with high frequency in a document but low frequency in a reference corpus. Figure 1 shows a formula for calculating tf.idf, where ds_ij is the tf.idf weight of sentence i in document j, n_si is the number of words in sentence i, k is the kth word in sentence i, tf_jk is the frequency of word k in document j, n_d is the number of documents in the reference corpus, and df_k is the number of documents in the reference corpus in which word k appears.</Paragraph> <Paragraph position="4"> Instead of stoplists, tf.idf values have also been used to determine significant words (e.g. 
Buyukkokten et al.</Paragraph> <Paragraph position="5"> (2001)).</Paragraph> <Paragraph position="6"> We compared both Luhn's (1958) measure and tf.idf scores to human rankings of sentence importance. We will show that both methods performed remarkably well, although one coherence-based method performed better.</Paragraph> </Section> <Section position="3" start_page="2" end_page="2" type="sub_section"> <SectionTitle> 2.3 Coherence-based approaches </SectionTitle> <Paragraph position="0"> The sentence ranking methods introduced in the two previous sections are based solely on layout or on properties of word distributions in sentences, texts, and document collections. Other approaches to sentence ranking are based on the informational structure of texts. By informational structure we mean the set of informational relations that hold between sentences in a text. This set can be represented in a graph, where the nodes represent sentences, and labeled directed arcs represent informational relations that hold between the sentences (cf. Hobbs (1985)). Often, informational structures of texts have been represented as trees (e.g. Carlson et al. (2001), Corston-Oliver (1998), Mann & Thompson (1988), Ono et al. (1994)). We will present one coherence-based approach that assumes trees as the data structure for representing discourse structure, and one approach that assumes less constrained graphs. As we will show, the approach based on less constrained graphs performs better than the tree-based approach when compared to human sentence rankings.</Paragraph> </Section> </Section> </Paper>