<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-0402">
  <Title>Selecting Sentences for Multidocument Summaries using Randomized Local Search</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 System Description
</SectionTitle>
    <Paragraph position="0"> We have implemented our randomized local search method for sentence selection as part of the RIPTIDES (White et al., 2001) system. RIPTIDES combines information extraction (IE) in the domain of natural disasters and multidocument summarization to produce hypertext summaries. The hypertext summaries include a high-level textual overview; tables of all comparable numeric estimates, organized to highlight discrepancies; and targeted access to supporting information from the original articles. In White et al. (2002), we showed that the hypertext summaries can help to identify disrepancies in numeric estimates, and provide a significantly more complete picture of the available information than the latest article. The next subsection walks through a sample hyper-text summary; it is followed by descriptions of the IE and Summarizer system components.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 Example
</SectionTitle>
      <Paragraph position="0"> Figure 1 shows a textual overview of the first dozen or so articles in a corpus of news articles gathered from the web during the first week after the January 2001 earthquake in Central America. Clicking on the magnifying glass icon brings up the original article in the right frame, with the extracted sentences highlighted.</Paragraph>
      <Paragraph position="1"> The index to the hypertext summary appears in the left frame of figure 1. Links to the overview and to the lead sentences of the articles are followed by links to tables that display the base level extraction slots for the main event (here, an earthquake) including its description, date, location, epicenter and magnitude. Access to the overall damage estimates appears next, with separate tables for types of human effects (e.g. dead, missing) and for object types (e.g.</Paragraph>
      <Paragraph position="2"> villages, bridges, houses) with physical effects.</Paragraph>
      <Paragraph position="3"> Figure 2 shows the extracted estimates of the overall death toll. In order to help identify discrepancies, the high and low current estimates are shown at the top, followed by other current estimates and then all extracted estimates.</Paragraph>
      <Paragraph position="4"> Heuristics are used to determine which estimates to consider current, taking into account the source (either news source or attributed source), specificity (e.g. hundreds vs. at least 200) and confidence level, as indicated by the presence of hedge words such as perhaps or assumed. The tables also provide links to the original articles, allowing the user to quickly and directly determine the accuracy of any estimate in the table.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2 IE System
</SectionTitle>
      <Paragraph position="0"> The IE system combines existing language technology components (Bikel et al., 1997; Charniak, 1999; Day et al., 1997; Fellbaum, 1998) in a traditional IE architecture (Cardie, 1997; Grishman, 1996). Unique features of the system include a weakly supervised extraction patternlearning component, Autoslog-XML, which is based on Autoslog-TS (Riloff, 1996), but operates in an XML framework and acquires patterns for extracting text elements beyond noun phrases (e.g. verb groups, adjectives, adverbs, and single-noun modifiers). In addition, a  heuristic-based clustering algorithm organizes the extracted concepts into output templates specifically designed to support multi-document summarization: the IE system, for example, distinguishes different reports or views of the same event from multiple sources (White et al., 2001). Output templates from the IE system for each text to be covered in the multi-document summary are provided as input to the summarization component along with all linguistic annotations accrued in the IE phase.</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.3 Summarizer
</SectionTitle>
      <Paragraph position="0"> The Summarizer operates in three main stages.</Paragraph>
      <Paragraph position="1"> In the first stage, the IE output templates are merged into an event-oriented structure where comparable facts are semantically grouped. Towards the same objective, surface-oriented clustering is used to group sentences from different documents into clusters that are likely to report similarcontent. Inthesecondstage, importance scores are assigned to the sentences based on the following indicators: position in document, document recency, presence of quotes, average sentence overlap, headline overlap, size of cluster (if any), size of semantic groups (if any), specificity of numeric estimates, and whether these estimates are deemed current. In the third and final stage, the hypertext summary is generated from the resulting content pool. Further details on each stage follow in the paragraphs below; see White et al. (2002) for a more complete description. null In the analysis stage, we use Columbia's SimFinder tool (Hatzivassiloglou et al., 2001) to obtain surface-oriented similarity measures and clusters for the sentences in the input articles.</Paragraph>
      <Paragraph position="2"> To obtain potentially more accurate partitions using the IE output, we semantically merge the extracted slots into comparable groups, i.e. ones whose members can be examined for discrepancies. This requires distinguishing (i) different types of damage; (ii) overall damage estimates vs. those that pertain to a specific locale; and (iii) damage due to related events, such as previous quakes in the same area. During this stage, we also analyze the numeric estimates for specificity and confidence level, and determine which estimates to consider current.</Paragraph>
      <Paragraph position="3"> In the scoring stage, SimFinder's similarity measures and clusters are combined with the semantic groupings obtained from merging the IE templates in order to score the input sentences. The scoring of the clusters and semantic groups is based on their size, and the scores are combined at the sentence level by including the score of all semantic groups that contain a phrase extracted from a given sentence.</Paragraph>
      <Paragraph position="4"> More precisely, the scores are assigned in three phases, according to a set of hand-tuned parameter weights. First, a base score is assigned to each sentence according to a weighted sum of the position in document, document recency, presence of quotes, average sentence overlap, and headline overlap. The average sentence overlap is the average of all pairwise sentence similarity measures; we have found this measure to be a useful counterpart to sentence position in reliably identifying salient sentences, with the other factors playing a lesser role. In the second scoring phase, the clusters and semantic groups are assigned a score according to the sum of the base sentence scores. After normalization in the third scoring phase, the weighted cluster and group scores are used to boost the base scores, thereby favoring sentences from the more important clusters and semantic groups. Finally, a small boost is applied for currenten and more specific numeric estimates.</Paragraph>
      <Paragraph position="5"> In the generation stage, the overview is constructed by selecting a set of sentences in a context-sensitive fashion, and then ordering the blocks of adjacent sentences according to their importance scores. The summarization scoring model begins with the sum of the scores for the candidate sentences, which is then adjusted to penalize the inclusion of multiple sentences from the same cluster or semantic group, or sentences whose similarity measure is above a certain threshold, and to favor the inclusion of adjacent sentences from the same article, in order to boost intelligibility. A larger bonus is applied when including a sentence that begins with an initial pronoun as well as the previous one, and an even bigger bonus is added when including a sentence that begins with a strong rhetorical marker (e.g. however) as well as its predecessor; corresponding penalties are also used when the preceding sentence is missing, or when a short sentence appears without an adjacent one.</Paragraph>
      <Paragraph position="6"> To select the sentences for the overview according to this scoring model, we use an iterative randomized local search procedure inspired by Selman and Kautz (1994). Two noise strategies are employed to lessen the problem of local maxima in the search space: (i) the local search is restarted from random starting points, for a fixed number of iterations, and (ii) during each local search iteration, greedy steps are interleaved with random steps, where a sentence is added regardless of its score. In the first local search iteration, the initial sentence collection consists of the highest scoring sentences up to the word limit. In subsequent iterations, the initial collection is composed of randomly selected sentences, weighted according to their scores, up to the word limit. During each local search iteration, a random step or a greedy step (chosen at random) is repeatedly performed until a greedy step fails to improve upon the current collection of sentences. In each greedy step, one sentence is chosen to add to the collection, and zero or more (typically one) sentences are chosen to remove from the collection, such that the word limit is still met, and this combination of sentences represents the best swap available according to the scoring model. After the predetermined number of iterations, the best combination of sentences found during the search is output; note that the algorithm could easily be formulated in an anytime fashion as well. From a practical perspective, we have found that 10 iterations often suffices to find a reasonable collection of sentences, taking well under a minute on a desktop PC.</Paragraph>
      <Paragraph position="7"> Once the overview sentences have been selected, the hypertext summary is generated as a collection of HTML files, using a series of XSLT transformations.</Paragraph>
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.4 Training and Tuning
</SectionTitle>
      <Paragraph position="0"> For the evaluation below, the IE system was trained on 12 of 25 texts from topic 89 of the TDT2 corpus, a set of newswires that describe the May 1998 earthquake in Afganistan. It achieves 42% recall and 61% precision when evaluated on the remaining 13 topic 89 texts.</Paragraph>
      <Paragraph position="1"> The parameter settings of the Summarizer were chosen by hand using the complete TDT2 topic 89 document set as input.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>
Download Original XML