<?xml version="1.0" standalone="yes"?>
<Paper uid="X98-1028">
  <Title>A Text-Extraction Based Summarizer</Title>
  <Section position="2" start_page="224" end_page="225" type="metho">
    <SectionTitle>
SPIES JUST WOULDN'T COME IN FROM COLD
WAR, FILES SHOW
</SectionTitle>
    <Paragraph position="0"> Terry Squillacote was a Pentagon lawyer who hated her job. Kurt Stand was a union leader with an aging beatnik's slouch. Jim Clark was a lonely private investigator. [A 200-page affidavit filed last week by] the Federal Bureau of Investigation says the three were out-of-work spies for East Germany. And after that state withered away, it says, they desperately reached out for anyone who might want them as secret agents.</Paragraph>
    <Paragraph position="1"> In this example, the two passages are non-consecutive paragraphs in the original text; the string in the square brackets at the opening of the second passage has been omitted in the summary.</Paragraph>
    <Paragraph position="2"> Here the human summarizer's actions appear relatively straightforward, and it would not be difficult to propose an algorithmic method to do the same.</Paragraph>
    <Paragraph position="3"> This may go as follows (a rough code sketch follows the list):
1. Choose a DMS template for the summary; e.g., Background+News.</Paragraph>
    <Paragraph position="4"> 2. Select appropriate passages from the original text and fill the DMS template.</Paragraph>
    <Paragraph position="5"> 3. Assemble the summary in the desired order; delete extraneous words.</Paragraph>
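As an illustration of the three steps above, here is a minimal sketch in Python (our system itself is implemented in C++); the slot names, the bracket-trimming rule and the example passages are assumptions made purely for illustration, not the actual implementation.

    # Minimal sketch of filling a Background+News DMS template from two
    # selected passages; slot names and the trimming rule are illustrative.
    import re

    def fill_background_news(background: str, news: str) -> str:
        # Delete extraneous bracketed material, as in the spy-story example above.
        news = re.sub(r"\[[^\]]*\]\s*", "", news)
        # Assemble the summary in the desired order: context first, then the news.
        return background.strip() + " " + news.strip()

    if __name__ == "__main__":
        bg = "Terry Squillacote was a Pentagon lawyer who hated her job. ..."
        nw = "[A 200-page affidavit filed last week by] the FBI says the three were out-of-work spies ..."
        print(fill_background_news(bg, nw))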
    <Paragraph position="6"> It is worth noting here that the background-context passage is critical for understanding this summary, but as such provides essentially no relevant information except for the names of the people involved. Incidentally, this is precisely the information required to make the summary self-contained, if for no other reason than to supply the antecedents to the anaphors in the main passage (the three, they).
The Algorithm
The summarizer can work in two modes: generic and topical. In the generic mode, it simply summarizes the main points of the original document.</Paragraph>
    <Paragraph position="8"> In the topical mode, it takes a user-supplied statement of interest, a topic, and derives a summary related to this topic. A topical summary is thus usually different from the generic summary of the same document. The summarizer can produce both indicative and informative summaries. An indicative summary, typically 5-10% of the original text, retains just enough material from the original document to indicate its content. An informative summary, on the other hand, typically 20-30% of the text, retains all the relevant facts that a user may need from the original document; that is, it serves as a condensed surrogate, a digest.</Paragraph>
    <Paragraph position="9"> The process of assembling DMS components into a summary depends upon the complexity of the discourse structure itself. For news or even for scientific texts, it may be just a matter of concatenating components together with a little "cohesiveness glue", which may include deleting some obstructing sentences, expanding acronyms, adjusting verb forms, etc. In a highly specialized domain (e.g., court rulings) the final assembly may be guided by a very detailed pattern or a script that conforms to specific style and content requirements.</Paragraph>
    <Paragraph position="10"> Below we present a 10-step algorithm for generating summaries of news-like texts. This is the algorithm underlying our current summarizer. The reader may notice that there is no explicit provision for dealing with DMS structures here. Indeed, the basic Background+News summary pattern has been tightly integrated into the passage selection and weighting process. This obviously streamlines the summarization process, but it also reflects the notion that news-style summarization is in many ways basic and subsumes other more complex summarization requirements.</Paragraph>
  </Section>
  <Section position="3" start_page="225" end_page="229" type="metho">
    <SectionTitle>
THE GENERALIZED SUMMARIZATION ALGORITHM
</SectionTitle>
    <Paragraph position="0"> s0: Segment text into passages. Use any available handles, including indentation, SGML, empty lines, sentence ends, etc. If no paragraph or sentence structure is available, use approximately equal-size chunks.</Paragraph>
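A minimal sketch of step s0, assuming plain-text input; the fallback chunk size of 100 words is an arbitrary choice for illustration, not the setting used in our system.

    # Sketch of step s0: split text into passages on blank lines when a
    # paragraph structure is available, otherwise fall back to roughly
    # equal-size word chunks.
    def segment(text: str, chunk_words: int = 100) -> list[str]:
        paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
        if len(paragraphs) > 1:      # paragraph structure is available
            return paragraphs
        words = text.split()         # no usable structure: equal-size chunks
        return [" ".join(words[i:i + chunk_words])
                for i in range(0, len(words), chunk_words)]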
    <Paragraph position="1"> s1: Build a paragraph-search query out of the content words, phrases and other terms found in the title, a user-supplied topic description (if available), as well as the terms occurring frequently in the text.
s2: Reconnect adjacent passages that display strong cohesiveness by one-way background links, using handles such as outgoing anaphors and other backward references. A background link from passage N+1 to passage N means that if passage N+1 is selected for a summary, passage N must also be selected. Link consecutive passages until all references are covered.</Paragraph>
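The sketch below illustrates steps s1 and s2 under simplifying assumptions: the query is the set of content words drawn from the title, the optional topic statement and the most frequent terms in the text, and a passage is linked to its predecessor whenever it opens with one of a few anaphoric words. The stop-word list and the anaphor test are placeholders, not the handles actually used by the system.

    # Sketch of steps s1 (paragraph-search query) and s2 (one-way background links).
    from collections import Counter

    STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "that", "is", "was"}
    ANAPHORS = {"this", "these", "those", "he", "she", "they", "it", "such"}

    def content_words(text):
        return [w for w in text.lower().split() if w.isalpha() and w not in STOPWORDS]

    def build_query(title, topic, passages, k=20):
        query = set(content_words(title)).union(content_words(topic))
        freq = Counter(w for p in passages for w in content_words(p))
        query.update(w for w, _ in freq.most_common(k))   # frequently occurring terms
        return query

    def background_links(passages):
        # links[n+1] = n: if passage n+1 is selected, passage n must also be selected
        links = {}
        for i in range(1, len(passages)):
            first = passages[i].lower().split()[:1]
            if first and first[0] in ANAPHORS:
                links[i] = i - 1
        return links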
    <Paragraph position="2"> s3: Score all passages, including the linked groups, with respect to the paragraph-search query. Assign a point for each co-occurring term. The goal is to maximize the overlap, so multiple occurrences of the same term do not increase the score.
s4: Normalize passage scores by their length, taking into account the desired target length of the summary. The goal is to keep the summary length as close to the target length as possible. The weighting formula is designed so that small deviations from the target length are acceptable, but large deviations will rapidly decrease the passage score. The exact formulation of this scheme depends upon the desired tradeoff between summary length and content. The following is the basic formula for scoring passage P of length l against the passage-search query Q and the target summary length t, as used in the current version of our summarizer:</Paragraph>
    <Paragraph position="4"> with prem(P) as a cumulative non-content-based score premium (cf. s7).</Paragraph>
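The formula itself did not survive in this version of the text; the sketch below only illustrates its general shape as described above: the raw score counts distinct query terms present in the passage, a length factor tolerates small deviations from the target length while punishing large ones, and prem(P) is added on top. The particular penalty function is an assumption made for illustration, not the actual weighting formula.

    # Illustrative passage-scoring sketch for steps s3/s4; the length penalty
    # below is an assumed shape, not the system's actual formula.
    def score(passage_words, query, length, target, premium=0.0):
        overlap = len(set(passage_words).intersection(query))   # repeats don't add points
        deviation = abs(length - target) / float(target)
        length_factor = 1.0 / (1.0 + deviation ** 2)   # small deviations acceptable,
        return overlap * length_factor + premium       # large ones rapidly hurt the score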
    <Paragraph position="5"> s5: Discard all passages with length in excess of 1.5 times the target length. This reduces the number of passage combinations the summarizer has to consider, thus improving its efficiency. The decision whether to use this condition depends upon our tolerance to length variability. In extreme cases, to prevent obtaining empty summaries, the summarizer will default to the first paragraph of the original text.</Paragraph>
    <Paragraph position="6"> s6: Combine passages into groups of 2 or more based on their content, composition and length. The goal is to maximize the score, while keeping the length as close to the target length as possible. Any combination of passages is allowed, including non-consecutive passages, although the original ordering of passages is retained. If a passage attached to another through a background link is included into a group, the other passage must also be included, and this rule is applied recursively. We need to note that the background links work only one way: a passage which is a background for another passage may stand on its own if selected into a candidate summary.</Paragraph>
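A sketch of the recursive background-link closure in step s6, assuming the one-way link table built in s2: whenever a passage enters a candidate group, every passage reachable through backward links is pulled in as well, while a background passage on its own needs no such extension.

    # Sketch of step s6's recursive closure over one-way background links.
    def close_over_links(group, links):
        closed = set(group)
        stack = list(group)
        while stack:
            p = stack.pop()
            if p in links and links[p] not in closed:   # p requires its background passage
                closed.add(links[p])
                stack.append(links[p])
        return closed   # original passage order is restored when the summary is assembled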
    <Paragraph position="7"> s7: Recalculate scores for all newly created groups. This is necessary because a group score cannot be obtained as the sum of its passage scores, owing to possible term repetitions. Again, discard any passage groups longer than 1.5 times the target length. Add premium scores to groups based on the inverse degree of text discontinuity, measured as the total amount of elided text material between the passages within a group. Add other premiums as applicable.</Paragraph>
    <Paragraph position="8"> s8: Rank passage groups by score. All groups become candidate summaries.</Paragraph>
    <Paragraph position="9"> s9: Repeat steps s6 through s8 until there is no change in top-scoring passage group through 2 consecutive iterations. Select the top scoring passage or passage group as the final summary.</Paragraph>
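At a very high level, steps s6 through s9 can be read as the following loop, sketched here with the group-generation and rescoring functions left abstract (they are assumed to behave as outlined above); this is a sketch of the control flow, not our implementation.

    # High-level sketch of the s6-s9 loop: grow, rescore and rank candidate
    # groups until the top-scoring candidate is unchanged across iterations.
    def select_summary(passages, query, target, build_groups, rescore):
        previous_top = None
        candidates = [frozenset([i]) for i in range(len(passages))]
        while True:
            candidates = list(build_groups(candidates, passages, target))   # step s6
            if not candidates:
                return [0]                           # extreme case: default to first paragraph
            candidates.sort(key=lambda g: rescore(g, passages, query, target),
                            reverse=True)            # steps s7 and s8
            top = candidates[0]
            if top == previous_top:                  # step s9: top group stable, stop
                return sorted(top)
            previous_top = top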
    <Paragraph position="10"> Implementation and some Examples
The summarizer has been implemented in C++ with a Java interface as a demonstration system, primarily for news summarization. At this time it can run in both batch and interactive modes under Solaris, and it can also be accessed via the Web using a Java-compatible browser. Below, we present a few example summaries. For easy orientation, paragraphs are numbered in the order in which they appear in the original text.</Paragraph>
    <Paragraph position="11"> TITLE: Mrs. Clinton Says U.S. Needs 'Ways That
(6) The United States, Mrs. Clinton said, must become "a nation that doesn't just talk about family values but acts in ways that values families."
SUMMARY TYPE: indicative
TARGET LENGTH: 15%
TOPIC: Hidden cameras used in news reporting
(4) Roone Arledge, the president of ABC News, defended the methods used to report the segment and said ABC would appeal the verdict.</Paragraph>
    <Paragraph position="12"> "They could never contest the truth" of the broadcast, Arledge said. "These people were doing awful things in these stores." Wednesday's verdict was only the second time punitive damages had been meted out by a jury in a hidden-camera case. It was the first time punitive damages had been awarded against producers of such a segment, said Neville L. Johnson, a lawyer in Los Angeles who has filed numerous hidden-camera cases against the major networks.</Paragraph>
    <Paragraph position="13"> Many journalists argue that hidden cameras and other undercover reporting techniques have long been necessary tools for exposing vital issues of public policy and health. But many media experts say television producers have overused them in recent years in a push to create splashy shows and bolster ratings. The jurors, those experts added, may have been lashing out at what they perceived as undisciplined and overly aggressive news organizations.</Paragraph>
    <Paragraph position="14"> TITLE: U.S. Buyer of Russian Uranium Said to Put
In a postscript to the Cold War, the American government-owned corporation that is charged with reselling much of Russia's military stockpile of uranium as civilian nuclear reactor fuel turned down repeated requests this year to buy material sufficient to build 400 Hiroshima-size bombs.</Paragraph>
    <Paragraph position="15"> The incident raises the question of whether the corporation, the U.S. Enrichment Corp., put its own financial interest ahead of the national-security goal of preventing weapons-grade uranium from falling into the hands of terrorists or rogue states. The corporation has thus far taken delivery from Russia of reactor fuel derived from 13 tons of bomb-grade uranium. "The nonproliferation objectives of the agreement are being achieved," a spokesman for the Enrichment Corp. said. But since the beginning of the program, skeptics have questioned the wisdom of designating the Enrichment Corp. as Washington's "executive agent" in managing the deal with Russia's Ministry of Atomic Energy, or MINATOM.</Paragraph>
    <Paragraph position="16"> Domenici, chairman of the energy subcommittee of the Senate Appropriations Committee, which is shepherding the privatization plan through Congress, was never informed of the offer by the administration. After learning of the rebuff to the Russians, he wrote to Curtis asking that the Enrichment Corp. "be immediately replaced as executive agent" and warning that "under no circumstances should the sale of the USEC proceed until this matter is resolved." Once Domenici entered the fray, the administration changed its tune.</Paragraph>
    <Paragraph position="17"> Curtis sent a letter to Domenici stating that all the problems blocking acceptance of the extra six tons had been solved. People close to the administration said that the Enrichment Corp. has now been advised to buy the full 18-ton shipment in 1997. Moreover, Curtis quickly convened a new committee to monitor the Enrichment Corp. for signs of foot-dragging.</Paragraph>
    <Section position="1" start_page="226" end_page="227" type="sub_section">
      <SectionTitle>
Evaluation
</SectionTitle>
      <Paragraph position="0"> Our program has been tested on a variety of news-like documents, including Associated Press news-wire messages, articles from the New York Times, The Wall Street Journal, Financial Times, San Jose Mercury, as well as documents from the Federal Register, and the Congressional Record. The summarizer is domain independent, and it can be easily adapted to most European languages. It is also very robust: we used it to derive summaries of thousands of documents returned by an information retrieval system. Early results from these evaluations indicate that the summaries generated using our DMS method offer an excellent tradeoff between time/length and accuracy. Our summaries tend to be shorter and contain less extraneous material than those obtained using different methods. This is further confirmed by the favorable responses we received from the users.</Paragraph>
      <Paragraph position="1"> Thus far there has been only one systematic multi-site evaluation of summarization approaches, conducted in early 1998, organized by U.S. DARPA in the tradition of Message Understanding Conferences (MUC) (DAR 1993) and Text Retrieval Conferences (TREC) (Harman 1997a), which have proven successful in stimulating research in their respective areas: information extraction and information retrieval. The summarization evaluation focused on the content representativeness of indicative summaries and the comprehensiveness of informative summaries.</Paragraph>
      <Paragraph position="2"> Other factors affecting the quality of summaries, such as brevity, readability, and usefulness were evaluated indirectly, as parameters of the main scores. For more details see (Firmin &amp; Sundheim 1998).</Paragraph>
      <Paragraph position="3"> The indicative summaries were scored for relevance to pre-selected topics and compared to the classification of respective full documents. In this evaluation, a summary was considered successful if it preserved the original document's relevance or nonrelevance to a topic. Moreover, the recall and precision scores were normalized by the length of the summary (in words) relative to the length of the original document, as well as by the clock time taken by the evaluators to reach their topic relevance decisions.</Paragraph>
      <Paragraph position="4"> The first normalization measured the degree of content compression provided by the summaries, while the second normalization was intended to gauge their readability. The results showed a strong correlation between these two measures, which may indicate that readability was in fact equated with meaningfulness, that is, hard to read summaries were quickly judged non-relevant.</Paragraph>
      <Paragraph position="5"> For all the participants the best summaries scored better than the fixed-length summaries. When normalized for length, our summarizer had the highest score for best summaries and took second place for fixed-length summaries. The F-scores for indicative topical summaries (best and fixed-length) were very close for all participants. Apparently it is easier to generate a topical summary than a general summary. Normalizing for length did move our score</Paragraph>
    </Section>
    <Section position="2" start_page="227" end_page="229" type="sub_section">
      <Paragraph position="0"> up, but again, there was no significant difference between participants.</Paragraph>
      <Paragraph position="1"> The informative (topical) summaries were scored for their ability to provide answers to who, what, when, how, etc. questions about the topics. These questions were unknown to the developers, so systems could not directly extract facts to satisfy them. Again, scores were normalized for summary length, but no time normalization was used. This evaluation was done on a significantly smaller scale than for the indicative summaries, simply because scoring for question answering was more time consuming for the human judges than categorization decisions.</Paragraph>
      <Paragraph position="2"> This evaluation could probably be recast as a categorization problem, if we assumed that the questions in the test were the topics, and that a summary needed to be relevant to multiple topics.</Paragraph>
      <Paragraph position="3"> Informative summaries were generated using the same general algorithm with two modifications.</Paragraph>
      <Paragraph position="4"> First, the expected summary length was set at 30% of the original, following an observation made by the conference organizers while evaluating human-generated summaries. Second, since the completeness of an informative summary was judged on the basis of whether it contained satisfactory answers to questions which were not part of the topic specification, we added extra scores to passages containing possible answers: proper names (who, where) and numerics (when, how much). Finally, we note that the test data used for evaluation, while generally of a news-like genre, varied greatly in content, style and subject matter; therefore, domain independence was critical.</Paragraph>
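A minimal sketch of the second modification, under the simplifying assumption that capitalized, non-sentence-initial tokens approximate proper names (who, where) and digit-bearing tokens approximate numerics (when, how much); the premium values and the name test are illustrative only, not the heuristics used in the system.

    # Sketch of the informative-mode premium for likely answer-bearing passages.
    def answer_premium(passage, name_bonus=0.5, number_bonus=0.5):
        tokens = passage.split()
        premium = 0.0
        for i, tok in enumerate(tokens):
            word = tok.strip(".,;:\"'()")
            if any(ch.isdigit() for ch in word):                # numerics: when, how much
                premium += number_bonus
            elif i > 0 and word[:1].isupper() and tokens[i - 1][-1:] not in ".!?":
                premium += name_bonus                           # rough proper-name test: who, where
        return premium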
      <Paragraph position="5"> Again our summarizer performed quite well, although the results are less significant since the experiment was carried out on such a small scale. The results were separated out for three different queries.</Paragraph>
      <Paragraph position="6"> For two queries the system was very close to the top performing system, and for the third query the system had an F-score of about 0.61 versus 0.77 for the best system.</Paragraph>
      <Paragraph position="7"> In general we are quite pleased with the summarizer's performance, especially since our system was not trained on the kind of texts that we had to summarize.
Related Work and Future Work
The current summarizer is still undergoing improvement and adaptation in order to be able to summarize more than a single text news document at a time. At the same time we are investigating how summarization can be used in related but different problems. Both are described below.</Paragraph>
      <Paragraph position="8"> A better and more flexible summarizer
Currently our summarizer is especially tuned for English single-document, text-only news summarization.</Paragraph>
      <Paragraph position="9"> While we are still working on improving this, we also want the system to be able to summarize a wider variety of documents. Many challenges remain, including summarization of non-news documents, multi-modal documents (such as web pages), foreign language documents and (small or large) groups of documents covering one or more topics.</Paragraph>
      <Paragraph position="10"> Typically, a user needs summarization the most when dealing with a large number of documents.</Paragraph>
      <Paragraph position="11"> Therefore, the next logical step is to summarize more than one document at a time. At the moment we are focusing on multi-document (cross-document) summarization of English text-only news documents. Just as for single-document summarization, multi-document summarization can be generic or topical, and indicative or informative. Other factors that will influence the type of summary are the number of documents (a large versus a small set) and the variety of topics discussed by the documents (are the documents closely related, or can they cover very different topics). Presentation of a multi-document summary offers a wide variety of choices. One could create one large text summary that gives an overview of all the main issues mentioned in all the documents. Or one could give different short summaries for similar documents. If the number of documents is very large, it might be best to create nested summaries with high-level descriptions and the possibility to 'zoom in' on a subgroup with a more specific summary. A user will probably want the ability to trace information in a summary back to its original document; source information should be a part of the summary. If one views summarization in the context of tracking a topic, the main goal of the summary might be to show the new information each subsequent document contains, while not repeating information already mentioned in previous documents.</Paragraph>
      <Paragraph position="12"> Another type of summary might highlight the similarities the documents have (e.g., all these documents are on protection of endangered species) and point out the differences between them (e.g., one on bald eagles, some on Bengal tigers, ...). As one can see, there are many questions to be answered, and the answers depend partially on the task environment the summarizer will be used in.</Paragraph>
      <Paragraph position="13"> Currently we are focusing on summarizing a small set of text-only documents (around 20), all on a similar topic. The summary will reflect the main points/topics discussed by the documents. Topics discussed by more than one document should be mentioned only once in the summary, together with their different sources. When generating the summary we want to ensure coherence by placing related topics close to each other. The main issues we are addressing are the detection of similar information, in order to avoid repetition in the summary, and the detection of related information, in order to generate a coherent summary. This work is currently in progress.</Paragraph>
      <Paragraph position="14"> Our next step will be summarizing large amounts of similar information.</Paragraph>
      <Paragraph position="15"> Applying summarization to different problems
Information retrieval (IR) is the task of selecting documents from a database in response to a user's query, and ranking these documents according to relevance.</Paragraph>
      <Paragraph position="16"> Currently we are investigating the use of summarization to build (either automatically or with the help of the user) more effective information-need statements for an automated document search system. The premise is quite simple: use the initial user's statement of information need to sample the database for documents, summarize the returned documents topically, then add selected summaries to the initial statement to make it richer and more specific. Adding appropriate summaries can be done either by the user, who reads the summaries, or automatically. Both approaches are described in our other paper appearing in this volume.</Paragraph>
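The sketch below illustrates this expansion loop in its fully automatic variant, assuming generic retrieval and topical-summarization functions; the number of sampled documents and of summaries added back are arbitrary choices made for the example, and the actual system described in the companion paper may differ.

    # Sketch of summary-based query expansion: sample the database, summarize
    # the returned documents topically, append selected summaries to the query.
    def expand_query(query, search, summarize, n_docs=10, n_add=3):
        docs = search(query)[:n_docs]                         # sample the database
        summaries = [summarize(doc, topic=query) for doc in docs]
        # automatic variant: keep the summaries of the top-ranked documents
        return query + " " + " ".join(summaries[:n_add])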
      <Paragraph position="17"> The task of tracking a topic consists of identifying those information segments in an information stream that are relevant to a certain topic. Topic tracking is one of the three main tasks in TDT (Topic Detection and Tracking), and the one we hope to use our summarizer for. The information stream consists of news, either from television or radio broadcasts. Speech from these programs has been recognized by a state-of-the-art automatic speech recognition system and also transcribed by human transcriptionists. A topic is defined implicitly by a set of training stories that are given to be on this topic. The basic idea behind our approach is simple. We use the training stories to create a set of keywords (the query). Since we process continuous news, the input is not segmented into paragraphs or any other meaningful text unit. Before applying our summarizer, each story is divided into equal word-size segments. We summarize every story using our query, and use the similarity of the summary to the query to decide whether a story is on topic or not. We are still in the process of refining our system and hope to have our first results soon. Initial results suggest that this is a viable approach. It is encouraging to notice that the absence of a paragraph structure does not prevent the system from generating useful summaries.</Paragraph>
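A sketch of the tracking loop just described, with several assumptions: the topic query is built from the most frequent words of the training stories, each incoming story is cut into equal word-size segments before summarization, and a simple term-overlap similarity with a fixed threshold decides whether the story is on topic. The segment size, query size and threshold are placeholders, not the system's settings.

    # Sketch of topic tracking via summarization: build a query from training
    # stories, summarize each incoming story against it, and threshold the
    # similarity between summary and query.
    from collections import Counter

    def make_topic_query(training_stories, k=30):
        counts = Counter(w.lower() for s in training_stories
                         for w in s.split() if w.isalpha())
        return {w for w, _ in counts.most_common(k)}

    def equal_size_segments(story, size=50):
        words = story.split()
        return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

    def on_topic(story, query, summarize, threshold=0.2):
        summary = summarize(equal_size_segments(story), query)   # topical summary
        summary_words = {w.lower() for w in summary.split()}
        similarity = len(summary_words.intersection(query)) / max(len(query), 1)
        return similarity >= threshold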
    </Section>
  </Section>
</Paper>