<?xml version="1.0" standalone="yes"?> <Paper uid="P98-2222"> <Title>Using Leading Text for News Summaries: Evaluation Results and Implications for Commercial Summarization Applications</Title> <Section position="4" start_page="1364" end_page="1364" type="metho"> <SectionTitle> LEAD(CLINTON) AND BUDGET </SectionTitle> <Paragraph position="0"> This query will retrieve only those documents that contain &quot;Clinton&quot; in the LEAD and &quot;budget&quot; anywhere in the document. Customers who use LEAD routinely combine it with the HEADLINE field.</Paragraph> <Paragraph position="1"> We tested 20 queries on a database that contains 20 million documents from more than 10,000 English language news publications. Each query was applied to the HEADLINE and BODY fields (abbreviated here as HBODY) and to the HEADLINE and LEAD fields (HLEAD). Queries were limited by date in order to reduce the magnitude of the evaluation task. In order to obtain a more complete picture of recall, other queries were used to identify relevant documents that the tested queries missed.</Paragraph> <Paragraph position="2"> The results in Table 1 show that limiting Boolean queries to leading text can help Searchable LEAD's targeted customers.</Paragraph> <Section position="1" start_page="1364" end_page="1364" type="sub_section"> <SectionTitle> lean Retrieval Quality (20-query test) </SectionTitle> <Paragraph position="0"> Searchable LEAD document processing software consists of a 500-statement PL1 program and a 23-rule sentence and paragraph boundary recognition grammar, and operates in a mainframe MVS environment. 
Searchable LEAD processes over 500,000 characters (90 news documents) per CPU second.</Paragraph> </Section> </Section> <Section position="5" start_page="1364" end_page="1364" type="metho"> <SectionTitle> 3 Related Work </SectionTitle> <Paragraph position="0"> There is a growing body of research into approaches for generating text summaries, including approaches based on sentence extraction (Kupiec et al., 1995), text generation from templates (McKeown and Radev, 1995) and machine-assisted abstraction (Tsou et al., 1992). Brandow et al. (1995) reported on a sentence extraction approach called the Automatic News Extraction System, or ANES. ANES combined statistical corpus analysis, signature word selection and sentence weighting to select sentences for inclusion in summaries. By varying the number of sentences selected, ANES-generated extracts could meet targeted summary lengths.</Paragraph> <Paragraph position="1"> ANES was evaluated using a corpus of 250 documents from newswire, magazine and newspaper publications. ANES was used to generate three summaries for each document, targeting summary lengths of 60, 150 and 250 words. For a baseline comparison, a modified version of the Searchable LEAD software was used to create three fixed length leading text summaries for each document, also targeting lengths of 60, 150 and 250 words.</Paragraph> <Paragraph position="2"> News analysts read each document and its corresponding summaries, and rated the summaries on their acceptability. Table 2 shows the results for each approach. Overall, 74% of the ANES summaries were judged to be acceptable. Unexpectedly, the acceptability rate for leading text summaries was significantly higher. Overall, 92% of the leading text summaries were judged to be acceptable.</Paragraph> <Paragraph position="3"> tween ANES and Leading Text The results for both approaches showed a promising start towards the goal of creating summaries for news documents. 
However, those results also raised questions about leading text. We wanted to better understand the value of leading texts as general purpose news document summaries.</Paragraph> </Section> <Section position="6" start_page="1364" end_page="1365" type="metho"> <SectionTitle> 4 Methodology </SectionTitle> <Paragraph position="0"> Our investigation had two goals: to verify on a larger scale the results that Brandow et al. (1995) suggested for leading text, and to determine whether there are easily definable indicators of where leading text extracts fare poorly as general purpose news document summaries.</Paragraph> <Paragraph position="1"> We used the Searchable LEAD definition of leading text as our summaries. The LEAD fields vary in length based on overall document length, which we believe helps them capture the logical lead. Also, LEAD fields already existed in our news documents in support of another application, Boolean retrieval. We did not modify Searchable LEAD software or any LEAD fields for this investigation.</Paragraph> <Paragraph position="2"> The test corpus consisted of 2,727 documents from more than 100 English language news publications. Documents were retrieved from our news database using several queries. Some queries were biased towards longer documents or to sources that provide transcripts. We believed that LEADs for such documents would pose more problems than would LEADs for typical news stories, based on past informal observations of LEAD fields. Because of the query bias, the test corpus does not represent our news database. For example, only 5.5% of the documents in the test corpus were less than 120 words long, whereas 18% of the documents in our news database are that short. 
Newspapers provide almost 60% of the documents in our news database but only a third of the test corpus documents.</Paragraph> <Paragraph position="3"> In order to investigate where LEADs might fail as summaries, we assigned attributes to each document that allowed us to examine various subsets of the test corpus. Attributes included the following: * BODY field and LEAD field word counts * Source type (newspaper, wire service, newsletter, magazine, transcript service) * Subject matter (biographical, financial, legal, legal news, other news, reviews, scientific) * Document type (general news, which includes standard news articles, graphics, editorials, LEAD=BODY, letters/Q&amp;A columns, and music and book reviews; lists; newsbriefs; and television program transcripts) * United States or non-United States source. News analysts read each document and rated its corresponding LEAD field on its acceptability as a general purpose summary for that document. They rated the LEADs as either acceptable or unacceptable. Ratings were linked to document attributes in an evaluation file that contained one record for each document. This file was analyzed to obtain descriptive information about the test corpus and to compare attributes and ratings.</Paragraph> </Section> <Section position="7" start_page="1365" end_page="1366" type="metho"> <SectionTitle> 5 Results </SectionTitle> <Paragraph position="0"> Overall, 82.3% of LEADs were rated acceptable as summaries. However, because of differences between test corpus content and the content of our news database, this acceptability rate is not an overall indicator for our news database.</Paragraph> <Paragraph position="1"> Document type was the most distinguishing attribute for identifying potential problem LEADs. For the general news document type, 94.1% of LEADs were rated acceptable as summaries. 
Acceptability rates were much lower for lists, newsbriefs and transcripts, as Table 3 shows.</Paragraph> <Section position="1" start_page="1365" end_page="1366" type="sub_section"> <SectionTitle> Types </SectionTitle> <Paragraph position="0"> The 94.1% acceptability rate for general news documents is not appreciably different from the 92% average that Brandow et al. (1995) reported.</Paragraph> <Paragraph position="1"> The results for lists and newsbriefs were not surprising. Such documents seldom have logical leads.</Paragraph> <Paragraph position="2"> Lists primarily consist of several like items, such as products and their prices, or companies and corresponding stock quotes. In rare instances, the BODY of a list type document includes a brief description of the contents of the list that Searchable LEAD can capture. In most cases, however, there is nothing meaningful for any technology to extract.</Paragraph> <Paragraph position="3"> Newsbrief documents usually consist of several, often unrelated, stories combined into one document.</Paragraph> <Paragraph position="4"> In some newsbrief documents, however, there is an introduction that Searchable LEAD can exploit.</Paragraph> <Paragraph position="5"> This was especially true for newsbrief documents from wires (67.4% acceptability on 46 documents), but rarely true for either magazines (13.8% acceptability on 109 documents) or newspapers (3.1% acceptability on 32 documents).</Paragraph> <Paragraph position="6"> LEADs for transcript type documents fared somewhat better, with source being a factor for these also. 
LEADs for transcripts from transcript sources were less likely to be rated acceptable (67.8% acceptability on 435 documents) than those from wires (90.0% acceptability on 40 documents) or newsletters (83.3% acceptability on 24 documents).</Paragraph> <Paragraph position="7"> Among general news documents, only LEADs for the review sub-type had a low acceptability rate, as</Paragraph> </Section> <Section position="2" start_page="1366" end_page="1366" type="sub_section"> <SectionTitle> Document Sub-types </SectionTitle> <Paragraph position="0"> The distribution of list, newsbrief and transcript type documents was often the cause of other apparent problem-indicating attributes. For example, the overall acceptability rate for LEADs for United States sources was 80.1% on 2,141 documents, whereas the overall acceptability rate for non-United States sources was 90.4% on 586 documents.</Paragraph> <Paragraph position="1"> When list, newsbrief and transcript documents were removed, the acceptability rate for United States sources was 94.5% on 1,391 documents, and the acceptability rate for non-United States sources was 93.0% on 560 documents.</Paragraph> <Paragraph position="2"> When examining other general news document attributes, we found that only LEADs for magazines had a somewhat lower acceptability rate (Table 5).</Paragraph> <Paragraph position="3"> by Source Type The review sub-type was a factor here. Many of those were from magazines. Excluding those, the acceptability rate for magazine LEADs climbed to 92.5%, still lower than for any other source.</Paragraph> <Paragraph position="4"> Document length was a factor for LEAD acceptability for the entire test corpus, but list, newsbrief and transcript type documents are typically longer than general news documents. 
Document length was less of a factor when looking only at LEADs for general news documents (Table 6).</Paragraph> </Section> </Section> <Section position="8" start_page="1366" end_page="1366" type="metho"> <SectionTitle> BODY Length Number of Acceptability </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="1366" end_page="1366" type="sub_section"> <SectionTitle> by Document Length </SectionTitle> <Paragraph position="0"> The length of the LEAD itself was not tied to acceptability for either the entire test corpus or the general news document subset.</Paragraph> </Section> </Section> </Paper>