<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-0707">
  <Title>DUC 2005: Evaluation of Question-Focused Summarization Systems</Title>
  <Section position="4" start_page="48" end_page="49" type="metho">
    <SectionTitle>
2 Task Description
</SectionTitle>
    <Paragraph position="0"> The DUC 2005 task was a complex question-focused summarization task that required summarizers to piece together information from multiple documents to answer a question or set of questions as posed in a topic.</Paragraph>
    <Paragraph position="1"> Assessors developed a total of 50 topics to be used as test data. For each topic, the assessor selected 25-50 related documents from the Los Angeles Times and Financial Times of London and formulated a topic statement, which was a request for information that could be answered using the selected documents. The topic statement could be in the form of a question or set of related questions and could include background information that the assessor thought would help clarify his/her information need.</Paragraph>
    <Paragraph position="2"> The assessor also indicated the &amp;quot;granularity&amp;quot; of the desired response for each topic. That is, they indicated whether they wanted the answer to their question(s) to name specific events, people, places, etc., or whether they wanted a general, high-level answer. Only one value of granularity was given for each topic, since the goal was not to measure the effect of different granularities on system performance for a given topic, but to provide additional information about the user's preferences to both human and automatic summarizers.</Paragraph>
    <Paragraph position="3"> An example DUC topic follows: num: D345 title: American Tobacco Companies Overseas null narr: In the early 1990's, American tobacco companies tried to expand their business overseas. What did these companies do or try to do and where? How did their parent companies fare? granularity: specific The summarization task was the same for both human and automatic summarizers: Given a DUC topic with granularity specification and a set of documents relevant to the topic, the summarization task was to create from the documents a brief,  well-organized, fluent summary that answers the need for information expressed in the topic, at the specified level of granularity. The summary could be no longer than 250 words (whitespacedelimited tokens). Summaries over the size limit were truncated, and no bonus was given for creating a shorter summary. No specific formatting other than linear was allowed. The summary should include (in some form or other) all the information in the documents that contributed to meeting the information need.</Paragraph>
    <Paragraph position="4"> Ten assessors produced a total of 9 human summaries for each of 20 topics, and 4 human summaries for each of the remaining 30 topics. The summarization task was a relatively difficult task, requiring about 5 hours to manually create each summary. Thus, there would be a real benefit to users if the task could be performed automatically.</Paragraph>
  </Section>
  <Section position="5" start_page="49" end_page="49" type="metho">
    <SectionTitle>
3 Participants
</SectionTitle>
    <Paragraph position="0"> There was much interest in the longer, question-focused summaries required in the DUC 2005 task. 31 participants submitted runs to the evaluation; they are identified by numeric Run IDs (2-32) in the remainder of this paper. We also developed a simple baseline system that returned the first 250 words of the most recent document for each topic (Run ID = 1). In addition to the automatic peers, there were 10 human peers, assigned alphabetic Run IDs, A-J.</Paragraph>
    <Paragraph position="1"> Most system developers treated the summarization task as a passage retrieval task. Sentences were ranked according to relevance to the topic.</Paragraph>
    <Paragraph position="2"> The most relevant sentences were then selected for inclusion in the summary while minimizing redundancy within the summary, up to the maximum 250-word allowance. A significant minority of systems first decomposed the topic narrative into a set of simpler questions, and then extracted sentences to answer each subquestion. Systems differed in the approach taken to compute relevance and redundancy, using similarity metrics ranging from simple term frequency to semantic graph matching. In order to include more relevant information in the summary, systems attempted within-sentence compression by removing phrases such as parentheticals and relative clauses.</Paragraph>
    <Paragraph position="3"> Many systems simply ignored the granularity specification. The systems that addressed granularity did so by preferring to extract sentences that contained proper names for topics with a &amp;quot;specific&amp;quot; granularity but not for topics with &amp;quot;general&amp;quot; granularity.</Paragraph>
    <Paragraph position="4"> Cross-sentence dependencies had to be handled, including anaphora. Strategies for dealing with pronouns that occurred in relevant sentences included co-reference resolution, including the previous sentence for additional context, or simply excluding all sentences containing any pronouns.</Paragraph>
    <Paragraph position="5"> Most systems made no attempt to reword the extracted sentences to improve the readability of the final summary. Although some systems grouped related sentences together to improve cohesion, the most common heuristic to improve readability was simply to order the extracted sentences by document date and position in the document. System 12 achieved high readability scores by choosing a single representative document and extracting sentences in the order of appearance in that document. This approach is similar to the base-line summarizer and produces summaries that are more fluent than those constructed from multiple documents.</Paragraph>
  </Section>
class="xml-element"></Paper>