<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-0405">
  <Title>Using Summaries in Document Retrieval</Title>
  <Section position="10" start_page="0" end_page="0" type="concl">
    <SectionTitle>
7 Discussion
</SectionTitle>
    <Paragraph position="0"> As Table 2 and the answer set size statistics show, targeted customers benefit when they limit their queries to the Searchable LEAD. As Table 3 suggests, non-targeted customers, such as those who want documents with any reference to the topics, clearly should not use Searchable LEAD.</Paragraph>
    <Paragraph position="1"> Most information retrieval system evaluations do not take differing customer perspectives into account. If we were to combine the results of our all reference and highly relevant reference evaluation scopes into one general evaluation pool, as is done in Table 4, we would still note a significant improvement in precision. However, based on the falling F-measure, some might conclude that summaries are simply another failed attempt to improve information retrieval with the help of natural language processing.</Paragraph>
    <Paragraph position="2"> For the average customer overall, that is probably a fair conclusion. However, for the customer segments that prefer to retrieve a few good, highly relevant documents as they start an information-seeking task, Searchable LEAD helped to produce smaller answer sets that were more focused on the highly relevant documents that those customers target.</Paragraph>
    <Paragraph position="3">  both the highly relevant reference and all reference evaluation scopes.</Paragraph>
    <Paragraph position="4"> As for other retrieval tasks, when creating a document categorization system, we did gain some benefit from weighting terms found in headlines and leading text in news documents slightly higher (Wasson, 2000), but that effect is limited to news data. A colleague investigating an internal tf-idf-based search engine found no benefit from putting extra emphasis on terms found in the first paragraph of news articles, but that was a rather limited test. Neither of these was evaluated from multiple user perspectives, although in the case of Wasson (2000) the original project goal was to identify and categorize only highly relevant documents.</Paragraph>
  </Section>
</Paper>