<?xml version="1.0" standalone="yes"?>
<Paper uid="P02-1059">
  <Title>Supervised Ranking in Open-Domain Text Summarization</Title>
  <Section position="9" start_page="0" end_page="0" type="concl">
    <SectionTitle>
8 Conclusion
</SectionTitle>
    <Paragraph position="0"> As a way of exploiting human biases to improve summarizer performance, we have explored approaches to embedding supervised learning within a general unsupervised framework. In this paper, we focused on the use of decision trees as plug-in learners. We have shown empirically that the idea works for a number of decision trees, including C4.5, MDL-DT and SSDT. Coupled with the learning component, the unsupervised summarizer based on clustering significantly improved its performance on the corpus of human-created summaries. More importantly, we found that supervised learners perform better when coupled with the clustering than when working alone. We argued that this has to do with the high variation in human-created summaries: the clustering component forces a decision tree to pay more attention to sentences marginally relevant to the main thread of the text.</Paragraph>
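The combination described above can be sketched in a few lines. This is not the authors' implementation: the clustering step is assumed to have already produced sentence groups, and a stand-in scoring function plays the role of a ProbDT's class probability.

```python
# Minimal sketch (not the paper's code) of coupling an unsupervised
# clustering component with a supervised probabilistic ranker: the
# clusters supply diversity, the learner ranks within each cluster.

def summarize(clusters, prob_select, k=2):
    """Pick the top-ranked sentence from each cluster until k are chosen.

    clusters    -- list of lists of sentences (output of the clustering step)
    prob_select -- callable: sentence to estimated P(sentence is summary-worthy)
    """
    picks = []
    for cluster in clusters:
        best = max(cluster, key=prob_select)   # supervised ranking step
        picks.append(best)
        if len(picks) == k:
            break
    return picks

# Toy stand-in for a decision tree's class probability: longer
# sentences score higher (purely illustrative).
toy_prob = lambda s: len(s.split())

clusters = [
    ["Costs fell.", "Quarterly costs fell by ten percent overall."],
    ["Revenue grew.", "Revenue grew in all three regions this year."],
]
print(summarize(clusters, toy_prob))   # one sentence per cluster
```

Because one sentence is drawn from each cluster, even a weak ranker cannot collapse the summary onto a single topic, which is the intuition behind the diversity effect discussed above.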
    <Paragraph position="1"> While ProbDTs appear to work well for ranking, it is also possible to take a different approach: for instance, we may use some distance metric instead of probability to distinguish among sentences. It would be interesting to invoke a notion like the prototype modeler (Kalton et al., 2001) and see how it might fare when used as a ranking model.</Paragraph>
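One simple reading of this alternative is to rank sentences by their distance to a prototype feature vector. The sketch below uses the mean vector of known summary sentences as the prototype; this averaging choice is our assumption, not the Kalton et al. model.

```python
# Hedged sketch: rank candidate sentence vectors by Euclidean distance
# to a "prototype" vector instead of by a classifier's probability.
# The prototype is simply the mean of training summary-sentence vectors
# (an illustrative assumption).
import math

def prototype(vectors):
    """Mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def rank_by_distance(candidates, proto):
    """Closest to the prototype ranks first."""
    return sorted(candidates, key=lambda v: distance(v, proto))

training = [[1.0, 0.0], [0.8, 0.2]]    # vectors of known summary sentences
proto = prototype(training)            # roughly [0.9, 0.1]
ranked = rank_by_distance([[0.0, 1.0], [1.0, 0.0]], proto)
```

The ranking is then induced by the metric alone, with no probability estimate involved.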
    <Paragraph position="2"> Moreover, it may be worthwhile to explore some non-clustering approaches to representing the diversity of content in a text, such as Gong and Liu's (2001) summarizer 1 (GLS1, for short), where a sentence is selected on the basis of its similarity to the text it belongs to, but terms that appear in previously selected sentences are excluded. While our preliminary study indicates that GLS1 produces performance comparable or even superior to DBS on some tasks in the document retrieval domain, we have no results available at the moment on the efficacy of combining GLS1 and ProbDT on sentence extraction tasks.</Paragraph>
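The GLS1-style selection just described can be sketched greedily. This is our reading of the scheme, not Gong and Liu's code, and plain term overlap stands in for their similarity measure.

```python
# Illustrative sketch of GLS1-style selection: repeatedly pick the
# sentence most similar to the text, then exclude that sentence's
# terms from later rounds so subsequent picks cover new content.

def gls1(sentences, k):
    """Greedy selection with term exclusion after each pick."""
    sent_terms = [set(s.lower().split()) for s in sentences]
    doc_terms = set().union(*sent_terms)       # all terms in the text
    remaining = list(range(len(sentences)))
    chosen = []
    while remaining and len(chosen) != k:
        # similarity = number of still-active document terms covered
        best = max(remaining,
                   key=lambda i: len(sent_terms[i].intersection(doc_terms)))
        chosen.append(sentences[best])
        remaining.remove(best)
        doc_terms -= sent_terms[best]          # retire covered terms
    return chosen

sents = ["the cat sat on the mat", "a dog ran", "the cat purred"]
print(gls1(sents, 2))
```

After the first pick, its terms no longer count toward similarity, so the second pick is forced toward unseen vocabulary; this yields diversity without any clustering step.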
    <Paragraph position="3"> [Stray table caption; the text preceding this fragment is lost: "... on C4.5 with the MDL extension. DBS (=Z/V) denotes the diversity-based summarizer. Z represents the Z-model summarizer. Performance figures are in F-measure. 'V' indicates that the relevant classifier is diversity-enabled. Note that DBS =Z/V."]</Paragraph>
    <Paragraph position="4"> Finally, we note that the test corpus used for evaluation is somewhat artificial, in the sense that we elicit judgments from people on the summary-worthiness of a particular sentence in the text. Perhaps we should instead look at naturally occurring abstracts or extracts as a potential source of training and evaluation data for summarization research. Besides being natural, they usually come in large numbers, which may alleviate some concerns about the lack of sufficient resources for training learning algorithms in summarization.</Paragraph>
  </Section>
</Paper>