<?xml version="1.0" standalone="yes"?> <Paper uid="P06-1047"> <Title>Extractive Summarization using Inter- and Intra-Event Relevance</Title> <Section position="5" start_page="371" end_page="400" type="relat"> <SectionTitle> 4. Experiments and Discussions </SectionTitle> <Paragraph position="0"> To evaluate the proposed event-based summarization approaches, we conduct a set of experiments on 30 English document sets provided by the DUC 2001 multi-document summarization task. The documents are pre-processed with GATE to recognize the four types of named entities mentioned previously. On average, each set contains 10.3 documents, 602 sentences, 216 event terms and 148.5 named entities.</Paragraph> <Paragraph position="1"> To evaluate the quality of the generated summaries, we choose ROUGE, an automatic summary evaluation metric that has been used in the DUC evaluations. ROUGE is a recall-based metric for fixed-length summaries. It is based on n-gram co-occurrence and compares system-generated summaries against summaries written by human judges (Lin and Hovy, 2003). For each DUC document set, the system creates a 200-word summary, and we report three of the ROUGE metrics in the following experiments and evaluations: ROUGE-1 (unigram-based), ROUGE-2 (bigram-based), and ROUGE-W (based on the longest common subsequence, weighted by length).</Paragraph> <Paragraph position="2"> We first evaluate the summaries generated based on R(ET, NE) alone. In pre-evaluation experiments, we observed that some frequently occurring nouns, such as &quot;doctors&quot; and &quot;hospitals&quot;, are by themselves not marked by general NE taggers, although they indicate persons, organizations or locations.</Paragraph> <Paragraph position="3"> [Pattern examples: &quot;&lt;Person&gt;, a-position-name of &lt;Organization&gt;, does something&quot;; &quot;&lt;Person&gt; and another &lt;Person&gt; do something&quot;.]</Paragraph> <Paragraph position="4"> Table 3 compares the ROUGE scores obtained when such frequent nouns are, or are not, added to the set of named entities.
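As a concrete illustration of the recall-based n-gram co-occurrence idea behind ROUGE-N, the following is a minimal sketch (the function name and simplifications are ours; the official ROUGE toolkit additionally supports stemming, stop-word removal and multiple reference summaries):

```python
from collections import Counter

def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall: clipped n-gram overlap between candidate and
    reference, divided by the number of n-grams in the reference."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n])
                       for i in range(len(tokens) - n + 1))
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    # Clip each overlapping n-gram count by its count in the reference.
    overlap = sum(min(cand[g], ref[g]) for g in ref)
    return overlap / sum(ref.values()) if ref else 0.0
```

For example, with a 6-token reference, a candidate sharing 5 of its 6 unigram occurrences scores a ROUGE-1 recall of 5/6; ROUGE-2 is computed the same way over bigrams.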
A noun is considered a frequent noun when its frequency is larger than 10. Roughly a 5% improvement is achieved when highly frequent nouns are taken into consideration. Hereafter, when we mention NEs in later experiments, the highly frequent nouns are included.</Paragraph> <Paragraph position="5"> Table 4 presents the evaluation results obtained by using R(ET, ET) alone. It compares two relevance derivation approaches,</Paragraph> <Section position="1" start_page="373" end_page="400" type="sub_section"> <SectionTitle> R_WordNet and R_Document </SectionTitle> <Paragraph position="0"> The topic-specific relevance derived from the documents to be summarized outperforms the general-purpose WordNet relevance by about 4%. This result is reasonable, as WordNet may introduce word relatedness that is not present in the topic-specific documents. When we examine the event term pairs with the highest relevance in the relevance matrix, we find that pairs like &quot;abort&quot; and &quot;confirm&quot;, or &quot;vote&quot; and &quot;confirm&quot;, do reflect semantic (antonymous) and associative (causal) relations to some degree.</Paragraph> <Paragraph position="1"> Table 5 presents the results obtained with R(NE, NE). Looking more closely, we conclude that, compared to event terms, named entities are more representative of the documents in which they appear. In other words, event terms are more likely to be distributed across all the document sets, whereas named entities are more topic-specific and tend to cluster within a particular document set. Examples of highly related named entities in the relevance matrix are &quot;Andrew&quot; and &quot;Florida&quot;, and &quot;Louisiana&quot; and &quot;Florida&quot;. Although their relevance is not as explicit as that of event terms (it is contextual rather than semantic), we can still deduce that some events may happen in both Louisiana and Florida, or involve Andrew in Florida.
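The document-derived relevance idea above can be illustrated by a simple sentence-level co-occurrence sketch (a hypothetical simplification of ours; the paper's actual derivation formulas are not reproduced in this section):

```python
from collections import Counter
from itertools import combinations

def cooccurrence_relevance(sentences):
    """Derive a topic-specific term-term relevance from the documents
    themselves: count how often two terms co-occur in a sentence and
    normalize by the more frequent term's sentence count."""
    term_freq = Counter()
    pair_freq = Counter()
    for terms in sentences:  # each sentence is an iterable of terms
        uniq = sorted(set(terms))
        term_freq.update(uniq)
        pair_freq.update(combinations(uniq, 2))
    rel = {}
    for (a, b), c in pair_freq.items():
        # Symmetric relevance score in [0, 1].
        rel[(a, b)] = rel[(b, a)] = c / max(term_freq[a], term_freq[b])
    return rel
```

Under this sketch, terms that co-occur in most of the sentences where either appears (such as the &quot;vote&quot;/&quot;confirm&quot; pair above) receive a high relevance, while terms that rarely share a sentence score near zero.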
In addition, it shows that the relevance we would have expected to derive from patterns and clustering can also be discovered through R(NE, NE). Next, we evaluate the integration of R(ET, NE), R(ET, ET) and R(NE, NE). As DUC 2001 provides four different summary sizes for evaluation, it allows us to test the sensitivity of the proposed event-based summarization techniques to summary length. While the results presented so far are evaluated on 200-word summaries, we now examine four sizes, i.e. 50, 100, 200 and 400 words. The experimental results show that the event-based approaches indeed favor longer summaries, which is consistent with what we hypothesized.</Paragraph> <Paragraph position="2"> For this set of experiments, we integrate the best method from each individual evaluation presented previously. Using the named-entity relevance derived from the event terms gives the best ROUGE scores for almost all summary sizes.</Paragraph> <Paragraph position="3"> Compared with the results reported in (Filatova and Hatzivassiloglou, 2004), whose average ROUGE-1 score is below 0.3 on the same data set, a significant improvement is revealed. Of course, we need to test on more data in the future. [Table caption fragment: &quot;... and with different summary lengths&quot;] As discussed in Section 3.2, named entities in the same cluster may often be relevant but are not always co-referent. In the final set of experiments, we evaluate two ways of using the clustering results. One is to consider named entities related if they are in the same cluster and to derive the NE-NE relevance with (E5). The other is to merge the entities in one cluster into one representative named entity and then use it in ET-NE with (E1).
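The two ways of using the clustering results can be sketched as follows (function names are illustrative and the actual (E5) and (E1) formulas are not reproduced here; this only shows the structural difference between the two strategies):

```python
def cluster_to_relevance(clusters):
    """Strategy 1: treat every pair of named entities in the same
    cluster as relevant (a binary stand-in for the NE-NE relevance)."""
    rel = {}
    for cluster in clusters:
        members = sorted(set(cluster))
        for i, a in enumerate(members):
            for b in members[i + 1:]:
                rel[(a, b)] = rel[(b, a)] = 1.0
    return rel

def merge_clusters(clusters):
    """Strategy 2: map every entity to one representative per cluster,
    so ET-NE relevance is computed over merged entities instead."""
    return {e: cluster[0] for cluster in clusters for e in cluster}
```

Strategy 1 only adds relevance links, whereas Strategy 2 collapses cluster members into a single node before any relevance is computed, which is a stronger assumption when cluster members are not truly co-referent.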
The validity of the former approach is confirmed.</Paragraph> <Paragraph position="4"> [Table row labels: &quot;Clustering is used to derive NE-NE&quot;; &quot;Clustering is used to merge entities and then to derive ET-NE&quot;; &quot;... use the clustering information&quot;.]</Paragraph> </Section> </Section> </Paper>