<?xml version="1.0" standalone="yes"?> <Paper uid="W00-1106"> <Title>Corpus-Based Learning of Compound Noun Indexing *</Title> <Section position="6" start_page="62" end_page="63" type="evalu"> <SectionTitle> 6 Experiment Results </SectionTitle> <Paragraph position="0"> To calculate the similarity between a document and a query, we use the p-norm retrieval model (Fox, 1983) with a p-value of 2.0.</Paragraph> <Paragraph position="1"> We also use the component nouns in a compound as indexing terms. We follow the standard TREC evaluation schemes (Salton and Buckley, 1991). For single index terms, we use the weighting method atn.ntc (Lee, 1995).</Paragraph> <Section position="1" start_page="62" end_page="63" type="sub_section"> <SectionTitle> 6.1 Compound Noun Indexing Experiments </SectionTitle> <Paragraph position="0"> This experiment shows that the proposed method can index more diverse types of compound nouns than the previous popular methods, which use human-generated compound noun indexing rules (Kim, 1994; Lee et al., 1997).</Paragraph> <Paragraph position="1"> For simplicity, we filtered the generated compound nouns using the mutual information of the compound noun elements with a threshold of zero (method A in Table 12).</Paragraph> <Paragraph position="2"> Table 13 shows that the terms indexed by the previous linguistic approach are a subset of those produced by our statistical approach.</Paragraph> <Paragraph position="3"> This means that the proposed method can cover more diverse compound nouns than the manual linguistic rule method.</Paragraph> <Paragraph position="5"> We perform a retrieval experiment to evaluate the automatically extracted rules. 
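As an illustration of the p-norm retrieval model mentioned above, the following sketch shows the standard p-norm OR and AND operators with p = 2.0 (the value used here). The function names and example weights are ours, not from the paper, and real p-norm scoring would also incorporate query-term weights.

```python
def pnorm_or(weights, p=2.0):
    # p-norm OR-node similarity: ((w1^p + ... + wn^p) / n)^(1/p)
    n = len(weights)
    return (sum(w ** p for w in weights) / n) ** (1.0 / p)

def pnorm_and(weights, p=2.0):
    # p-norm AND-node similarity: 1 - (((1-w1)^p + ... + (1-wn)^p) / n)^(1/p)
    n = len(weights)
    return 1.0 - (sum((1.0 - w) ** p for w in weights) / n) ** (1.0 / p)

# With p = 2.0 the operators soften strict Boolean matching: a document
# matching only one of two OR'ed terms still receives partial credit.
print(round(pnorm_or([1.0, 0.0]), 4))
print(round(pnorm_and([1.0, 0.0]), 4))
```

As p grows, both operators approach strict Boolean behaviour; at p = 1 they reduce to a simple average of the term weights.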
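The mutual-information filter with a zero threshold (method A) can be sketched as follows. The corpus probabilities and compound names below are hypothetical placeholders, and the helper names are ours; the point is only that a candidate compound survives when its components co-occur more often than chance.

```python
import math

def mutual_information(p_xy, p_x, p_y):
    # Pointwise mutual information of a component pair (x, y):
    # MI(x, y) = log( p(x, y) / (p(x) * p(y)) )
    return math.log(p_xy / (p_x * p_y))

def filter_compounds(candidates, threshold=0.0):
    # Keep only candidates whose component-pair MI exceeds the threshold.
    kept = []
    for compound, p_xy, p_x, p_y in candidates:
        if mutual_information(p_xy, p_x, p_y) > threshold:
            kept.append(compound)
    return kept

# Hypothetical corpus probabilities for two candidate compounds.
candidates = [
    ("information retrieval", 0.0010, 0.010, 0.020),  # co-occur far above chance
    ("system result",         0.0001, 0.020, 0.030),  # roughly independent pair
]
print(filter_compounds(candidates))
```

A zero threshold keeps exactly those pairs that co-occur more often than independence would predict, which matches the intuition that true compounds are non-accidental combinations.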
Table 14 and Table 15 show that our method achieves slightly better recall and 11-point average precision than the manual linguistic rule method.</Paragraph> </Section> <Section position="2" start_page="63" end_page="63" type="sub_section"> <SectionTitle> 6.2 Retrieval Experiments Using Various Filtering Methods </SectionTitle> <Paragraph position="0"> In this experiment, we compare the seven filtering methods to find out which one is best in terms of effectiveness and efficiency.</Paragraph> <Paragraph position="1"> For this experiment, we used our automatic rules for compound noun indexing and the test collection KTSET2.0. To measure effectiveness, we used recall and 11-point average precision; to measure efficiency, the number of index terms. Table 16 shows the results of the various filtering experiments.</Paragraph> <Paragraph position="2"> From Table 16, we see that the methods using mutual information reduce the number of index terms but have lower precision. The reason for this lower precision is that MI is biased toward rare terms over common terms, so MI seems to be sensitive to probability estimation error (Yang and Pedersen, 1997).</Paragraph> <Paragraph position="3"> In this experiment, we see that method B generates the smallest number of compound nouns (best efficiency), while our final proposed method H has the best recall and precision (effectiveness) with a reasonable number of compound nouns (efficiency). We conclude that filtering method H is the best when effectiveness and efficiency are considered together.</Paragraph> </Section> </Section> </Paper>