<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-1116">
<Title>Extraction of User Preferences from a Few Positive Documents</Title>
<Section position="5" start_page="222" end_page="222" type="evalu">
<SectionTitle> 4 Experiments </SectionTitle>
<Paragraph position="0"> We used the Reuters-21578 collection as the experimental document set. This collection has five different sets of content-related categories: EXCHANGES, ORGS, PEOPLE, PLACES and TOPICS. Some of the category sets have up to 265 categories, while others have as few as 39. We chose the TOPICS category set, which has 135 categories, and divided the documents according to the &quot;ModApte&quot; split, giving 9603 training documents and 3299 test documents.</Paragraph>
<Paragraph position="1"> Among the 135 categories, we first chose the 90 categories that have at least one training example and one test example. From these, we then selected the 21 categories that have between 10 and 30 training documents; the 3019 documents belonging to those categories are used as test documents. The document frequency information from the 7770 training documents in the 90 categories is used to calculate the IDF values of terms. We did not consider negative documents, under the assumption that only positive documents coincident with users' preferences are given, implicitly or explicitly.</Paragraph>
<Paragraph position="2"> Documents are ranked by cosine similarity, and performance is evaluated with the following F-measure (Baeza-Yates and Ribeiro-Neto, 1999), a weighted combination of recall and precision that is widely used for performance evaluation. Since the maximum value of F can be interpreted as the best possible compromise between recall and precision, we use this maximum value.</Paragraph>
<Paragraph position="4"> F = max_j F_j, where F_j = 2 r_j p_j / (r_j + p_j), r_j and p_j are the recall and precision at the j-th document in the ranking, and F_j is their harmonic mean.</Paragraph>
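As a minimal illustration of this evaluation protocol (not the authors' code; the dense NumPy vectors, function names, and variable names are assumptions made for the sketch), test documents can be ranked by cosine similarity to the extracted preference vector, and the ranking scored by the maximum F-value over all cut-off points:

    import numpy as np

    def cosine_rank(pref_vec, doc_vecs):
        # Rank documents by cosine similarity to the preference vector (descending).
        norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(pref_vec)
        sims = doc_vecs @ pref_vec / np.where(norms > 0, norms, 1e-12)
        return np.argsort(-sims)

    def max_f_measure(ranked_ids, relevant_ids):
        # Max over ranks j of F_j = 2 * r_j * p_j / (r_j + p_j).
        relevant = set(relevant_ids)
        hits, best_f = 0, 0.0
        for j, doc_id in enumerate(ranked_ids, start=1):
            if doc_id in relevant:
                hits += 1
            p_j = hits / j              # precision at rank j
            r_j = hits / len(relevant)  # recall at rank j
            if p_j + r_j > 0:
                best_f = max(best_f, 2 * p_j * r_j / (p_j + r_j))
        return best_f

Under these assumptions, max_f_measure would be applied to the ranking of the test documents for each category, with that category's positive documents as the relevant set.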
<Paragraph position="5"> First, our method was compared with the Rocchio and Widrow-Hoff algorithms. To see the effect of the number of FRKs, we ran experiments varying it from 5 to 30 in increments of 5, as well as for the case in which all terms are used. Table 2 summarizes the results of the proposed method compared with the two existing algorithms over the 21 categories. The results show that our method is better than the others in all cases, especially when 10 terms are used to represent user preferences. Table 3 shows the detailed results for that case, i.e., the F-values and the performance improvement ratios when 10 terms are used. The proposed method achieves an improvement of about 20% over the Rocchio algorithm and about 10% over the Widrow-Hoff algorithm on average. When 5 terms are used to represent user preferences, only 19 of the 21 categories are used, because the &quot;strategic-metal&quot; and &quot;pet-chem&quot; categories do not satisfy the constraint in Section 3.2, i.e., 5 terms are too few to cover all of their training documents. It is not clear which component of our method contributes most to this improvement, since our method consists of two main components: one for extracting IRKs, and the other for expanding and reweighting the IRKs. To analyze our method, we therefore constructed several variants of the proposed method and experimented with them. The variants are named by the sequence of the following symbols.</Paragraph>
<Paragraph position="6"> IF, IR, IW: IRKs are selected based on the weights obtained by the method in Section 3.1, the Rocchio algorithm, and the Widrow-Hoff algorithm, respectively.</Paragraph>
<Paragraph position="7"> RC, RR, RW: terms are reweighted by the method in Section 3.3, the Rocchio algorithm, and the Widrow-Hoff algorithm, respectively.</Paragraph>
<Paragraph position="8"> EC, EF, ER, EW: expanded terms are selected based on the weights obtained by the method in Section 3.3, the method in Section 3.1, the Rocchio algorithm, and the Widrow-Hoff algorithm, respectively.</Paragraph>
<Paragraph position="9"> For example, the proposed method in Section 3 is named IF_EF_RC, which means that the IRKs and the expanded terms are selected based on the weights calculated by the method in Section 3.1 and are then reweighted by the method in Section 3.3. As another example, the method called IF_RC_EC means that the IRKs are selected based on the weights obtained by the method in Section 3.1, and all terms are then reweighted by the method in Section 3.3 before the expanded terms are selected.</Paragraph>
<Paragraph position="10"> In the proposed method, a fuzzy inference technique is used to extract IRKs. We therefore tried two variants, IR_ER_RC and IW_EW_RC, in which the Rocchio and Widrow-Hoff algorithms, respectively, are used instead of the method in Section 3.1 to calculate the representativeness (weights) of terms, and the IRKs and expanded terms are then selected based on these weights; both variants use the reweighting scheme in Section 3.3. Table 4 shows that these keyword extraction algorithms offer no benefit over the fuzzy inference approach. We can also observe that when one of the existing algorithms is combined with the second component of our method, the performance improvement over using that algorithm alone is negligible.</Paragraph>
<Paragraph position="11"> The method used to extract IRKs that reflect the user's preference directly affects the result of the term reweighting process, because that process is based on term co-occurrence similarity with the IRKs. If terms that are far from the user's preference are extracted as IRKs, then terms that are actually improper for representing the user's information needs may be assigned high weights during reweighting, and the final vector generated from the results may fail to represent the user's preferences. The results in Table 4 therefore show that our fuzzy inference technique is effective for extracting IRKs. To demonstrate the usefulness of the second part of our method, i.e., the expansion and reweighting technique, we also tried five further variants (IF_RC_EC, IF_RR_ER, IF_RW_EW, IF_EF_RR, IF_EF_RW). Table 5 shows that none of these variants is better than the original method, although they all outperform the Rocchio and Widrow-Hoff algorithms.</Paragraph>
</Section>
</Paper>