<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-1113">
  <Title>Dynamic Programming Matching for Large Scale Information Retrieval</Title>
  <Section position="5" start_page="0" end_page="0" type="evalu">
    <SectionTitle>
4 Experiment
</SectionTitle>
    <Paragraph position="0"> In the experiment, we compared the proposed FDP method with SIM1, SIM2, and SIM3, which were described in Section 2. We measured three quantities: search effectiveness, memory usage, and execution time.</Paragraph>
    <Paragraph position="1"> We used the NTCIR1 collection (NTCIR Project, 1999). This collection consists of 83 retrieval topics and roughly 330,000 documents, all of which are Japanese technical abstracts. The 83 topics include 30 training topics (topic01-30); the rest are for testing (topic31-83). The testing topics were more difficult than the training topics. Each topic contains five parts: "TITLE", "DESCRIPTION", "NARRATIVE", "CONCEPT", and "FIELD". We used the "DESCRIPTION" part, which is a short sentence, as the retrieval query.</Paragraph>
    <Paragraph position="2"> All the experiments reported in this section were conducted using a dual AMD Athlon MP 1900+ with 3GB of physical memory, running TurboLinux 7.0.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.1 Search Effectiveness
</SectionTitle>
      <Paragraph position="0"> The proposed FDP method restricts the number of bigrams that can contribute to string matching. That is, only a small number of strings are considered. It was therefore unclear whether FDP remains as effective as SIM3. To check this, we compared the effectiveness of FDP with that of SIM1, SIM2, and SIM3.</Paragraph>
      <Paragraph position="1"> We also needed to know how effectiveness varies with the number of bigrams. We set the number of bigrams n to 5, 10, 15, 20, 30, 50, and 500, and named the resulting configurations FDP5, FDP10, FDP15, FDP20, FDP30, FDP50, and FDP500, respectively.</Paragraph>
      <Paragraph position="3"> The NTCIR1 collection also contains relevance judgments. We obtained the 11-point average precision and R-precision using the standard tool TRECEVAL. We also tested the statistical significance of differences in MAP (Mean Average Precision) (Kishida et al., 2002).</Paragraph>
      <Paragraph position="4"> Tables 2 and 3 show the search effectiveness for all methods. We found that FDP20 is the most effective. Table 1 shows the results of a one-sided t-test on the difference in MAP, $\bar{x}_i - \bar{y}_i$, where $\bar{x}_i$ is the MAP of the i-th method in the first row and $\bar{y}_i$ is the MAP of the i-th method in the first column. The level of significance $\alpha$ is 0.005 and the number of degrees of freedom is 83 - 1. The symbols &lt;&lt;, &lt;, and = represent "much less than $\alpha$", "less than $\alpha$", and "not less than $\alpha$", respectively. We found that, except for FDP5 and FDP10, the FDP variants are significantly more effective than SIM3 at a significance level of 0.005. In addition, the table shows that FDP30, FDP50, and FDP500 are not significantly more effective than FDP20. These results demonstrate that the proposed FDP method maintains its effectiveness, even though the strings that contribute to similarity are restricted to a small number of bigrams. It is also interesting that the FDP with 20 bigrams is significantly more effective than the one with many more bigrams.</Paragraph>
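      <Paragraph position="5"> The measures used in this subsection can be illustrated with a minimal Python sketch. It is not the TRECEVAL implementation and not the authors' code: the function names are hypothetical, the average precision shown is the non-interpolated per-topic value that is averaged over topics to obtain MAP (not the 11-point interpolated variant), and the t statistic is the usual paired form with n - 1 degrees of freedom, which we assume corresponds to the test reported in Table 1.

```python
from math import sqrt


def average_precision(ranked, relevant):
    """Non-interpolated average precision of one ranked list (hypothetical helper)."""
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            # Precision at the rank of each relevant document.
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def r_precision(ranked, relevant):
    """Precision at rank R, where R is the number of relevant documents."""
    r = len(relevant)
    if r == 0:
        return 0.0
    return sum(1 for doc in ranked[:r] if doc in relevant) / r


def paired_t(xs, ys):
    """t statistic of the per-topic differences xs[i] - ys[i]."""
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    mean = sum(diffs) / n
    # Unbiased sample variance of the differences.
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / sqrt(var / n)
```

Here xs and ys would hold the per-topic average precision of two methods over the same topics; the resulting statistic is then compared with the critical value of the t distribution at the chosen significance level.</Paragraph>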
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.2 Memory Usage
</SectionTitle>
      <Paragraph position="0"> The proposed method needs to record all the positions of the bigrams considered. A memory area is therefore required to hold position information; in the worst case, the memory size required is the product of the number of documents and the number of substrings in a query. This means the memory requirement could be very large. However, we found that the amount of memory FDP actually requires is reasonable.</Paragraph>
      <Paragraph position="1"> More precisely, the size of the memory area is the total sum of the collection frequencies of all strings that contribute to similarity. We therefore examined memory usage by comparing total sums of collection frequency.</Paragraph>
      <Paragraph position="2"> Figure 2 shows the collection frequency for three kinds of string sets. In the figure, AllNgram is for sets of all substrings considered by SIM3, AllBigram is for sets of all bigrams, and 20Bigram is for sets of 20 bigrams considered by FDP20. The area enclosed by the plot line and the horizontal axis represents the total sum of collection frequency. As the figure shows, AllBigram and 20Bigram occupy a much smaller area than AllNgram. This means the memory requirement of FDP is much smaller than that of SIM3. This result shows that FDP can efficiently perform large-scale information retrieval on a computer with a reasonable amount of memory.</Paragraph>
      <Paragraph position="3"> Figure 3 shows enlarged graphs of AllBigram and 20Bigram from Figure 2. The figure shows that 20Bigram equals AllBigram for most queries, but not always. However, as shown in Table 2 and Table 3, FDP20 actually has the highest precision among all FDP variants. This means that considering more bigrams is not necessarily an advantage. Presumably, choosing substrings with a high contribution removes noisy strings.</Paragraph>
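      <Paragraph position="4"> The comparison in this subsection can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, documents and queries are plain strings, collection frequency is counted as the number of (possibly overlapping) occurrences across the collection, and the rule FDP uses to pick its n bigrams is not reproduced, so the caller supplies the string set directly.

```python
def collection_frequency(docs, s):
    """Count occurrences of s across all documents, overlaps included."""
    total = 0
    for doc in docs:
        start = doc.find(s)
        while start != -1:
            total += 1
            start = doc.find(s, start + 1)
    return total


def substrings(query):
    """All distinct non-empty substrings of the query (the AllNgram case)."""
    n = len(query)
    return {query[i:j] for i in range(n) for j in range(i + 1, n + 1)}


def bigrams(query):
    """All distinct character bigrams of the query (the AllBigram case)."""
    return {query[i:i + 2] for i in range(len(query) - 1)}


def position_entries(docs, strings):
    """Total collection frequency, i.e. the number of positions to record."""
    return sum(collection_frequency(docs, s) for s in strings)
```

Calling position_entries with substrings(query) and then with bigrams(query) on the same collection gives the two totals whose gap the figure illustrates; restricting the set further to a chosen subset of bigrams can only keep the total the same or lower it.</Paragraph>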
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.3 Execution Time
</SectionTitle>
      <Paragraph position="0"> We measured execution time under the same conditions as described in Section 4.1. Note that SIM1, SIM2, and SIM3 were implemented in C.</Paragraph>
      <Paragraph position="1"> FDP, on the other hand, is implemented in Java (JDK1.3.1.04). When we measured the time required to build a suffix array, we found that FDP took 1.5 times as long as SIM (Figure 4). Thus, for the same algorithm, Java generally runs more slowly than C.</Paragraph>
      <Paragraph position="2"> Figures 5 and 6 show the retrieval time for each topic in topic01-30 and topic31-83, respectively. In the figures, the vertical axis is the number of documents, and the horizontal axis is the execution time. We found that all SIMs took much longer than the FDPs. This demonstrates that our algorithm in Section 3 sharply improves execution speed. Moreover, we found that execution time did not increase exponentially as the number of candidate documents for retrieval increased, that is, as the retrieval collection became larger and larger. This suggests that FDP is an effective DP technique for large-scale information retrieval.</Paragraph>
    </Section>
  </Section>
</Paper>