<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1002">
  <Title>Linear-Time Dependency Analysis for Japanese</Title>
  <Section position="10" start_page="4" end_page="4" type="evalu">
    <SectionTitle>
6 Experimental Results and Discussion
</SectionTitle>
    <Paragraph position="0"> We implemented a parser and SVM tools in C++ and used them for experiments.</Paragraph>
    <Section position="1" start_page="4" end_page="4" type="sub_section">
      <SectionTitle>
6.1 Corpus
</SectionTitle>
      <Paragraph position="0"> We used the Kyoto University Corpus Version 2 (Kurohashi and Nagao, 1998) to evaluate the proposed algorithm. Our parser was trained on the articles from January 1st through 8th (7,958 sentences) and tested on the articles from January 9th (1,246 sentences). The articles from January 10th were used for development. This split of the articles is the same as in Uchimoto et al. (1999), Sekine et al. (2000), and Kudo and Matsumoto (2002).</Paragraph>
    </Section>
    <Section position="2" start_page="4" end_page="4" type="sub_section">
      <SectionTitle>
6.2 SVM setting
</SectionTitle>
      <Paragraph position="0"> Polynomial kernels of degree 3 are used, and the misclassification cost is set to 1 unless stated otherwise.</Paragraph>
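      <Paragraph position="1"> As a rough illustration of this setting, the sketch below evaluates a degree-3 polynomial kernel and the resulting SVM decision value. It assumes the common kernel form (1 + x.y)^3 and sparse binary features stored as Python sets; neither detail is stated in this section. The names poly_kernel and decision_value are hypothetical and are not taken from the authors' C++ tools. The misclassification cost only affects training, so it does not appear here.

```python
def poly_kernel(x, y, degree=3):
    """Polynomial kernel (1 + x.y)^degree (assumed form, not confirmed by the paper)."""
    # x and y are sets of active binary feature ids,
    # so their dot product is the size of the overlap.
    return (1 + len(x.intersection(y))) ** degree


def decision_value(support_vectors, bias, x, degree=3):
    """Sum of weight * K(sv, x) over all support vectors, plus the bias."""
    # support_vectors is a list of (weight, feature_set) pairs,
    # where weight stands for alpha_i * y_i from training.
    return sum(w * poly_kernel(sv, x, degree) for w, sv in support_vectors) + bias
```

Every call to decision_value touches each support vector once, which is why the number of support vectors dominates the cost of a single dependency decision.</Paragraph>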
    </Section>
    <Section position="3" start_page="4" end_page="4" type="sub_section">
      <SectionTitle>
6.3 Results
</SectionTitle>
      <Paragraph position="0"> Accuracy. The performance of our parser on the test set is shown in Table 1. For comparison with previous work, we use the standard measures for the Kyoto University Corpus: dependency accuracy and sentence accuracy. The dependency accuracy is the percentage of correct dependencies, and the sentence accuracy is the percentage of sentences in which all dependencies are correctly analyzed.</Paragraph>
      <Paragraph position="1"> The accuracy with the standard feature set is relatively good. In fact, this accuracy is almost the same as that of the cascaded chunking model without dynamic features (Kudo and Matsumoto, 2002). Our parser with the full feature set yields an accuracy of 89.56%, which is the best among the previously published results.</Paragraph>
      <Paragraph position="2"> Asymptotic Time Complexity. Figure 4 shows the running time of our parser on the test set using a workstation (Ultra SPARC II 450 MHz with 1GB memory). It clearly shows that the running time is proportional to the sentence length, which is consistent with our theoretical analysis in Section 4.2.</Paragraph>
      <Paragraph position="3"> One might think that, although the upper bound on the time complexity is lower than those of previous work, the actual processing of our parser is not very fast. This slowness is mainly due to the large number of kernel evaluations in the SVMs. The SVM classifiers in our experiments have about forty thousand support vectors, so every dependency decision requires a large number of dot products. Fortunately, solutions to this problem have already been given by Kudo and Matsumoto (2003). They proposed methods to convert a polynomial kernel of higher degree into a simple linear kernel, and reported that a new classifier with the converted kernel was about 30 to 300 times faster than the original one while keeping the accuracy. By applying their methods to our parser, its processing time would become practical enough. [Figure caption fragment: "...nel. The misclassification cost is set to 0.0056."]</Paragraph>
      <Paragraph position="4"> To roughly estimate the improved speed of our parser, we built a parser with a linear kernel and ran it on the same test set. Figure 5 shows the observed running time of the parser with a linear kernel on the same machine. The parser runs fast enough: it can parse a very long sentence within 0.02 seconds. Furthermore, the accuracy as well as the speed of this parser was much better than we expected. It achieves a dependency accuracy of 87.36% and a sentence accuracy of 40.60%. These accuracies are slightly better than those in Uchimoto et al. (1999), where combinations of features are manually selected.</Paragraph>
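      <Paragraph position="5"> The two measures reported in this subsection can be computed along the following lines. This is a minimal sketch, not the authors' evaluation code: it assumes each sentence is given as a list of head indices, one per evaluated segment, and that the caller has already dropped any segment that is not scored (for example, a final segment with no head). The function name is hypothetical.

```python
def dependency_and_sentence_accuracy(gold, predicted):
    """Return (dependency accuracy, sentence accuracy) as percentages.

    gold and predicted are parallel lists of sentences; each sentence
    is a list of head indices (assumed input format).
    """
    correct_deps = 0
    total_deps = 0
    correct_sents = 0
    for gold_heads, pred_heads in zip(gold, predicted):
        matches = sum(1 for g, p in zip(gold_heads, pred_heads) if g == p)
        correct_deps += matches
        total_deps += len(gold_heads)
        # A sentence counts only if every one of its dependencies is correct.
        if matches == len(gold_heads):
            correct_sents += 1
    dep_acc = 100.0 * correct_deps / total_deps
    sent_acc = 100.0 * correct_sents / len(gold)
    return dep_acc, sent_acc
```

Because one wrong dependency makes the whole sentence count as incorrect, sentence accuracy is always at most the dependency accuracy, which matches the gap between the two figures reported above.</Paragraph>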
    </Section>
    <Section position="4" start_page="4" end_page="4" type="sub_section">
      <SectionTitle>
6.4 Comparison to Related Work
</SectionTitle>
      <Paragraph position="0"> We compare our parser to those in related work. A summary of the comparison is shown in Table 2.</Paragraph>
      <Paragraph position="1"> It clearly shows that our proposed algorithm with SVMs has a favorable time complexity and that our parser also achieves state-of-the-art accuracy.</Paragraph>
      <Paragraph position="2"> A theoretical comparison with Kudo and Matsumoto (2002) is given in Section 4.4. Uchimoto et al. (1999) used backward beam search with ME. According to Sekine et al. (2000), the analysis time followed a quadratic curve. In contrast, our parser analyzes a sentence in linear time while maintaining better accuracy. Sekine (2000) also proposed a very fast parser that runs in linear time; however, accuracy is greatly sacrificed. [Table legend fragment: "... 2000, USI99 = Uchimoto et al. 1999, and Seki00 = Sekine 2000."]</Paragraph>
    </Section>
  </Section>
</Paper>