<?xml version="1.0" standalone="yes"?>
<Paper uid="N01-1025">
  <Title>Chunking with Support Vector Machines</Title>
  <Section position="8" start_page="0" end_page="0" type="evalu">
    <SectionTitle>
4 Experiments
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.1 Experiment Setting
</SectionTitle>
      <Paragraph position="0"> We use the following three annotated corpora for our experiments.</Paragraph>
      <Paragraph position="1"> a176 Base NP standard data set (baseNP-S) This data set was first introduced by (Ramshaw and Marcus, 1995), and taken as the standard data set for baseNP identification task2. This data set consists of four sections (15-18) of the Wall Street Journal (WSJ) part of the Penn Treebank for the training data, and one section (20) for the test data. The data has part-of-speech (POS) tags annotated by the Brill tagger(Brill, 1995).</Paragraph>
      <Paragraph position="2"> a176 Base NP large data set (baseNP-L) This data set consists of 20 sections (02-21) of the WSJ part of the Penn Treebank for the training data, and one section (00) for the test data. POS tags in this data sets are also annotated by the Brill tagger. We omit the experiments IOB1 and IOE1 representations for this training data since the data size is too large for our current SVMs learning program. In case of IOB1 and IOE1, the size of training data for one classifier which estimates the class I and O becomes much larger compared with IOB2 and IOE2 models. In addition, we also omit to estimate the voting weights using cross validation method due to a large amount of training cost.</Paragraph>
      <Paragraph position="3"> a176 Chunking data set (chunking) This data set was used for CoNLL-2000 shared task(Tjong Kim Sang and Buchholz, 2000). In this data set, the total of 10 base phrase classes (NP,VP,PP,ADJP,ADVP,CONJP, 2ftp://ftp.cis.upenn.edu/pub/chunker/ INITJ,LST,PTR,SBAR) are annotated. This data set consists of 4 sections (15-18) of the WSJ part of the Penn Treebank for the training data, and one section (20) for the test data 3.</Paragraph>
      <Paragraph position="4"> All the experiments are carried out with our software package TinySVM4, which is designed and optimized to handle large sparse feature vectors and large number of training samples. This package can estimate the VC bound and Leave-One-Out bound automatically. For the kernel function, we use the 2-nd polynomial function and set the soft margin parameter a111 to be 1.</Paragraph>
      <Paragraph position="5"> In the baseNP identification task, the performance of the systems is usually measured with three rates: precision, recall and a177a23a178a68a179 a5 a2a106a51a47a69a66a46</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.2 Results of Experiments
</SectionTitle>
      <Paragraph position="0"> Table 2 shows results of our SVMs based chunking with individual chunk representations. This table also lists the voting weights estimated by different approaches (B:Cross Validation, C:VC-bound, D:Leave-one-out). We also show the results of Start/End representation in Table 2.</Paragraph>
      <Paragraph position="1"> Table 3 shows the results of the weighted voting of four different voting methods: A: Uniform,</Paragraph>
      <Paragraph position="3"> Leave-One-Out Bound.</Paragraph>
      <Paragraph position="4"> Table 4 shows the precision, recall and a177 a178a68a179 a5 of the best result for each data set.</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.3 Accuracy vs Chunk Representation
</SectionTitle>
      <Paragraph position="0"> We obtain the best accuracy when we apply IOE2-B representation for baseNP-S and chunking data set. In fact, we cannot find a significant difference in the performance between Inside/Outside(IOB1/IOB2/IOE1/IOE2) and Start/End(IOBES) representations.</Paragraph>
      <Paragraph position="1"> Sassano and Utsuro evaluate how the difference of the chunk representation would affect the performance of the systems based on different machine learning algorithms(Sassano and Utsuro, 2000).</Paragraph>
      <Paragraph position="2"> They report that Decision List system performs better with Start/End representation than with Inside/Outside, since Decision List considers the specific combination of features. As for Maximum Entropy, they report that it performs better with Inside/Outside representation than with Start/End,  since Maximum Entropy model regards all features as independent and tries to catch the more general feature sets.</Paragraph>
      <Paragraph position="3"> We believe that SVMs perform well regardless of the chunk representation, since SVMs have a high generalization performance and a potential to select the optimal features for the given task.</Paragraph>
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.4 Effects of Weighted Voting
</SectionTitle>
      <Paragraph position="0"> By applying weighted voting, we achieve higher accuracy than any of single representation system regardless of the voting weights. Furthermore, we achieve higher accuracy by applying Cross validation and VC-bound and Leave-One-Out methods than the baseline method.</Paragraph>
      <Paragraph position="1"> By using VC bound for each weight, we achieve nearly the same accuracy as that of Cross validation. This result suggests that the VC bound has a potential to predict the error rate for the &amp;quot;true&amp;quot; test data accurately. Focusing on the relationship between the accuracy of the test data and the estimated weights, we find that VC bound can predict the accuracy for the test data precisely. Even if we have no room for applying the voting schemes because of some real-world constraints (limited computation and memory capacity), the use of VC bound may allow to obtain the best accuracy. On the other hand, we find that the prediction ability of Leave-One-Out is worse than that of VC bound.</Paragraph>
      <Paragraph position="2"> Cross validation is the standard method to estimate the voting weights for different systems. However, Cross validation requires a larger amount of computational overhead as the training data is divided and is repeatedly used to obtain the voting weights. We believe that VC bound is more effective than Cross validation, since it can obtain the comparable results to Cross validation without increasing computational overhead.</Paragraph>
    </Section>
    <Section position="5" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.5 Comparison with Related Works
</SectionTitle>
      <Paragraph position="0"> Tjong Kim Sang et al. report that they achieve accuracy of 93.86 for baseNP-S data set, and 94.90 for baseNP-L data set. They apply weighted voting of the systems which are trained using distinct chunk representations and different machine learning algorithms such as MBL, ME and IGTree(Tjong Kim Sang, 2000a; Tjong Kim Sang et al., 2000).</Paragraph>
      <Paragraph position="1"> Our experiments achieve the accuracy of 93.76 94.11 for baseNP-S, and 95.29 - 95.34 for baseNP-L even with a single chunk representation. In addition, by applying the weighted voting framework, we achieve accuracy of 94.22 for baseNP-S, and 95.77 for baseNP-L data set. As far as accuracies are concerned, our model outperforms Tjong Kim Sang's model.</Paragraph>
      <Paragraph position="2"> In the CoNLL-2000 shared task, we achieved the accuracy of 93.48 using IOB2-F representation (Kudo and Matsumoto, 2000b) 5. By combining weighted voting schemes, we achieve accuracy of 93.91. In addition, our method also outperforms other methods based on the weighted voting(van Halteren, 2000; Tjong Kim Sang, 2000b).</Paragraph>
    </Section>
    <Section position="6" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.6 Future Work
</SectionTitle>
      <Paragraph position="0"> a176 Applying to other chunking tasks Our chunking method can be equally applicable to other chunking task, such as English POS tagging, Japanese chunk(bunsetsu) identification and named entity extraction. For future, we will apply our method to those chunking tasks and examine the performance of the method.</Paragraph>
      <Paragraph position="1"> a176 Incorporating variable context length model In our experiments, we simply use the so-called fixed context length model. We believe that we can achieve higher accuracy by selecting appropriate context length which is actually needed for identifying individual chunk tags. Sassano and Utsuro(Sassano and Utsuro, 2000) introduce a variable context length model for Japanese named entity identification task and perform better results. We will incorporate the variable context length model into our system.</Paragraph>
      <Paragraph position="2"> a176 Considering more predictable bound In our experiments, we introduce new types of voting methods which stem from the theorems of SVMs -- VC bound and Leave-One-Out bound. On the other hand, Chapelle and Vapnik introduce an alternative and more predictable bound for the risk and report their proposed bound is quite useful for selecting the kernel function and soft margin parameter(Chapelle and Vapnik, 2000). We believe that we can obtain higher accuracy using this more predictable bound for the voting weights in our experiments.</Paragraph>
      <Paragraph position="3"> 5In our experiments, the accuracy of 93.46 is obtained with IOB2-F representation, which was the exactly the same representation we applied for CoNLL 2000 shared task. This slight difference of accuracy arises from the following two reason :  (1) The difference of beam width for parsing (N=1 vs. N=5), (2) The difference of applied SVMs package (TinySVM vs. a186a96a187a189a188a56a190a192a191a194a193a120a195a102a196 .</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>