<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1109">
  <Title>Discriminative Slot Detection Using Kernel Methods</Title>
  <Section position="8" start_page="21" end_page="21" type="evalu">
    <SectionTitle>
6 Experiments
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="21" end_page="21" type="sub_section">
      <SectionTitle>
6.1 Corpus
</SectionTitle>
      <Paragraph position="0"> The ARES experiments were conducted on the MUC-6 corporate management succession domain, using the official training data and, for the final experiment, the official test data as well. The training data was split into a training set (80%) and a validation set (20%). The text was preprocessed by the Proteus NE tagger and the Charniak parser; the GLARF processor then produced dependency graphs from the parse trees and NE results. All names were replaced by symbols representing their types, such as #PERSON# for person names, because we believe the name itself does not provide a significant clue; what matters is only the type of name that occurs at a given position.</Paragraph>
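The name-masking step described above can be pictured as follows (a minimal sketch; the function name and span format are illustrative, not from the paper):

```python
def mask_names(tokens, ne_spans):
    """Replace each recognized name with a symbol for its NE type,
    e.g. #PERSON#, since the type rather than the name is the clue.

    tokens: list of words; ne_spans: list of (start, end, type) spans
    as an NE tagger might produce them."""
    out = list(tokens)
    # Process spans right-to-left so earlier indices stay valid
    # after a multi-token name collapses into a single symbol.
    for start, end, ne_type in sorted(ne_spans, reverse=True):
        out[start:end] = ["#%s#" % ne_type.upper()]
    return out
```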
      <Paragraph position="1"> Two tasks were tried: EOD (event occurrence detection) on sentences, and SFD (slot filler detection) on named entities, including person names and job titles. EOD determines whether a sentence contains an event, giving general information about sentence-level event occurrences. SFD finds name fillers for event slots; the slots we experimented with were the person name and job title slots in MUC-6. We used the SVM package SVMlight in our experiments, embedding our own kernels as custom kernels.</Paragraph>
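Embedding a custom kernel in an SVM package amounts to letting the learner evaluate K(x_i, x_j) over pairs of training instances. A generic sketch of the kernel (Gram) matrix such a package computes (the function is ours, not SVMlight's API):

```python
def gram_matrix(xs, kernel):
    """Kernel matrix K[i][j] = kernel(xs[i], xs[j]), exploiting
    symmetry so each pair is evaluated once. This is the quantity
    an SVM with a custom kernel works with during training."""
    n = len(xs)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            K[i][j] = K[j][i] = kernel(xs[i], xs[j])
    return K
```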
    </Section>
    <Section position="2" start_page="21" end_page="21" type="sub_section">
      <SectionTitle>
6.2 EOD Experiments
</SectionTitle>
      <Paragraph position="0"> In this experiment, ARES was trained on the official MUC-6 training data to perform event occurrence detection. The data contains 1940 sentences, of which 158 are labeled as positive instances (containing an event). Five-fold cross-validation was used, so the training and test sets contained 80% and 20% of the data, respectively.</Paragraph>
      <Paragraph position="1"> Three kernels defined in the previous section were tried. Table 1 shows the performance of each kernel. Three n-gram kernels were tested: unigram, bigram and trigram. Subsequences longer than trigrams were also tried, but did not yield better results.</Paragraph>
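An n-gram kernel of the kind tested here can be taken as the inner product of contiguous n-gram count vectors (a sketch under that assumption; the paper does not give its exact normalization):

```python
from collections import Counter

def ngram_kernel(s, t, n):
    """K(s, t) = sum over n-grams g of count_s(g) * count_t(g):
    the inner product of contiguous n-gram count vectors.
    n = 1, 2, 3 gives the unigram, bigram and trigram kernels."""
    def counts(seq):
        return Counter(tuple(seq[i:i + n])
                       for i in range(len(seq) - n + 1))
    cs, ct = counts(s), counts(t)
    return sum(c * ct[g] for g, c in cs.items())
```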
      <Paragraph position="2"> The results show that the trigram kernel performed best among the n-gram kernels. The GLARF kernel did better than the n-gram kernels, which is reasonable because it incorporates the detailed syntax of a sentence. Generally speaking, though, the n-gram kernels alone performed fairly well on this task, indicating that low-level text processing can also provide useful information. The mix kernel, which combines the trigram kernel with the GLARF kernel, gave the best performance, which might indicate that the low-level information provides additional clues or helps to overcome errors in deep processing.</Paragraph>
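The mix kernel above is a linear combination of two kernels; since nonnegative combinations of kernels are again valid kernels, it can be sketched as follows (the weight `lam` is illustrative, not a value reported in the paper):

```python
def mix_kernel(k1, k2, lam=0.5):
    """Return the kernel K(x, y) = lam * k1(x, y) + (1 - lam) * k2(x, y).
    A nonnegative linear combination of valid kernels is itself a
    valid kernel, so the mix can be handed directly to the SVM."""
    return lambda x, y: lam * k1(x, y) + (1 - lam) * k2(x, y)
```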
      <Paragraph position="3"> Table 1: Performance of different kernels. The Mix kernel is a linear combination of the trigram kernel and the GLARF kernel.</Paragraph>
    </Section>
    <Section position="3" start_page="21" end_page="21" type="sub_section">
      <SectionTitle>
6.3 SFD Experiments
</SectionTitle>
      <Paragraph position="0"> The slot filler detection (SFD) task is to find the named entities in text that can fill the corresponding slots of an event.2 We treat job title as a named entity throughout this paper, although it is not included in the traditional MUC named entity set. The slots we used for evaluation were PERSON_IN (the person who took a position), PERSON_OUT (the person who left a position) and POST (the position involved). We generated the two person slots from the official MUC-6 templates, and the corresponding filler strings in the text were labeled. Three SVM predictors were trained, one to find the name fillers of each slot. Two experiments were run on the MUC-6 training data using five-fold cross-validation.</Paragraph>
      <Paragraph position="1"> The first experiment of ARES used the slot kernel ph_SFD1(G_i, G_j) alone, relying solely on local context around a NE. (Footnote 2: We used this task for evaluation, rather than the official MUC template-filling task, in order to assess the system's ability to identify slot fillers separately from its ability to combine them into templates.)</Paragraph>
      <Paragraph position="2"> From the performance table (Table 2), we can see that local context gives a fairly good clue for finding PERSON_IN and POST, but not for PERSON_OUT. The main reason is that local context may not be enough to determine a PERSON_OUT filler; it often requires inference or other semantic information. For example, the sentence "Aaron Spelling, the company's vice president, was named president." indicates that "Aaron Spelling" left the position of vice president and should therefore be a PERSON_OUT. But the sentence "Aaron Spelling, the company's vice president, said ...", which is syntactically very similar to the first one, carries no such indication at all. In complicated cases, a person can even hold two positions at the same time.</Paragraph>
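The "local context around a NE" that the slot kernel compares can be pictured as a token window on each side of the entity span (the window width and function names are our assumptions, not the paper's):

```python
def context_window(tokens, start, end, width=3):
    """Return up to `width` tokens on each side of the entity span
    tokens[start:end] -- the local context an SFD kernel would compare
    across candidate named entities."""
    left = tokens[max(0, start - width):start]
    right = tokens[end:end + width]
    return left, right
```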
      <Paragraph position="3"> In this experiment, the SVM predictor considered all the names identified by the NE tagger; however, most sentences do not contain an event occurrence at all, so NEs in those sentences should be ignored no matter what their local context is. Achieving this requires general information about event occurrence, which is just what the EOD kernel can provide. In our second experiment, we tested the kernel ph_SFD2(S_i, S_j), a linear combination of the trigram EOD kernel and the SFD kernel ph_SFD1(G_i, G_j). Table 3 shows the performance of the combination kernel, from which we can see a clear performance improvement for all three slots. We also tried the mix kernel, which gave us the best EOD performance, but it did not yield a better result here. We believe the reason is that the GLARF EOD kernel and the SFD kernel draw on the same syntactic source, so the information was duplicated. Table 3: Performance of the SFD kernel combined with the trigram EOD kernel. For PER_OUT, the unigram EOD kernel was used.</Paragraph>
      <Paragraph position="4"> Since five-fold cross validation was used, ARES was trained on 80% of the MUC-6 training data in these two experiments.</Paragraph>
    </Section>
    <Section position="4" start_page="21" end_page="21" type="sub_section">
      <SectionTitle>
6.4 Comparison with MUC-6 System
</SectionTitle>
      <Paragraph position="0"> This experiment was done on the official MUC-6 training and test data, which contain 50K words and 40K words respectively. ARES used the official corpora as training and test sets, except that in the training data, all the slot fillers were manually labeled. We compared the performance of ARES with the NYU Proteus system, a rule-based system that performed well for MUC-6. To score the performance for these three slots, we generated the slot-filler pairs as keys for a document from the official MUC-6 templates and removed duplicate pairs. The scorer matches the filler string in the response file of ARES to the keys. The response result for Proteus was extracted in the same way from its template output.</Paragraph>
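The scoring procedure described above (deduplicated slot-filler key pairs, matched against the fillers in the response file) can be sketched as follows (a simplified reconstruction, not the official MUC scorer):

```python
def score_slot(keys, responses):
    """keys, responses: iterables of (slot, filler_string) pairs for a
    document, with duplicates removed as in the text. A response pair
    is correct when it string-matches a key pair. Returns
    (recall, precision, F-measure)."""
    keys, responses = set(keys), set(responses)
    correct = len(keys & responses)
    recall = correct / len(keys) if keys else 0.0
    precision = correct / len(responses) if responses else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return recall, precision, f
```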
      <Paragraph position="1"> Table 4 shows the result of ARES using the combination kernel from the previous experiment.</Paragraph>
      <Paragraph position="2"> Table 4: Result of ARES using the combination kernel ph_SFD2(S_i, S_j) on the MUC-6 test data.</Paragraph>
      <Paragraph position="3"> Table 5 shows the test result of the Proteus system. Comparing the numbers, we can see that for the PERSON_IN and POST slots, ARES outperformed the Proteus system by a few points. The result is promising considering that this model is fully automatic and does not involve any postprocessing. For the PERSON_OUT slot, the performance of ARES was not as good. As discussed above, relying purely on syntax may not help much here; we may need an inference model to resolve this problem.</Paragraph>
    </Section>
  </Section>
</Paper>