<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-0833">
  <Title>Simple Features for Statistical Word Sense Disambiguation</Title>
  <Section position="6" start_page="0" end_page="0" type="evalu">
    <SectionTitle>
5 Results and Discussion
</SectionTitle>
    <Paragraph position="0"> The Word Sense Disambiguator program has been written as a Processing Resource in the Gate Architecture5. It uses the ANNIE Tokenizer and POS Tagger which are provided as components of Gate.</Paragraph>
    <Paragraph position="1"> Table 2 shows the results of both systems for each category of words. It can be seen that approximate syntactic information has performed relatively better with adjectives which are generally harder to disambiguate.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.1 Window size and the commonest effect
</SectionTitle>
      <Paragraph position="0"> The optimal window size seems to be related to the distribution of the senses in the training samples and to the number of training samples available for a word. Indeed, a large window size is selected when the number of samples is large and the samples are not evenly distributed among senses. Because most of the words in Senseval are not topical words, Naïve Bayes relies strongly on the commonest-sense effect. On the other hand, when a small window size is selected, the commonest-sense effect mostly vanishes and collocations are relied upon instead.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.2 Distribution of samples
</SectionTitle>
      <Paragraph position="0"> A Na ve Bayes method is quite sensitive to the proportion of training and test samples: if the commonest class presented as test is di erent from the commonest class in training for example, this method performs poorly. This is a serious problem of Na ve Bayes towards real world WSD. For testing this claim, we made the following hypothesis: When the mean of absolute di erence of the test samples and training samples among classes of senses is more than 4%, Na ve Bayes method performs at most 20% above baseline6. Table 3 shows that this hypoth- null tribution of samples (Acc=Accuracy amount higher than baseline; Dist=Mean of distribution change.) out of 50 ambiguous words that satisfy the conditions). Furthermore, such words are not necessarily di cult words. Our Maximum Entropy method performed on average 25% above the baseline on 7 of them (ask.v, decide.v, di erent.a, di culty.n, sort.n, use.v, wash.v some of which are shown in Table 1).</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.3 Rare samples
</SectionTitle>
      <Paragraph position="0"> Na ve Bayes mostly ignores the senses with a few samples in the training and gets its score on the senses with large number of training instances, while Maximum Entropy exploits features from senses which have had a few training samples.</Paragraph>
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.4 Using lemmas and synsets
</SectionTitle>
      <Paragraph position="0"> We tried working with word lemmas instead of derivated forms; however, for some words it causes loss in accuracy. For example, for the adjective di erent.a, with window and sub-window size of 175 and 0, it reduces the accuracy from 60% to 46% with the validation set.</Paragraph>
      <Paragraph position="1"> However, for the noun sort.n, precision increases from 62% to 72% with a window size of 650 and sub-window size of 50. We believe that some senses come with a speci c form of their neighboring tokens and lemmatization removes this distinguishing feature.</Paragraph>
      <Paragraph position="2"> We also tried storing synsets of words as features for the Na ve Bayes learner, but obtained no signi cant change in the results.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>
Download Original XML