<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-0813">
  <Title>Combining Contextual Features for Word Sense Disambiguation</Title>
  <Section position="3" start_page="0" end_page="0" type="relat">
    <SectionTitle>
2 Related Work
</SectionTitle>
    <!-- Proceedings of the SIGLEX/SENSEVAL Workshop on Word Sense Disambiguation: Recent Successes and Future Directions, Philadelphia, July 2002, pp. 88-94. Association for Computational Linguistics. -->
    <Paragraph position="0"> While it is possible to build an automatic sense tagger using only dictionary definitions, the most accurate systems tend to take advantage of supervised learning. The system with the highest overall performance in SENSEVAL-1 used Yarowsky's hierarchical decision lists (Yarowsky, 2000); although the set of potential features is large, only a small number are actually used to determine the sense of any given instance of a word. Chodorow, Leacock and Miller (Chodorow et al., 2000) also achieved high accuracy using naive Bayesian models for WSD, combining sets of linguistically impoverished features that were classified as either topical or local. Topical features consisted of a bag of open-class words in a wide window covering the entire context provided; local features were words and parts of speech within a small window or at particular offsets from the target word. The system was configured to use only local, only topical, or both local and topical features for each word, depending on which configuration produced the best result on a held-out portion of the training data.</Paragraph>
    <Paragraph position="1"> Previous experiments (Ng and Lee, 1996) have explored the relative contribution of different knowledge sources to WSD and have concluded that collocational information is more important than syntactic information. Additionally, Pedersen (Pedersen, 2001; Pedersen, 2000) has pursued the approach of using simple word bigrams and other linguistically impoverished feature sets for sense tagging, to establish upper bounds on the accuracy of feature sets that do not impose substantial pre-processing requirements. In contrast, we wish to demonstrate that such pre-processing significantly improves accuracy for sense-tagging English verbs, because we believe that it allows us to extract a set of features that more closely parallels the information humans use for sense disambiguation.</Paragraph>
  </Section>
</Paper>