<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-0813">
  <Title>Combining Contextual Features for Word Sense Disambiguation</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Highly ambiguous words pose continuing problems for Natural Language Processing (NLP) applications. They can lead to irrelevant document retrieval in IR systems, and inaccurate translations in Machine Translation systems (Palmer et al., 2000).</Paragraph>
    <Paragraph position="1"> While homonyms like bank are fairly tractable, polysemous words like run, with related but subtly distinct meanings, present the greatest hurdle for Word Sense Disambiguation (WSD). SENSEVAL-1 and SENSEVAL-2 have attempted to provide a framework for evaluating automatic systems by creating corpora tagged with fixed sense inventories, which also enables the training of supervised WSD systems. In this paper we describe a maximum entropy WSD system that combines information from many different sources, using as much linguistic knowledge as can be gathered automatically by current NLP tools. Maximum entropy models have been applied to a wide range of classification tasks in NLP (Ratnaparkhi, 1998). Our maximum entropy system performed competitively with the best performing systems on the English verb lexical sample task in SENSEVAL-1 and SENSEVAL-2. We compared the system performance with human annotator performance in light of both fine-grained and coarse-grained sense distinctions made by WordNet in SENSEVAL-2, and found that many of the system's errors on fine-grained senses stemmed from the same sources that caused disagreements between human annotators. These differences were partially resolved by backing off to more coarse-grained sense-groups, which are sometimes necessary when even human annotators cannot make the fine-grained sense distinctions specified in the dictionary.</Paragraph>
  </Section>
</Paper>