<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-1013">
  <Title>Ensemble Methods for Unsupervised WSD</Title>
  <Section position="3" start_page="0" end_page="97" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Word sense disambiguation (WSD), the task of identifying the intended meanings (senses) of words in context, holds promise for many NLP applications requiring broad-coverage language understanding. Examples include summarization, question answering, and text simplification. Recent studies have also shown that WSD can benefit machine translation (Vickrey et al., 2005) and information retrieval (Stokoe, 2005).</Paragraph>
    <Paragraph position="1"> Given the potential of WSD for many NLP tasks, much work has focused on the computational treatment of sense ambiguity, primarily using data-driven methods. The most accurate WSD systems to date are supervised and rely on the availability of training data, i.e., corpus occurrences of ambiguous words marked up with labels indicating the appropriate sense given the context (see Mihalcea and Edmonds 2004 and the references therein). A classifier automatically learns disambiguation cues from these hand-labeled examples. Although supervised methods typically achieve better performance than unsupervised alternatives, their applicability is limited to those words for which sense-labeled data exists, and their accuracy is strongly correlated with the amount of labeled data available (Yarowsky and Florian, 2002).</Paragraph>
    <Paragraph position="2"> Furthermore, obtaining manually labeled corpora with word senses is costly and the task must be repeated for new domains, languages, or sense inventories. Ng (1997) estimates that a high-accuracy, domain-independent system for WSD would probably need a corpus of about 3.2 million sense-tagged words. At a throughput of one word per minute (Edmonds, 2000), this would require about 27 person-years of human annotation effort.</Paragraph>
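Ng's effort estimate can be sanity-checked with a quick back-of-the-envelope calculation. The figure of roughly 2,000 working hours per person-year below is an assumption introduced for illustration, not a number from the text:

```python
# Back-of-the-envelope check of the annotation-effort estimate.
corpus_size = 3_200_000        # sense-tagged words needed (Ng, 1997)
words_per_hour = 60            # one word per minute (Edmonds, 2000)
hours_per_person_year = 2_000  # assumed full-time working hours per year

total_hours = corpus_size / words_per_hour           # ~53,333 hours
person_years = total_hours / hours_per_person_year   # ~26.7
print(round(person_years))     # roughly 27, matching the quoted estimate
```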
    <Paragraph position="3"> This paper focuses on unsupervised methods which we argue are useful for broad-coverage sense disambiguation. Unsupervised WSD algorithms fall into two general classes: those that perform token-based WSD by exploiting the similarity or relatedness between an ambiguous word and its context (e.g., Lesk 1986); and those that perform type-based WSD, simply by assigning all instances of an ambiguous word its most frequent (i.e., predominant) sense (e.g., McCarthy et al. 2004; Galley and McKeown 2003). The predominant senses are automatically acquired from raw text without recourse to manually annotated data. The motivation for assigning all instances of a word to its most prevalent sense stems from the observation that current supervised approaches rarely outperform the simple heuristic of choosing the most common sense in the training data, despite taking local context into account (Hoste et al., 2002). Furthermore, the approach allows sense inventories to be tailored to specific domains. The work presented here evaluates and compares the performance of well-established unsupervised WSD algorithms. We show that these algorithms yield sufficiently diverse outputs, thus motivating the use of combination methods for improving WSD performance. While combination approaches have been studied previously for supervised WSD (Florian et al., 2002), their use in an unsupervised setting is, to our knowledge, novel. We examine several existing and novel combination methods and demonstrate that our combined systems consistently outperform the state of the art (e.g., McCarthy et al. 2004). Importantly, our WSD algorithms and combination methods do not make use of training material in any way, nor do they use the first-sense information available in WordNet.</Paragraph>
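To illustrate the general idea of combining the outputs of several unsupervised WSD systems, a minimal sketch of a majority-vote combiner is shown below. This is not the paper's actual combination method; the system outputs and sense labels are hypothetical:

```python
from collections import Counter

def majority_vote(sense_votes):
    """Return the sense proposed by the most component systems.

    sense_votes: list of sense labels, one per WSD system.
    Ties are broken arbitrarily (first-encountered label wins).
    """
    return Counter(sense_votes).most_common(1)[0][0]

# Hypothetical outputs of three unsupervised systems for one
# instance of the ambiguous word "bank".
votes = ["bank%river", "bank%finance", "bank%river"]
print(majority_vote(votes))  # bank%river
```

The key property such a combiner relies on is exactly the one argued for above: the component systems must make sufficiently different errors for the vote to recover from any single system's mistake.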
    <Paragraph position="4"> In the following section, we briefly describe the unsupervised WSD algorithms considered in this paper. Then, we present a detailed comparison of their performance on SemCor (Miller et al., 1993).</Paragraph>
    <Paragraph position="5"> Next, we introduce our system combination methods and report on our evaluation experiments. We conclude the paper by discussing our results.</Paragraph>
  </Section>
</Paper>