<?xml version="1.0" standalone="yes"?> <Paper uid="P04-1081"> <Title>A Kernel PCA Method for Superior Word Sense Disambiguation Dekai WU1 Weifeng SU Marine CARPUAT dekai@cs.ust.hk weifeng@cs.ust.hk marine@cs.ust.hk</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Achieving higher precision in supervised word sense disambiguation (WSD) tasks without resorting to ad hoc voting or similar ensemble techniques has become somewhat daunting in recent years, given the challenging benchmarks set by naïve Bayes models (e.g., Mooney (1996), Chodorow et al. (1999), Pedersen (2001), Yarowsky and Florian (2002)) as well as maximum entropy models (e.g., Dang and Palmer (2002), Klein and Manning (2002)). A good foundation for comparative studies has been established by the Senseval data and evaluations; of particular relevance here are the lexical sample tasks from Senseval-1 (Kilgarriff and Rosenzweig, 1999) and Senseval-2 (Kilgarriff, 2001).</Paragraph> <Paragraph position="1"> We therefore chose this problem to introduce an efficient and accurate new word sense disambiguation approach that exploits a nonlinear Kernel PCA technique to make predictions implicitly based on generalizations over feature combinations. The pected to be highly attractive from the standpoint of natural language processing in general.</Paragraph> <Paragraph position="2"> In the following sections, we first analyze the potential of nonlinear principal components with respect to the task of disambiguating word senses. Based on this, we describe a full model for WSD built on KPCA. We then discuss experimental results confirming that this model outperforms state-of-the-art published models for Senseval-related lexical sample tasks as represented by (1) naïve Bayes models, as well as (2) maximum entropy models. 
We then consider whether other kernel methods, in particular the popular SVM model, are equally competitive, and discover experimentally that KPCA achieves higher accuracy than the SVM model.</Paragraph> </Section> </Paper>