<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-0856">
  <Title>Pattern Abstraction and Term Similarity for Word Sense Disambiguation: IRST at Senseval-3</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> The starting point for our research in the Word Sense Disambiguation (WSD) area was to explore the use of semantic domains to resolve lexical ambiguity. At the Senseval-2 competition we proposed a new approach to WSD, namely Domain Driven Disambiguation (DDD). This approach compares the estimated domain of the context of the word to be disambiguated with the domains of its senses, exploiting the fact that domains are features of both texts and words. The domains of the word senses can be either inferred from the learning data or derived from the information in WORDNET DOMAINS.</Paragraph>
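The core DDD comparison can be sketched as follows. This is an illustrative toy, not IRST's implementation: the three-domain inventory, the scores and the sense labels are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length domain-score vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def ddd_disambiguate(context_domains, sense_domains):
    """Pick the sense whose domain vector best matches the estimated
    domain vector of the context (DDD's basic comparison step)."""
    return max(sense_domains,
               key=lambda s: cosine(context_domains, sense_domains[s]))

# Hypothetical 3-domain inventory: (GEOGRAPHY, FINANCE, MEDICINE).
context = [0.1, 0.8, 0.1]  # estimated domains of a financial context of "bank"
senses = {
    "bank%river":   [0.9, 0.0, 0.0],
    "bank%finance": [0.0, 0.9, 0.1],
}
best = ddd_disambiguate(context, senses)  # -> "bank%finance"
```

In the real system the sense vectors come from learning data or from WORDNET DOMAINS, and the context vector is estimated from the text.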
    <Paragraph position="1"> For Senseval-3, we refined the DDD methodology with a fully unsupervised technique - Domain Relevance Estimation (DRE) - for domain detection in texts. DRE is performed by an expectation-maximization algorithm for a Gaussian mixture model, which is used to separate relevant domain information in texts from noise. This refined DDD system was presented in the English all-words task.</Paragraph>
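The role of the Gaussian mixture in DRE can be illustrated with a minimal two-component EM in one dimension: one component absorbs the mass of low "noise" relevance scores, the other the few genuinely relevant domains. The scores below are invented, and the sketch does not reproduce the actual DRE model.

```python
import math

def em_gmm_1d(xs, iters=50):
    """EM for a two-component 1-D Gaussian mixture: a toy stand-in for
    DRE's separation of relevant domain scores from noise."""
    mu = [min(xs), max(xs)]          # initialize the means at the extremes
    sigma = [1.0, 1.0]
    w = [0.5, 0.5]                   # mixing weights
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point
        resp = []
        for x in xs:
            p = [w[k] * math.exp(-(x - mu[k]) ** 2 / (2 * sigma[k] ** 2))
                 / (sigma[k] * math.sqrt(2 * math.pi)) for k in range(2)]
            z = sum(p) or 1e-300
            resp.append([pk / z for pk in p])
        # M-step: re-estimate weights, means and variances
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            w[k] = nk / len(xs)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sigma[k] = math.sqrt(max(var, 1e-6))  # floor to avoid collapse
    return mu, sigma, w

# Hypothetical per-domain relevance scores: a noisy mass near 0.1 and a
# small cluster of relevant domains near 0.9.
scores = [0.05, 0.08, 0.10, 0.12, 0.85, 0.90, 0.95]
mu, sigma, w = em_gmm_1d(scores)
```

After fitting, scores assigned to the higher-mean component would be treated as relevant domains and the rest discarded as noise.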
    <Paragraph position="2"> Originally DDD was developed to assess the usefulness of domain information for WSD. Thus it did not exploit other knowledge sources commonly used for disambiguation (e.g. syntactic patterns or collocations). As a consequence, the DDD system achieves quite good precision (it disambiguates domain words well), but its recall is not competitive with other state-of-the-art techniques. On the other hand, DDD outperforms the state of the art among unsupervised systems, demonstrating the usefulness of domain information for WSD.</Paragraph>
    <Paragraph position="3"> In addition, the DDD approach requires domain annotations for word senses (for the experiments we used WORDNET DOMAINS, a lexical resource developed at IRST). Like all manual annotations, such an operation is costly (more than two person-years were spent labeling the whole WORDNET DOMAINS structure) and affected by subjectivity.</Paragraph>
    <Paragraph position="4"> Thus, one drawback of the DDD methodology was a lack of portability across languages and across different sense repositories (unless synset-aligned WordNets are available).</Paragraph>
    <Paragraph position="5"> Besides the improved DDD, our other proposals for Senseval-3 attempt to overcome these issues.</Paragraph>
    <Paragraph position="6"> To deal with the problem of requiring a domain-annotated WORDNET, we experimented with a novel methodology to acquire domain information automatically from corpora. To this end we estimated term similarity from a large-scale corpus, exploiting the assumption that semantic domains are sets of very closely related terms. In particular, we implemented a variation of Latent Semantic Analysis (LSA) to obtain a vector representation for words, texts and synsets. LSA performs a dimensionality reduction in the feature space describing both texts and words, implicitly capturing the notion of semantic domains required by DDD. To perform disambiguation, LSA vectors have been estimated for the synsets in WORDNET. We also participated in the English all-words task with a first prototype (DDD-LSA) that exploits LSA. As far as the lexical sample tasks are concerned, we participated in the English, Italian, Spanish, Catalan and Basque tasks. For these tasks, we explored the direction of pattern abstraction for WSD. Pattern abstraction is an effective methodology for WSD (Mihalcea, 2002). Our preliminary experiments were performed using TIES, a generalized Information Extraction environment developed at IRST that implements the boosted wrapper induction algorithm (Freitag and Kushmerick, 2000). The main limitation of such an approach is, once more, the integration of different knowledge sources. In particular, paradigmatic information seems hard to represent in the TIES framework, which motivated our decision to exploit kernel methods for WSD.</Paragraph>
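The LSA step can be sketched as a plain truncated SVD over a toy term-by-text matrix. The counts, the term list and the choice of k are illustrative, not the paper's settings; the point is that terms from the same domain end up close in the reduced space even if they never co-occur row-wise.

```python
import numpy as np

# Toy term-by-text count matrix: rows = terms, columns = texts.
# Texts 0-1 are financial, text 2 is about rivers (hypothetical counts).
terms = ["bank", "money", "loan", "river", "water"]
X = np.array([
    [2, 3, 0],   # bank
    [3, 2, 0],   # money
    [2, 3, 0],   # loan
    [0, 0, 3],   # river
    [0, 0, 2],   # water
], dtype=float)

# Truncated SVD: keep k latent dimensions; term vectors live in U_k * S_k.
U, S, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
term_vecs = U[:, :k] * S[:k]

def sim(i, j):
    """Cosine similarity between two terms in the reduced LSA space."""
    u, v = term_vecs[i], term_vecs[j]
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

money_loan = sim(1, 2)   # same (financial) domain: high similarity
money_water = sim(1, 4)  # different domains: near zero
```

Terms sharing a domain collapse onto the same latent direction, which is the sense in which the reduced space "captures" semantic domains.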
    <Paragraph position="7"> Kernel methods are an area of recent interest in Machine Learning. Kernels are similarity functions between instances that allow the integration of different knowledge sources and the explicit modeling of linguistic insights within the powerful framework of support vector machine classification. For Senseval-3 we implemented the Kernels-WSD system, which exploits kernel methods to perform the following operations: (i) pattern abstraction; (ii) combination of different knowledge sources, in particular domain information and syntagmatic information; (iii) integration of unsupervised term proximity estimation into the supervised framework.</Paragraph>
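The knowledge-source combination in (ii) rests on a closure property: a non-negatively weighted sum of valid kernels is itself a valid kernel, so heterogeneous similarity functions can be merged into one SVM. The sketch below uses hypothetical instance representations and kernels, not the actual Kernels-WSD features.

```python
def combined_kernel(x, y, kernels, weights):
    """Weighted sum of kernels: with non-negative weights, the result
    is again a valid (positive semi-definite) kernel."""
    return sum(wt * kf(x, y) for kf, wt in zip(kernels, weights))

# Hypothetical instance encoding: a domain-score vector plus a set of
# syntagmatic patterns around the target word.
def domain_kernel(x, y):
    """Similarity of the two instances' domain vectors (dot product)."""
    return sum(a * b for a, b in zip(x["domains"], y["domains"]))

def syntagmatic_kernel(x, y):
    """Number of local patterns the two instances share."""
    return len(x["patterns"] & y["patterns"])

a = {"domains": [0.9, 0.1], "patterns": {"PREV:central", "NEXT:account"}}
b = {"domains": [0.8, 0.2], "patterns": {"NEXT:account"}}

k_ab = combined_kernel(a, b, [domain_kernel, syntagmatic_kernel], [0.5, 0.5])
```

Such a combined kernel can be handed to any kernelized SVM implementation, which is how domain and syntagmatic evidence enter a single classifier.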
    <Paragraph position="8"> The paper is structured as follows. Section 2 introduces LSA and its relation to semantic domains. Section 3 presents the systems for the English all-words task (i.e. DDD and DDD-LSA). Section 4 reports our supervised approaches.</Paragraph>
    <Paragraph position="9"> In particular, the TIES system is described in Section 4.1, while the approach based on kernel methods is discussed in Section 4.2.</Paragraph>
  </Section>
</Paper>