<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-2007">
  <Title>Word Sense Disambiguation Using Automatically Translated Sense Examples</Title>
  <Section position="2" start_page="0" end_page="45" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Results from recent Senseval workshops have shown that supervised Word Sense Disambiguation (WSD) systems tend to outperform their unsupervised counterparts. However, supervised systems rely on large amounts of accurately sense-annotated data to yield good results and such resources are very costly to produce. It is difficult for supervised WSD systems to perform well and reliably on words that do not have enough sense-tagged training data. This is the so-called knowledge acquisition bottleneck.</Paragraph>
    <Paragraph position="1"> To overcome this bottleneck, unsupervised WSD approaches have been proposed. Among them, systems under the multilingual paradigm have shown great promise (Gale et al., 1992; Dagan and Itai, 1994; Diab and Resnik, 2002; Ng et al., 2003; Li and Li, 2004; Chan and Ng, 2005; Wang and Carroll, 2005). The underlying hypothesis is that mappings between word forms and meanings differ from language to language. Much work has been done on extracting sense examples from parallel corpora for WSD. For example, Ng et al. (2003) proposed to train a classifier on sense examples acquired from word-aligned English-Chinese parallel corpora. They grouped senses that share the same Chinese translation, and the occurrences of the word on the English side of the parallel corpora were then considered to have been disambiguated and &amp;quot;sense tagged&amp;quot; by the appropriate Chinese translations. Their system was evaluated on the nouns in the Senseval-2 English lexical sample dataset, with promising results. Their follow-up work (Chan and Ng, 2005) successfully scaled up the approach and achieved very good performance on the Senseval-2 English all-word task.</Paragraph>
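The translation-based sense-tagging idea described above can be illustrated with a minimal sketch. The sense labels, translations and helper function here are hypothetical, not the authors' actual implementation:

```python
# Hypothetical senses of the English noun "bank", keyed to a Chinese
# translation; senses sharing a translation (bank%1, bank%2) are merged,
# as in the grouping step described by Ng et al. (2003).
sense_to_translation = {
    "bank%1": "yin2hang2",   # financial institution
    "bank%2": "yin2hang2",   # building housing the institution (merged)
    "bank%3": "he2an4",      # sloping land beside a body of water
}

def tag_english_occurrences(aligned_pairs):
    """Label each English occurrence with the sense group whose Chinese
    translation matches the word aligned to it in the parallel corpus."""
    tagged = []
    for english_context, chinese_word in aligned_pairs:
        senses = [s for s, t in sense_to_translation.items() if t == chinese_word]
        if senses:  # occurrence is "sense tagged" by its translation
            tagged.append((english_context, senses))
    return tagged

examples = tag_english_occurrences([
    ("deposit money in the bank", "yin2hang2"),
    ("fishing on the river bank", "he2an4"),
])
```

Each tagged English context then serves as a training example for the matching sense group, sidestepping manual annotation.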
    <Paragraph position="2"> Despite the promising results, there are problems with relying on parallel corpora. For example, there is a lack of matching occurrences for some Chinese translations to English senses. Thus gathering training examples for them might be difficult, as reported in (Chan and Ng, 2005). Also, parallel corpora themselves are rare resources and not available for many language pairs.</Paragraph>
    <Paragraph position="3"> Some researchers have sought approaches that use monolingual resources in a second language and then map between the two languages using bilingual dictionaries. For example, Dagan and Itai (1994) carried out WSD experiments using monolingual corpora, a bilingual lexicon and a parser for the source language. One problem with this method is that accurate parsers do not exist for many languages.</Paragraph>
    <Paragraph position="4"> Wang and Carroll (2005) proposed to use monolingual corpora and bilingual dictionaries to automatically acquire sense examples. Their system was unsupervised and achieved very promising results on the Senseval-2 lexical sample dataset.</Paragraph>
    <Paragraph position="5"> Their system also has better portability, i.e., it runs on any language pair for which a bilingual dictionary is available. However, sense examples acquired through dictionary-based word-by-word translation can provide only &amp;quot;bag-of-words&amp;quot; features. Many other features useful for machine learning (ML) algorithms, such as word order, part-of-speech (POS) tags and bigrams, are lost. A more attractive option is to translate Chinese text snippets using machine translation (MT) software, which provides richer contextual information that might be useful for WSD learners. Although MT systems themselves are expensive to build, once available they can be used repeatedly to automatically generate as much data as we want. This is an advantage over relying on other expensive resources such as manually sense-tagged data and parallel corpora, which are limited in size and for which producing additional data normally involves further costly investment.</Paragraph>
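The contrast between the two feature sets can be sketched as follows. This is an illustration under assumed tokenised input, not the paper's feature extractor: word-by-word dictionary translation supports only an unordered bag of words, while full MT output preserves word order and so also yields bigram features.

```python
def bag_of_words(tokens):
    """Unordered features: all that dictionary word-by-word translation supports."""
    return set(tokens)

def ordered_features(tokens):
    """Bag of words plus bigrams, available once word order is preserved,
    as in full MT output."""
    bigrams = {(a, b) for a, b in zip(tokens, tokens[1:])}
    return set(tokens) | bigrams

# Hypothetical MT-translated context for the target word "bank".
snippet = ["interest", "rate", "at", "the", "bank"]
bow = bag_of_words(snippet)
extra = ordered_features(snippet) - bow  # bigrams lost under word-by-word translation
```

The `extra` set contains exactly the order-sensitive features, such as the bigram ("the", "bank"), that a classifier trained on dictionary-translated examples never sees.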
    <Paragraph position="6"> We carried out experiments on acquiring sense examples using both MT software and a bilingual dictionary. We then trained an ML classifier on each set of sense examples and tested the resulting classifiers on coarse-grained and fine-grained gold-standard WSD datasets, respectively. On both test datasets, given the same number of training examples per word sense, the classifier using MT-translated sense examples outperformed the one using dictionary-translated examples. This confirms our assumption that a richer feature set, even when derived from a noisy source such as machine-translated text, can help ML algorithms. In addition, both systems performed very well compared with other state-of-the-art WSD systems. As expected, our system is particularly good at coarse-grained disambiguation: although unsupervised, it achieved performance competitive with state-of-the-art supervised systems.</Paragraph>
    <Paragraph position="7"> This paper is organised as follows: Section 2 revisits the sense-example acquisition process proposed by Wang and Carroll (2005) and then describes our adapted approach. Section 3 outlines the resources, the ML algorithm and the evaluation metrics that we used. Sections 4 and 5 detail the experiments we carried out on gold-standard datasets, together with our results and error analysis. Finally, Section 6 concludes the paper and outlines future directions.</Paragraph>
  </Section>
</Paper>