<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-2503">
  <Title>Relating WordNet Senses for Word Sense Disambiguation</Title>
  <Section position="4" start_page="18" end_page="18" type="relat">
    <SectionTitle>
2 Related Work
</SectionTitle>
    <Paragraph position="0"> There is a significant amount of previous work on grouping WordNet word senses using a number of different information sources, such as predicate-argument structure (Palmer et al., forthcoming), information from WordNet (Mihalcea and Moldovan, 2001; Tomuro, 2001) and other lexical resources (Peters and Peters, 1998), translations, system confusability, topic signature and contextual evidence (Agirre and Lopez de Lacalle, 2003). There is also work on grouping senses of other inventories using information in the inventory (Dolan, 1994) along with information retrieval techniques (Chen and Chang, 1998).</Paragraph>
    <Paragraph position="1"> One method presented here (referred to as DIST and described in section 3) relates most closely to that of Agirre and Lopez de Lacalle (2003). They use contexts of the senses gathered directly, either from manually sense-tagged corpora or from instances of &quot;monosemous relatives&quot;, which are monosemous words related to one of the target word senses in WordNet. We use contexts of occurrence indirectly. We obtain &quot;nearest neighbours&quot; which occur in similar contexts to the target word. A vector is created for each word sense, holding a WordNet similarity score between the sense and each nearest neighbour of the target word. While related senses may not share many contexts directly, because of sparse data, they may have semantic associations with the same subset of words that share similar distributional contexts with the target word. This method avoids reliance on sense-tagged data or monosemous relatives because the distributional neighbours can be obtained automatically from raw text.</Paragraph>
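The sense vectors described above can be sketched as follows. This is a minimal illustration only, not the implementation used in the paper: the neighbour list, the sense labels and the similarity scores in TOY_SIM are hypothetical, and cosine is used purely as a placeholder comparison, since this section does not say how the vectors are compared.

```python
from math import sqrt
from typing import Callable, Dict, List, Tuple

# Hypothetical similarity scores between senses of "bank" and its
# distributional nearest neighbours (invented values, for illustration).
TOY_SIM: Dict[Tuple[str, str], float] = {
    ("bank#1", "lender"): 0.9, ("bank#1", "shore"): 0.1, ("bank#1", "institution"): 0.8,
    ("bank#2", "lender"): 0.1, ("bank#2", "shore"): 0.9, ("bank#2", "institution"): 0.1,
    ("bank#3", "lender"): 0.7, ("bank#3", "shore"): 0.2, ("bank#3", "institution"): 0.9,
}


def toy_similarity(sense: str, neighbour: str) -> float:
    """Stand-in for a WordNet similarity score between a sense and a word."""
    return TOY_SIM.get((sense, neighbour), 0.0)


def sense_vector(sense: str, neighbours: List[str],
                 sim: Callable[[str, str], float]) -> List[float]:
    """One dimension per nearest neighbour, holding sim(sense, neighbour)."""
    return [sim(sense, n) for n in neighbours]


def cosine(u: List[float], v: List[float]) -> float:
    """Placeholder comparison of two sense vectors (an assumption)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


neighbours = ["lender", "shore", "institution"]
vectors = {s: sense_vector(s, neighbours, toy_similarity)
           for s in ("bank#1", "bank#2", "bank#3")}

# Senses associated with the same neighbours end up with similar vectors.
related = cosine(vectors["bank#1"], vectors["bank#3"])
unrelated = cosine(vectors["bank#1"], vectors["bank#2"])
print(related > unrelated)
```

Because the vectors are indexed by the neighbours of the target word rather than by raw contexts, two senses can be compared even when no sense-tagged occurrences of either are available.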
    <Paragraph position="2"> Our other method relates to the findings of Kohomban and Lee (2005). We use the Jiang-Conrath score (JCN) in the WordNet Similarity Package. This is a distance measure between WordNet senses given corpus frequency counts and the structure of the WordNet hierarchy. It is described in more detail below. Kohomban and Lee (2005) get good results on disambiguation of the SENSEVAL all-words tasks by using the 25 unique beginners from the WordNet hierarchy to train a coarse-grained WSD system and then using a first sense heuristic (provided using the frequency data in SemCor) to determine the fine-grained output. This shows that the structure of WordNet is indeed helpful when selecting coarse senses for WSD. We use the JCN measure to contrast with our DIST measure, which uses a combination of distributional neighbours and JCN. We have experimented only with nouns to date, although in principle our method can be extended to other parts of speech.</Paragraph>
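For orientation, the Jiang-Conrath distance in its standard formulation combines the information content (IC) of two concepts with that of their lowest common subsumer: dist(a, b) = IC(a) + IC(b) - 2 * IC(lcs(a, b)), where IC(c) = -log p(c). The sketch below computes it over a tiny invented hierarchy. The concept names, the PARENT map and the COUNTS values are hypothetical, and the code is not the WordNet Similarity Package implementation.

```python
from math import log
from typing import Dict, List, Optional

# Hypothetical is-a hierarchy (child -> parent) and per-concept corpus counts.
PARENT: Dict[str, Optional[str]] = {
    "entity": None,
    "institution": "entity",
    "slope": "entity",
    "bank_financial": "institution",
    "bank_building": "institution",
    "bank_river": "slope",
}
COUNTS: Dict[str, int] = {
    "entity": 0, "institution": 10, "slope": 10,
    "bank_financial": 20, "bank_building": 10, "bank_river": 10,
}


def ancestors(concept: str) -> List[str]:
    """The concept itself followed by its ancestors up to the root."""
    path: List[str] = []
    current: Optional[str] = concept
    while current is not None:
        path.append(current)
        current = PARENT[current]
    return path


def cumulative_count(concept: str) -> int:
    """Own count plus the counts of every concept the given one subsumes."""
    return sum(n for c, n in COUNTS.items() if concept in ancestors(c))


def information_content(concept: str) -> float:
    """IC(c) = -log p(c), with p(c) estimated from cumulative counts."""
    return -log(cumulative_count(concept) / cumulative_count("entity"))


def lowest_common_subsumer(a: str, b: str) -> str:
    """Most specific concept that subsumes both a and b."""
    b_ancestors = set(ancestors(b))
    return next(c for c in ancestors(a) if c in b_ancestors)


def jcn_distance(a: str, b: str) -> float:
    """Jiang-Conrath distance: IC(a) + IC(b) - 2 * IC(lcs(a, b))."""
    lcs = lowest_common_subsumer(a, b)
    return (information_content(a) + information_content(b)
            - 2 * information_content(lcs))


close = jcn_distance("bank_financial", "bank_building")
far = jcn_distance("bank_financial", "bank_river")
print(far > close)
```

In this toy hierarchy the two senses under "institution" come out closer than the pair whose only shared subsumer is the root, which is the behaviour that makes the measure usable for relating senses.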
  </Section>
</Paper>