<?xml version="1.0" standalone="yes"?> <Paper uid="C02-1097"> <Title>Word Sense Disambiguation using Static and Dynamic Sense Vectors</Title> <Section position="2" start_page="2" end_page="4" type="intro"> <SectionTitle> 1. Introduction </SectionTitle> <Paragraph position="0"> It is common in WSD to use contextual information from training data (Agirre et al.; Escudero et al., 2000; Gruber, 1991; Schutze, 1998). Words co-occurring within a limited context window support one sense of a semantically ambiguous word over the others; the problem is to find the most effective patterns for capturing the right sense. Words used with the same sense tend to share similar context and co-occurrence information (Rigau et al., 1997), and contextual words near an ambiguous word yield more effective patterns or features than those far from it (Chen et al., 1998). In this paper, we represent each sense of a word as a vector in word space. First, the contextual words in the sense-tagged training data are represented as context vectors. We denote the target ambiguous word in a given context by 'Wt'; this context may consist of several sentences and is represented by its contextual words.</Paragraph> <Paragraph position="1"> Agirre et al. (1996) define the term 'conceptual density' based on how many nodes are hit between a WordNet node and the target word together with its context. Unlike conceptual density, the 'local density' used in this paper does not rely on a semantic network such as WordNet but uses only the contextual words surrounding the given target word.</Paragraph> <Paragraph position="2"> In this paper, the English SENSEVAL-2 data for the lexical sample task is used as sense-tagged training data. It is sampled from BNC-2, the Penn Treebank (comprising components from the Wall Street Journal, Brown, and IBM manuals), and so on. All items in the lexical sample are specific to one word class: noun, verb, or adjective.
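The pipeline outlined in this introduction (context vectors over contextual words, local-density weighting, sense vectors as centroids, and cosine-based retrieval of the most relevant training samples) can be sketched as follows. This is an illustrative sketch, not the authors' code: the toy sense-tagged samples, the inverse-distance weighting used here as a stand-in for the paper's local density, and all function names are assumptions.

```python
# Sketch of static sense vectors and cosine-based 'automatic selective
# sampling' for WSD. Toy data and the weighting scheme are assumptions.
import math
from collections import defaultdict

def context_vector(words, target, weight_fn):
    """Bag-of-words vector over the contextual words of `target`,
    weighted by distance (a stand-in for the paper's local density)."""
    vec = defaultdict(float)
    t = words.index(target)
    for i, word in enumerate(words):
        if i != t:
            vec[word] += weight_fn(abs(i - t))
    return dict(vec)

def centroid(vectors):
    """Static sense vector: the centroid of a sense's context vectors."""
    c = defaultdict(float)
    for v in vectors:
        for word, x in v.items():
            c[word] += x / len(vectors)
    return dict(c)

def cosine(u, v):
    dot = sum(u[word] * v[word] for word in u if word in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def n_best(query, samples, n):
    """'Automatic selective sampling': keep the N training samples whose
    context vectors are most cosine-similar to the query context."""
    return sorted(samples, key=lambda s: cosine(query, s), reverse=True)[:n]

# Toy sense-tagged training samples for the ambiguous word 'bank'.
weight = lambda d: 1.0 / d  # assumed inverse-distance weighting
train = {
    "bank/finance": [context_vector(s.split(), "bank", weight) for s in
        ["the bank lent money", "deposit money at the bank"]],
    "bank/river":   [context_vector(s.split(), "bank", weight) for s in
        ["the river bank flooded", "grass on the bank of the river"]],
}
static = {sense: centroid(vs) for sense, vs in train.items()}

# Disambiguate a new occurrence by nearest static sense vector.
query = context_vector("money in the bank".split(), "bank", weight)
best = max(static, key=lambda sense: cosine(query, static[sense]))
```

A 'dynamic' sense vector would be built the same way, but as the centroid of only the N best training samples that `n_best` retrieves for the current context, rather than of all samples for the sense.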
Sense-tagged training data is composed of training samples, each of which supports a certain sense of a target word. The words in each context vector are weighted with local density. Then each sense of a target word can be represented as a sense vector, the centroid of its context vectors in word space.</Paragraph> <Paragraph position="3"> However, if the training samples contain noise, it is difficult to capture effective patterns for WSD (Atsushi et al., 1998). Word occurrences in context are too diverse to capture the right pattern for WSD, and the dimensionality of the contextual word space becomes very large if we use all words in the training samples. To avoid these problems, we use an automated hybrid version of selective sampling, which we call 'automatic selective sampling'. The selection is based on cosine similarity: for a given target word and its context, the method retrieves the N best relevant training samples by comparing the cosine similarity between the given context and the indexed context vectors of the training samples. Using these samples, we can construct another set of sense vectors for each sense of the target word. The 'automatic selective sampling' method thus makes it possible to use training samples with higher discriminative power.</Paragraph> <Paragraph position="4"> This paper is organized as follows: Section 2 presents the details of our method, Section 3 describes the experiments, and Section 4 draws conclusions and discusses future work.</Paragraph> </Section> </Paper>