<?xml version="1.0" standalone="yes"?> <Paper uid="W04-0849"> <Title>Class-based Collocations for Word-Sense Disambiguation</Title> <Section position="2" start_page="0" end_page="2" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Supervised systems for word-sense disambiguation (WSD) often rely upon word collocations (i.e., sense-specific keywords) to provide clues on the most likely sense for a word given the context. In the second Senseval competition, these features figured prominently among the feature sets for the leading systems (Mihalcea, 2002; Yarowsky et al., 2001; Seo et al., 2001).</Paragraph> <Paragraph position="1"> A limitation of such features is that the words selected must occur in the test data in order for the features to apply. To alleviate this problem, class-based approaches augment word-level features with category-level ones (Ide and Véronis, 1998; Jurafsky and Martin, 2000). When applied to collocational features, this approach effectively uses class labels rather than wordforms in deriving the collocational features.</Paragraph> <Paragraph position="2"> This research focuses on the determination of class-based collocations to improve word-sense disambiguation. We do not address refinement of existing machine learning algorithms; therefore, a commonly used decision tree algorithm is employed to combine the various features when performing classification.</Paragraph> <Paragraph position="3"> This paper describes the NMSU-Pitt-UNCA system we developed for the third Senseval competition. Section 2 presents an overview of the feature set used in the system.</Paragraph> <Paragraph position="4"> Section 3 describes how the class-based collocations are derived. 
Section 4 shows the results over the Senseval-3 data and includes a detailed analysis of the performance of the various collocational features.</Paragraph> <Section position="1" start_page="0" end_page="2" type="sub_section"> <SectionTitle> 2 System Overview </SectionTitle> <Paragraph position="0"> We use a decision tree algorithm for word-sense disambiguation that combines features from the local context of the target word with other lexical features representing the broader context.</Paragraph> <Paragraph position="1"> Figure 1 presents the features that are used in this application. In the first Senseval competition, we used the first two groups of features, Local-context features and Collocational features, with competitive results (O'Hara et al., 2000).</Paragraph> <Paragraph position="2"> Five of the local-context features represent the part of speech (POS) of words immediately surrounding the target word. These five features are POS±i (for i from -2 to +2), where POS+1, for example, represents the POS of the word immediately following the target word.</Paragraph> <Paragraph position="3"> Five other local-context features represent the word tokens immediately surrounding the target word (Word±i, for i from -2 to +2).</Paragraph> <Paragraph position="4"> Each Word±i feature is multi-valued; its values correspond to all possible word tokens.</Paragraph> <Paragraph position="5"> There is a collocation feature WordColl_s defined for each sense s of the target word. It is a binary feature, representing the absence or presence of any word in a set specifically chosen for s. A word w that occurs more than once in the training data is included in the collocation set for sense s if the relative percent gain in the conditional probability over the prior probability is sufficiently large. (Figure 1: All collocational features are binary indicators for sense s, except for WordColl∗.)</Paragraph> <Paragraph position="7"> Specifically, w is included for sense s when (P(s|w) - P(s)) / P(s) >= 0.20.</Paragraph> <Paragraph position="8"> This threshold was determined to be effective via an optimization search over the Senseval-2 data. WordColl∗ represents a set of non-sense-specific collocations (i.e., not necessarily indicative of any one sense), chosen via the G criterion (Wiebe et al., 1998). In contrast to the WordColl_s features, each of which is a separate binary feature, the words contained in the set WordColl∗ serve as values in a single enumerated feature. These features are augmented with class-based collocational features that represent information about word relationships derived from three separate sources: 1) WordNet (Miller, 1990) hypernym relations (HyperColl); 2) cluster-based word similarity classes (SimilarColl); and 3) relatedness inferred from dictionary definition analysis (DictColl). The information inherent in the sources from which these class-based features are derived allows words that do not occur in the training data context to be considered as collocations during classification.</Paragraph> </Section> </Section> </Paper>
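The collocation-selection rule described in Section 2 (include a word w in the set for sense s when the relative gain of the conditional probability P(s|w) over the prior P(s) is at least 0.20, and w occurs more than once in training) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation; the function name, data layout, and toy examples are hypothetical.

```python
from collections import Counter

GAIN_THRESHOLD = 0.20  # threshold the paper reports tuning on Senseval-2 data

def collocation_set(training, sense):
    """Select collocation words for one sense.

    training: list of (word, sense) observations, one per occurrence of a
    candidate word in the labeled context of the target word (toy layout).
    """
    sense_counts = Counter(s for _, s in training)
    word_counts = Counter(w for w, _ in training)
    pair_counts = Counter(training)
    total = len(training)
    prior = sense_counts[sense] / total          # P(s)
    chosen = set()
    for word, n_w in word_counts.items():
        if n_w <= 1:                             # must occur more than once
            continue
        cond = pair_counts[(word, sense)] / n_w  # P(s | w)
        gain = (cond - prior) / prior            # relative percent gain
        if gain >= GAIN_THRESHOLD:
            chosen.add(word)
    return chosen
```

In a full system each selected set would back one binary WordColl_s feature per sense: the feature fires when any word from the set appears in the context of the test instance.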