<?xml version="1.0" standalone="yes"?> <Paper uid="W04-0837"> <Title>Using Automatically Acquired Predominant Senses for Word Sense Disambiguation</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Use of the first sense listed in WordNet gives 65% precision and recall for all PoS on these items. The fourth column of table 1 gives the random baseline, which reflects the polysemy of the data. Table 2 shows results obtained when we use the most common sense for an item and PoS using the frequency in the SENSEVAL-2 English all-words data itself. Recall is lower than precision since we only use the heuristic on lemmas which have occurred more than once and where there is one sense with a greater frequency than the others, apart from trivial monosemous cases (a schematic sketch of this heuristic is given after this section). Precision is higher in table 2 than in table 1, reflecting the difference between an a priori first sense determined by SemCor and an upper bound on the performance of this heuristic for this data. This upper bound is quite high because of the very skewed sense distributions in the test data itself. The upper bound for a document, or document collection, will depend on how homogeneous the content of that document collection is, and on the skew of the word sense distributions therein. Indeed, the bias towards one sense for a given word in a given document or discourse was observed by Gale et al. (1992).</Paragraph> <Paragraph position="1"> Whilst a first sense heuristic based on a sense-tagged corpus such as SemCor is clearly useful, there is a case for obtaining a first, or predominant, sense from untagged corpus data, so that a WSD system can be tuned to a given genre or domain (McCarthy et al., 2004), and also because there will be words that occur with insufficient frequency in the available hand-tagged resources. SemCor comprises a relatively small sample of 250,000 words.</Paragraph> <Paragraph position="2"> There are words where the first sense in WordNet is counter-intuitive, both because SemCor is a small sample and because, where the frequency data does not indicate a first sense, the ordering is arbitrary. For example, the first sense of tiger in WordNet is audacious person, whereas one might expect that carnivorous animal is a more common usage.</Paragraph> <Paragraph position="3"> Assuming that one had an accurate WSD system, one could obtain frequency counts for senses and rank them with these counts. However, the most accurate WSD systems are those which require manually sense-tagged data in the first place, and their accuracy depends on the quantity of training examples available (Yarowsky and Florian, 2002). We are investigating a method of automatically ranking WordNet senses from raw text, with no reliance on manually sense-tagged data such as that in SemCor.</Paragraph> <Paragraph position="4"> The paper is structured as follows. We discuss our method in the following section. Section 3 describes an experiment using predominant senses acquired from the BNC, evaluated on the SENSEVAL-2 English all-words task. In section 4 we present our results on the SENSEVAL-3 English all-words task.</Paragraph> <Paragraph position="5"> We discuss related work in section 5 and conclude in section 6.</Paragraph> </Section>
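As an illustration of the upper-bound heuristic described above, the sketch below builds a most-common-sense table from sense-tagged data, answering only for lemmas that occurred more than once and have a single most frequent sense (trivially monosemous lemmas are always answered). This is a minimal illustration, not the authors' scorer; the token format and function names are assumptions made for the example.

```python
# A minimal sketch (not the authors' code) of the most-common-sense upper bound:
# answer each (lemma, PoS) with its most frequent sense in the sense-tagged data,
# but only when the lemma occurred more than once and one sense is strictly more
# frequent than the rest; trivially monosemous lemmas are always answered.
from collections import Counter, defaultdict
from typing import Dict, List, Optional, Tuple

Token = Tuple[str, str, str]  # (lemma, pos, gold_sense) -- assumed input format


def most_common_sense_table(tagged: List[Token]) -> Dict[Tuple[str, str], str]:
    counts: Dict[Tuple[str, str], Counter] = defaultdict(Counter)
    for lemma, pos, sense in tagged:
        counts[(lemma, pos)][sense] += 1
    table: Dict[Tuple[str, str], str] = {}
    for key, senses in counts.items():
        if len(senses) == 1:
            table[key] = next(iter(senses))   # monosemous: always answer
            continue
        if sum(senses.values()) < 2:
            continue                          # lemma seen only once: no answer
        ranked = senses.most_common(2)
        if ranked[0][1] == ranked[1][1]:
            continue                          # no single predominant sense: no answer
        table[key] = ranked[0][0]
    return table


def answer(lemma: str, pos: str, table: Dict[Tuple[str, str], str]) -> Optional[str]:
    # Returning None (no response) lowers recall but not precision, which is why
    # recall is below precision for this heuristic.
    return table.get((lemma, pos))
```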
<Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Method </SectionTitle> <Paragraph position="0"> The method is described in McCarthy et al. (2004), and we summarise it here. We acquire thesauruses for nouns, verbs, adjectives and adverbs based on the method proposed by Lin (1998), using grammatical relations output by the RASP parser (Briscoe and Carroll, 2002). The grammatical contexts used are listed in table 3, but there is scope for extending or restricting the contexts for a given PoS.

Table 3: Grammatical contexts used for each PoS
  Noun:      verb in direct object or subject relation; adjective or noun modifier
  Verb:      noun as direct object or subject
  Adjective: modified noun; modifying adverb
  Adverb:    modified adjective or verb
</Paragraph> <Paragraph position="1"> We use the thesauruses for ranking the senses of the target words. Each target word ($w$), e.g. plant, is associated in the thesaurus with a list of nearest neighbours ($n_j \in N_w$) with distributional similarity scores ($dss(w, n_j)$), e.g. factory 0.28, refinery 0.17, tree 0.14, etc. Distributional similarity is a measure indicating the degree to which two words, a word and its neighbour, occur in similar contexts. The neighbours reflect the various senses of the word ($ws_i \in senses(w)$). We assume that the quantity and similarity of the neighbours pertaining to different senses will reflect the relative dominance of those senses, because there will be more relational data for the more prevalent senses than for the less frequent senses. We relate the neighbours to the senses by a semantic similarity measure, $wnss(ws_i, n_j)$, computed using the WordNet similarity package (Patwardhan and Pedersen, 2003), where the sense of the neighbour ($ns_x$) that maximises the similarity to $ws_i$ is selected. The measure used for ranking the senses of a word is calculated from the distributional similarity scores of the neighbours, weighted by the semantic similarity between the neighbour and the sense of the target word, as shown in equation 1. The frequency data required by the semantic similarity measure (jcn; Jiang and Conrath, 1997) is obtained from the BNC, so that no hand-tagged data is used and our method is fully unsupervised.</Paragraph> <Paragraph position="2"> We rank each sense $ws_i \in senses(w)$ using:

$$\textrm{PrevalenceScore}(ws_i) = \sum_{n_j \in N_w} dss(w, n_j) \times \frac{wnss(ws_i, n_j)}{\sum_{ws_{i'} \in senses(w)} wnss(ws_{i'}, n_j)} \quad (1)$$
</Paragraph> <Paragraph position="4"> For SENSEVAL-3 we obtained thesaurus entries for all nouns, verbs, adjectives and adverbs using parsed text from the 90 million words of written English in the BNC. We created entries for words which occurred at least 10 times in frames involving the grammatical relations listed in table 3. We used 50 nearest neighbours for ranking, since this threshold has given good results in other experiments.</Paragraph>
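To make the ranking computation concrete, the following sketch implements equation (1) directly. It is a minimal illustration, not the authors' implementation: the neighbour list, the dss scores and the wnss function are supplied by the caller (in the actual method they come from the Lin-style thesaurus and the jcn measure of the WordNet similarity package), and the sense labels and similarity values in the toy run are invented for the example.

```python
# A minimal sketch of equation (1): score each sense ws_i of a target word by
# summing, over its distributional neighbours n_j, dss(w, n_j) weighted by the
# semantic similarity wnss(ws_i, n_j) normalised over all senses of w.
# The neighbour scores and the wnss function here are illustrative stand-ins.
from typing import Callable, Dict, List, Tuple


def rank_senses(
    senses: List[str],                    # sense identifiers of the target word
    neighbours: List[Tuple[str, float]],  # (neighbour word, dss score) pairs
    wnss: Callable[[str, str], float],    # semantic similarity: sense vs. neighbour
) -> Dict[str, float]:
    """Return the prevalence score of each sense (higher = more predominant)."""
    scores = {ws: 0.0 for ws in senses}
    for neighbour, dss in neighbours:
        sims = {ws: wnss(ws, neighbour) for ws in senses}
        total = sum(sims.values())
        if total == 0.0:
            continue  # this neighbour tells us nothing about the word's senses
        for ws in senses:
            scores[ws] += dss * sims[ws] / total
    return scores


if __name__ == "__main__":
    # Toy run for "plant", using the neighbours and dss scores quoted in the text;
    # the two sense labels and the wnss values are made up for illustration.
    toy_wnss = {
        ("plant#factory", "factory"): 0.9, ("plant#factory", "refinery"): 0.8,
        ("plant#factory", "tree"): 0.1,
        ("plant#living_thing", "factory"): 0.1, ("plant#living_thing", "refinery"): 0.1,
        ("plant#living_thing", "tree"): 0.9,
    }
    ranking = rank_senses(
        senses=["plant#factory", "plant#living_thing"],
        neighbours=[("factory", 0.28), ("refinery", 0.17), ("tree", 0.14)],
        wnss=lambda ws, n: toy_wnss.get((ws, n), 0.0),
    )
    print(sorted(ranking.items(), key=lambda kv: kv[1], reverse=True))
```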
</Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 Performance of the automatically acquired first sense on SENSEVAL-2 </SectionTitle> <Paragraph position="0"> We acquired sense rankings for polysemous nouns in WordNet 1.7.1 that occurred in at least 10 frames; this version of WordNet was used in preparation for SENSEVAL-3. We then applied the predominant sense heuristic from the automatically acquired rankings to the SENSEVAL-2 data. Recall and precision figures are calculated using the SENSEVAL-2 scorer; recall is therefore particularly low for any given PoS in isolation, since it is calculated over the entire corpus.</Paragraph> <Paragraph position="1"> The method produces lower results for verbs than for the other PoS. This is in line with the lower performance of the manually acquired first sense heuristic, and also reflects the greater polysemy of verbs shown by the lower random baseline in tables 1 and 2.</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 Results from SENSEVAL-3 </SectionTitle> <Paragraph position="0"> For SENSEVAL-3 we used the predominant senses from the automatic rankings for i) all PoS (autoPS) and ii) all PoS except verbs (autoPSNVs). The results are given in table 5. The "without U" results are used, since the lack of a response by our system occurred when there were no nearest neighbours, and so no ranking was available for selecting a predominant sense, rather than as an indication that the sense is missing from WordNet. Our system performs well in comparison with the results in SENSEVAL-2 for unsupervised systems which do not use manually labelled data such as SemCor.</Paragraph> </Section> <Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> 5 Related Work </SectionTitle> <Paragraph position="0"> There is some related work on ranking the senses of words. Buitelaar and Sacaleanu (2001) have previously explored ranking and selection of synsets in GermaNet for specific domains, using the words in a given synset, those related to it by hyponymy, and a term relevance measure taken from information retrieval. They evaluated their method on identifying domain-specific concepts, rather than for WSD. In recent work, Lapata and Brew (2004) obtain predominant senses of verbs occurring in subcategorization frames, where the senses of verbs are defined using Levin classes (Levin, 1993). They demonstrate that these priors are useful for WSD of verbs.</Paragraph> <Paragraph position="1"> Our ranking method is related to work by Pantel and Lin (2002), who use automatic thesauruses for discovering word senses from corpora, rather than for detecting predominance. In their work, the lists of neighbours are themselves clustered to bring out the various senses of the word. They evaluate using a WordNet similarity measure to determine the precision and recall of these discovered classes with respect to WordNet synsets.</Paragraph> </Section> </Paper>