<?xml version="1.0" standalone="yes"?> <Paper uid="C04-1132"> <Title>Learning a Robust Word Sense Disambiguation Model using Hypernyms in Definition Sentences</Title> <Section position="8" start_page="10" end_page="10" type="concl"> <SectionTitle> 6 Related Work </SectionTitle> <Paragraph position="0"> As described in Section 1, the goal of this project was to improve the robustness of the WSD system. One promising way to construct a robust WSD system is unsupervised learning, such as with the EM algorithm (Manning and Schütze, 1999), i.e., training a WSD classifier from an unlabeled data set. Our approach, in contrast, uses a machine-readable dictionary in addition to a corpus as a knowledge resource for WSD. Note that we used hypernyms from definition sentences in a dictionary to train the Naive Bayes classifier, and this worked well for words that did not occur frequently in the corpus. However, we have not yet compared our method with unsupervised learning empirically. This will be one of our future projects.</Paragraph> <Paragraph position="1"> Using hypernyms from definition sentences is similar to using semantic classes derived from a thesaurus. One advantage of our method is that a thesaurus is not required when word senses are defined according to a machine-readable dictionary. Furthermore, our method trains a probabilistic model that predicts the hypernym of a word, whereas most previous approaches use semantic classes as features (i.e., in the condition of the posterior probability in the case of the Naive Bayes model). In fact, we also use features associated with semantic classes derived from the thesaurus, as described in Section 2.</Paragraph> <Paragraph position="2"> Several previous studies have used both a corpus and a machine-readable dictionary for WSD (Litkowski, 2002; Rigau et al., 1997; Stevenson and Wilks, 2001). 
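The hypernym-prediction model described above can be sketched as a standard multinomial Naive Bayes classifier whose target class is a hypernym rather than a sense label. This is a minimal illustration with hypothetical names; add-one smoothing is an assumption here, not necessarily the paper's exact estimator.

```python
from collections import Counter, defaultdict
import math

class HypernymNaiveBayes:
    """Naive Bayes classifier whose prediction target is a hypernym
    taken from a sense's dictionary definition sentence."""

    def __init__(self):
        self.class_counts = Counter()               # hypernym class frequencies
        self.feature_counts = defaultdict(Counter)  # per-class feature frequencies
        self.vocab = set()

    def train(self, examples):
        # examples: iterable of (context_features, hypernym) pairs
        for features, hypernym in examples:
            self.class_counts[hypernym] += 1
            for f in features:
                self.feature_counts[hypernym][f] += 1
                self.vocab.add(f)

    def predict(self, features):
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for h, count in self.class_counts.items():
            # log P(h) + sum of log P(f | h), with add-one smoothing
            score = math.log(count / total)
            denom = sum(self.feature_counts[h].values()) + len(self.vocab)
            for f in features:
                score += math.log((self.feature_counts[h][f] + 1) / denom)
            if score > best_score:
                best, best_score = h, score
        return best
```

Because the model backs off to a hypernym shared by many senses, it can still score candidates for words that are rare in the training corpus, which is where the paper reports its benefit over a purely corpus-trained classifier.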
The difference between those methods and ours lies in how information derived from the dictionary is used for WSD. Training a probabilistic model that predicts a hypernym in a dictionary is unique to our approach. However, these methods do not compete with ours; in fact, the robustness of the WSD system could be further improved by combining them with the method described in this paper.</Paragraph> <Paragraph position="3"> 7 Conclusion This paper has proposed a method to develop a robust WSD system. We combined a WSD classifier obtained by supervised learning for high-frequency words with a classifier using hypernyms from definition sentences in a dictionary for low-frequency words. Experimental results showed that both recall and applicability were remarkably improved by our method. In the future, we plan to investigate the optimal way to combine these two classifiers, or to train a single probabilistic model using hypernyms in definition sentences that is suitable for both high- and low-frequency words.</Paragraph> </Section></Paper>