<?xml version="1.0" standalone="yes"?> <Paper uid="W97-0322"> <Title>Distinguishing Word Senses in Untagged Text</Title> <Section position="10" start_page="204" end_page="205" type="concl"> <SectionTitle> 8 Conclusions </SectionTitle> <Paragraph position="0"> Supervised learning approaches to word-sense disambiguation fall victim to the knowledge acquisition bottleneck: creating enough sense-tagged text to serve as a training sample is expensive and time-consuming. This bottleneck is eliminated through the use of unsupervised learning approaches that distinguish the sense of a word based only on features that can be automatically identified.</Paragraph> <Paragraph position="1"> In this study, we evaluated the performance of three unsupervised learning algorithms on the disambiguation of 13 words in naturally occurring text.</Paragraph> <Paragraph position="2"> The algorithms are McQuitty's similarity analysis, Ward's minimum-variance method, and the EM algorithm. Our findings show that each of these algorithms is negatively impacted by highly skewed sense distributions. Our methods and feature sets were found to be more successful in disambiguating nouns than adjectives or verbs. Overall, the most successful of our procedures was McQuitty's similarity analysis in combination with a high-dimensional feature set. In future work, we will investigate modifications to these algorithms and to feature set selection that are more effective on highly skewed sense distributions.</Paragraph> </Section> </Paper>