File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/93/h93-1051_intro.xml
Size: 1,244 bytes
Last Modified: 2025-10-06 14:05:30
<?xml version="1.0" standalone="yes"?>
<Paper uid="H93-1051">
<Title>CORPUS-BASED STATISTICAL SENSE RESOLUTION</Title>
<Section position="3" start_page="0" end_page="0" type="intro">
<SectionTitle> 1. INTRODUCTION </SectionTitle>
<Paragraph position="0"> The goal of this study is to systematically explore the effects of such variables as the number of senses per word and the number of training examples per sense on corpus-based statistical sense resolution methods. To enable us to study the effects of the number of word senses, we selected the highly polysemous noun "line", which has 25 senses in WordNet. Automatic sense resolution systems need to resolve highly polysemous words. As Zipf [2] pointed out in 1945, frequently occurring words tend to be polysemous.</Paragraph>
<Paragraph position="1"> The words encountered in a given text will have far greater polysemy than one would assume by simply taking the overall percentage of polysemous words in the language. Even though 86% of the nouns in WordNet have a single sense, the mean number of WordNet senses per word for the one hundred most frequently occurring nouns in the Brown Corpus is 5.15, with only eight words having a single sense.</Paragraph>
</Section>
</Paper>