File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/99/w99-0624_concl.xml
Size: 2,380 bytes
Last Modified: 2025-10-06 13:58:34
<?xml version="1.0" standalone="yes"?> <Paper uid="W99-0624"> <Title>Lexical ambiguity and Information Retrieval revisited</Title> <Section position="8" start_page="199" end_page="199" type="concl"> <SectionTitle> 7 Conclusions </SectionTitle> <Paragraph position="0"> We have revisited a number of previous experiments regarding lexical ambiguity and Information Retrieval, taking advantage of the manual annotations in our IR-Semcor collection. Within the limitations of our collection (mainly its reduced size), we can draw some conclusions: * Sense ambiguity could be more relevant to Information Retrieval than suggested by Sanderson's experiments with pseudo-words. In particular, his estimate that 90% accuracy is needed to benefit from Word Sense Disambiguation techniques does not hold for real ambiguous words in our collection.</Paragraph> <Paragraph position="1"> * Part-Of-Speech information, even if manually annotated, seems too discriminatory for Information Retrieval purposes. This clarifies the results obtained by Krovetz with an automatic POS tagger.</Paragraph> <Paragraph position="2"> * Taking phrases as indexing terms may decrease retrieval efficiency. Phrase indexing could be more useful, however, when queries demand a very precise kind of document and when the number of available documents is high.</Paragraph> <Paragraph position="3"> In our opinion, lexical ambiguity will become a central topic for Information Retrieval as the importance [...] [Figure: plot with x-axis recall (20 to 100); legend: No phrase indexing, Phrase indexing, #phrase operator in queries, Random baseline] [...] that the increasing multilinguality of the Internet is already producing). 
Although the problem of Word Sense Disambiguation is still far from being solved, we believe that specific disambiguation for (Cross-Language) Information Retrieval could achieve good results by weighting candidate senses without a special commitment to Part-Of-Speech differentiation. An interesting point is that the WordNet structure is not well suited for IR in this respect, as it keeps noun, verb and adjective synsets completely unrelated. The EuroWordNet multilingual database (Vossen, 1998), on the other hand, features cross-part-of-speech semantic relations that could be useful in an IR setting.</Paragraph> </Section> </Paper>