<?xml version="1.0" standalone="yes"?>
<Paper uid="P95-1026">
<Title>UNSUPERVISED WORD SENSE DISAMBIGUATION RIVALING SUPERVISED METHODS</Title>
<Section position="15" start_page="194" end_page="195" type="concl">
<SectionTitle> 10 Conclusion </SectionTitle>
<Paragraph position="0"> In essence, our algorithm works by harnessing several powerful, empirically observed properties of language, namely the strong tendency for words to exhibit only one sense per collocation and per discourse. It attempts to derive maximal leverage from these properties by modeling a rich diversity of collocational relationships. It thus uses more discriminating information than is available to algorithms that treat documents as bags of words, ignoring relative position and sequence. Indeed, one of the strengths of this work is that it is sensitive to a wider range of language detail than is typically captured in statistical sense-disambiguation algorithms.</Paragraph>
<Paragraph position="1"> Also, for an unsupervised algorithm it works surprisingly well, directly outperforming Schütze's unsupervised algorithm 96.7% to 92.2%, on a test of the same 4 words. More impressively, it achieves nearly the same performance as the supervised algorithm given identical training contexts (95.5% vs. 96.1%), and in some cases actually achieves superior performance when using the one-sense-per-discourse constraint (96.5% vs. 96.1%). This indicates that a large sense-tagged training corpus, with its attendant cost, may not be necessary to achieve accurate word-sense disambiguation.</Paragraph>
</Section>
</Paper>