File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/04/w04-2013_concl.xml
Size: 1,837 bytes
Last Modified: 2025-10-06 13:54:20
<?xml version="1.0" standalone="yes"?> <Paper uid="W04-2013"> <Title>WordNet-based Text Document Clustering</Title> <Section position="6" start_page="2" end_page="2" type="concl"> <SectionTitle> 6 Conclusions </SectionTitle> <Paragraph position="0"> The main finding of this work is that including synonyms and hypernyms, disambiguated only by PoS tags, does not improve clustering effectiveness. This could be attributed to the noise introduced by all the incorrect senses that are retrieved from WordNet. It appears that disambiguation by PoS alone is insufficient to reveal the full potential of including background knowledge. One obviously impractical alternative would be manual sense disambiguation. The automated approach of using only the most common sense, adopted by Hotho et al. (2003b), seems more realistic yet still beneficial.</Paragraph> <Paragraph position="1"> When comparing the use of different levels of hypernyms, the results indicate that including only five levels is better than including all levels. A possible explanation is that the terms become too general when all hypernym levels are included.</Paragraph> <Paragraph position="2"> Further research is needed to determine whether this way of document clustering can be improved by appropriately selecting a subset of the synonyms and hypernyms used here. There are a number of corpus-based approaches to word-sense disambiguation (Resnik and Yarowsky, 2000), which could be used for this purpose.</Paragraph> <Paragraph position="3"> The other point of interest that could be analysed further is why using five levels of hypernyms produces better results than using all levels of hypernyms. It would be interesting to see whether this effect persists when better disambiguation is used to determine 'correct' word senses.</Paragraph> </Section> </Paper>