File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/04/w04-0846_abstr.xml
Size: 1,532 bytes
Last Modified: 2025-10-06 13:43:48
<?xml version="1.0" standalone="yes"?> <Paper uid="W04-0846"> <Title>Context Clustering for Word Sense Disambiguation Based on Modeling Pairwise Context Similarities</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> Traditionally, word sense disambiguation (WSD) involves a different context model for each individual word. This paper presents a new approach to WSD using weakly supervised learning. Statistical models are not trained for the contexts of each individual word, but for the similarities between context pairs at category level. The insight is that the correlation regularity between the sense distinction and the context distinction can be captured at category level, independent of individual words. This approach only requires a limited amount of existing annotated training corpus in order to disambiguate the entire vocabulary. A context clustering scheme is developed within the Bayesian framework. A maximum entropy model is then trained to represent the generative probability distribution of context similarities based on heterogeneous features, including trigger words and parsing structures. Statistical annealing is applied to derive the final context clusters by globally fitting the pairwise context similarity distribution. Benchmarking shows that this new approach significantly outperforms the existing WSD systems in the unsupervised category, and rivals supervised WSD systems.</Paragraph> </Section> class="xml-element"></Paper>