<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-0846">
  <Title>Context Clustering for Word Sense Disambiguation Based on Modeling Pairwise Context Similarities</Title>
  <Section position="6" start_page="2" end_page="2" type="evalu">
    <SectionTitle>
5 Benchmarking and Conclusion
</SectionTitle>
    <Paragraph position="0"> To enter the Senseval-3 evaluation, we implemented the following procedure to map the context clusters to Senseval-3 standards: i) process the Senseval-3 training corpus and testing corpus using our parser; ii) for each word to be benchmarked, retrieve the related contexts from the corpora and cluster them; iii) based on 10% of the sense tags in the Senseval-3 training corpus (10% of the data corresponds roughly to an average of 2-3 instances per sense), map each context cluster onto the most frequent WSD sense among its tagged members. By design, the context clusters correspond to distinct senses; therefore, we do not allow multiple context clusters to be mapped onto one sense. If multiple clusters correspond to one sense, only the largest cluster is retained; iv) tag each instance in the testing corpus with the sense to which its context cluster corresponds.</Paragraph>
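The mapping in steps iii) and iv) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the dictionary-based inputs, and the choice to leave instances of discarded clusters untagged (None) are all assumptions made for illustration, since the text does not say how such instances are handled or how ties are broken.

```python
from collections import Counter, defaultdict


def map_clusters_to_senses(cluster_of, tagged_sense):
    """Map each context cluster to at most one sense (hypothetical helper).

    cluster_of:   instance id -> cluster id, for all clustered contexts
    tagged_sense: instance id -> sense label, for the small tagged subset
                  (about 10% of the training sense tags in the paper)
    """
    # Count the sense labels of the tagged members of each cluster.
    votes = defaultdict(Counter)
    for instance, sense in tagged_sense.items():
        votes[cluster_of[instance]][sense] += 1

    # Cluster size = number of contexts assigned to the cluster.
    sizes = Counter(cluster_of.values())

    # Each cluster proposes its most frequent sense. If several clusters
    # propose the same sense, only the largest cluster is retained.
    # Assumption: ties are resolved in favour of the first candidate seen.
    winner = {}  # sense -> cluster id
    for cluster, counter in votes.items():
        sense = counter.most_common(1)[0][0]
        if sense not in winner or sizes[cluster] > sizes[winner[sense]]:
            winner[sense] = cluster

    return {cluster: sense for sense, cluster in winner.items()}


def tag_instances(test_ids, cluster_of, cluster_sense):
    """Tag each test instance with the sense of its cluster.

    Assumption: instances in unmapped clusters receive None.
    """
    return {i: cluster_sense.get(cluster_of[i]) for i in test_ids}
```

Keeping one cluster per sense mirrors the stated design goal that clusters correspond to distinct senses. The cost in this sketch is that contexts in a discarded cluster end up without a tag.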
    <Paragraph position="1"> We are unable to compare our performance with that of other systems in Senseval-3 because, at the time of writing, the Senseval-3 evaluation results are not publicly available. For reference, compared with the Senseval-2 English Lexical Sample evaluation, the benchmarks of our new algorithm (Table 1) are significantly above the performance of the WSD systems in the unsupervised category and rival the performance of the supervised WSD systems.</Paragraph>
  </Section>
</Paper>