<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1112">
  <Title>A Lemma-Based Approach to a Maximum Entropy Word Sense Disambiguation System for Dutch</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> In this paper, we present a corpus-based supervised word sense disambiguation (WSD) system for Dutch that combines statistical classification (maximum entropy) with linguistic information. Instead of building a separate classifier for each ambiguous wordform, we introduce a lemma-based approach. The advantage of this novel method is that it clusters all inflected forms of an ambiguous word in one classifier, thereby increasing the training material available to the algorithm. Testing the lemma-based model on the Dutch SENSEVAL-2 test data, we achieve a significant increase in accuracy over the wordform model. Also, the WSD system based on lemmas is smaller and more robust.</Paragraph>
  </Section>
</Paper>