<?xml version="1.0" standalone="yes"?> <Paper uid="C04-1112"> <Title>A Lemma-Based Approach to a Maximum Entropy Word Sense Disambiguation System for Dutch</Title> <Section position="8" start_page="0" end_page="0" type="concl"> <SectionTitle> 7 Conclusion and Future Work </SectionTitle> <Paragraph position="0"> In this paper, we have introduced a lemma-based approach for a statistical WSD system using maximum entropy and a number of linguistic sources of information. This novel approach exploits the more concise and more generalizable information contained in lemmas as its key feature: classifiers for individual ambiguous words are built on the basis of their lemmas, instead of wordforms as has traditionally been done. As a result, more training material is available to each classifier, and the resulting WSD system is smaller and more robust.</Paragraph> <Paragraph position="1"> The lemma-based approach has been tested on the Dutch SENSEVAL-2 data set and resulted in a significant improvement in accuracy over the system using the traditional wordform-based approach. Compared to earlier results with a Memory-Based WSD system, the lemma-based approach achieves the same performance with less work, since no parameter optimization is required.</Paragraph> <Paragraph position="2"> A possible extension of the present approach is to include more specialized feature selection and to optimize the settings for each ambiguous wordform individually, instead of adopting the same strategy for all words in the corpus. Furthermore, we would like to test the lemma-based approach in a multi-classifier voting scheme.</Paragraph> </Section> </Paper>