<?xml version="1.0" standalone="yes"?> <Paper uid="W04-0108"> <Title>A Comparison of Two Different Approaches to Morphological Analysis of Dutch</Title> <Section position="6" start_page="0" end_page="0" type="concl"> <SectionTitle> 6 Conclusion </SectionTitle> <Paragraph position="0"> Current work in the project focuses on further developing the morphological analyzer by adding part-of-speech tags and hierarchical bracketing to the segmented morpheme sequences, in order to match the type of analysis found in the morphological database of CELEX. We will also try other machine learning algorithms, such as maximum entropy and support vector machines, to see whether it is possible to overcome the current accuracy threshold. Algorithmic parameter 'degradation' will be attempted to encourage greedier morpheme boundary placement in the raw output, in the hope that the post-processing mechanism will be able to properly rank the extra alternative segmentations.</Paragraph> <Paragraph position="1"> Finally, we will experiment on the full CELEX data set (including inflection) as featured in Van den Bosch and Daelemans (1999).</Paragraph> <Paragraph position="2"> In this paper we described two data-driven systems for morphological analysis. Trained and tested on the same data set, these systems achieve similar accuracy but exhibit quite different processing properties. Even though they were originally designed to function as language models in a modular architecture for speech recognition, they constitute accurate and elegant morphological analyzers in their own right and can be incorporated in other natural language applications as well.</Paragraph> </Section> </Paper>