File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/00/w00-0704_concl.xml
Size: 2,191 bytes
Last Modified: 2025-10-06 13:52:50
<?xml version="1.0" standalone="yes"?> <Paper uid="W00-0704"> <Title>The Role of Algorithm Bias vs Information Source in Learning Algorithms for Morphosyntactic Disambiguation</Title> <Section position="7" start_page="23" end_page="23" type="concl"> <SectionTitle> 7 Concluding Remarks </SectionTitle> <Paragraph position="0"> A systematic comparison between two state-of-the-art tagging systems (maximum entropy and memory-based learning) was presented. By carefully controlling the information sources available to the learning algorithms when used as tagger generators, we were able to show that, although there certainly are differences between the inherent biases of the algorithms, these differences account for less variability in tagging accuracy than suggested in previous comparisons (e.g. van Halteren et al. (1998)).</Paragraph> <Paragraph position="1"> Even though the overall tagging accuracy of the two learning algorithms turns out to be very similar, differences can be observed both in accuracy on known and unknown words taken separately and in the (erroneous) tagging behaviour the two learning algorithms exhibit.</Paragraph> <Paragraph position="2"> Furthermore, there is evidence that, given the same information source, different learning algorithms, and also different instantiations of the same learning algorithm, yield small but significant differences in tagging accuracy. This may be in line with theoretical work by Roth (1998) and Roth (1999), in which both maximum entropy modeling and memory-based learning (among other learning algorithms) are shown to search for a decision surface which is a linear function in the feature space. 
The results put forward in this paper support the claim that, although the linear separator found can differ between learning algorithms, the feature space used is more important.</Paragraph> <Paragraph position="3"> We also showed that the optimal choice of information sources, algorithmic parameters, and even algorithm variants depends on a complex interaction of learning algorithm, task, and data set, and should accordingly be decided upon by cross-validation.</Paragraph> </Section></Paper>