<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-1029">
  <Title>Classifier Combination for Improved Lexical Disambiguation</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> One of the most exciting recent directions in machine learning is the discovery that the combination of multiple classifiers often results in significantly better performance than what can be achieved with a single classifier. In this paper, we first show that the errors made from three different state of the art part of speech taggers are strongly complementary. Next, we show how this complementary behavior can be used to our advantage. By using contextual cues to guide tagger combination, we are able to derive a new tagger that achieves performance significantly greater than any of the individual taggers.</Paragraph>
    <Paragraph position="1"> Introduction Part of speech tagging has been a central problem in natural language processing for many years. Since the advent of manually tagged corpora such as the Brown Corpus and the Penn Treebank (Francis(1982), Marcus(1993)), the efficacy of machine learning for training a tagger has been demonstrated using a wide array of techniques, including: Markov models, decision trees, connectionist machines, transformations, nearest-neighbor algorithms, and maximum entropy</Paragraph>
    <Paragraph position="3"> )). All of these methods seem to achieve roughly comparable accuracy.</Paragraph>
    <Paragraph position="4"> The fact that most machine-learning-based taggers achieve comparable results could be attributed to a number of causes. It is possible that the 80/20 rule of engineering is applying: a certain number of tagging instances are relatively simple to disambiguate and are therefore being successfully tagged by all approaches, while another percentage is extremely difficult to disambiguate, requiring deep linguistic knowledge, thereby causing all taggers to err. Another possibility could be that all of the different machine learning techniques are essentially doing the same thing. We know that the features used by the different algorithms are very similar, typically the words and tags within a small window from the word being tagged. Therefore it could be possible that they all end up learning the same information, just in different forms.</Paragraph>
    <Paragraph position="5"> In the field of machine learning, there have been many recent results demonstrating the efficacy of combining classifiersJ In this paper we explore whether classifier combination can result in an overall improvement in lexical disambiguation accuracy.</Paragraph>
  </Section>
class="xml-element"></Paper>