<?xml version="1.0" standalone="yes"?>
<Paper uid="P04-1075">
  <Title>Multi-Criteria-based Active Learning for Named Entity Recognition</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> In machine learning approaches to natural language processing (NLP), models are generally trained on a large annotated corpus. However, annotating such a corpus is expensive and time-consuming, which makes it difficult to adapt an existing model to a new domain. To overcome this difficulty, active learning (sample selection) has been studied in more and more NLP applications, such as POS tagging (Engelson and Dagan 1999), information extraction (Thompson et al. 1999), text classification (Lewis and Catlett 1994; McCallum and Nigam 1998; Schohn and Cohn 2000; Tong and Koller 2000; Brinker 2003), statistical parsing (Thompson et al. 1999; Tang et al. 2002; Steedman et al. 2003), noun phrase chunking (Ngai and Yarowsky 2000), etc.</Paragraph>
    <Paragraph position="1"> Active learning is based on the assumption that a small number of annotated examples and a large number of unannotated examples are available.</Paragraph>
    <Paragraph position="2"> This assumption is valid in most NLP tasks. Unlike supervised learning, in which the entire corpus is labeled manually, active learning selects the most useful examples for labeling and adds the labeled examples to the training set to retrain the model.</Paragraph>
    <Paragraph position="3"> This procedure is repeated until the model achieves a certain level of performance. In practice, a batch of examples is selected at a time, which is called batch-based sample selection (Lewis and Catlett 1994), since it is too time-consuming to retrain the model each time a single new example is added to the training set.</Paragraph>
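The batch-based selection loop described above can be sketched as follows. This is a minimal illustration with placeholder `train`, `uncertainty`, and `oracle` functions, not the paper's implementation; the toy "model" is simply the mean of the labeled values.

```python
def batch_active_learning(labeled, unlabeled, train, uncertainty, oracle,
                          batch_size=2, rounds=2):
    """Batch-based sample selection: pick the batch_size most uncertain
    unlabeled examples, have the oracle label them, and retrain once per
    batch (retraining after every single example would be too costly)."""
    model = train(labeled)
    for _ in range(rounds):
        if not unlabeled:
            break
        # rank remaining unlabeled examples, most uncertain first
        ranked = sorted(unlabeled, key=lambda x: uncertainty(model, x), reverse=True)
        for x in ranked[:batch_size]:
            labeled.append((x, oracle(x)))
            unlabeled.remove(x)
        model = train(labeled)  # retrain once per batch, not per example
    return model, labeled

# Toy instantiation: the "model" is the mean of the labeled values, and
# uncertainty is distance from that mean.
train = lambda data: sum(y for _, y in data) / len(data)
uncertainty = lambda model, x: abs(x - model)
oracle = float  # simulates the human annotator

model, labeled = batch_active_learning([(5.0, 5.0)], [0.0, 1.0, 9.0, 10.0],
                                       train, uncertainty, oracle)
```

In a real NER setting the model would be a sequence labeler and the uncertainty score would come from the classifier's confidence on each sentence.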
    <Paragraph position="4"> Much existing work in the area focuses on two approaches, certainty-based methods (Thompson et al. 1999; Tang et al. 2002; Schohn and Cohn 2000; Tong and Koller 2000; Brinker 2003) and committee-based methods (McCallum and Nigam 1998; Engelson and Dagan 1999; Ngai and Yarowsky 2000), to select the most informative examples, i.e., those about which the current model is most uncertain.</Paragraph>
    <Paragraph position="5"> As the first piece of work on active learning for the named entity recognition (NER) task, we aim to minimize human annotation effort while still reaching the same level of performance as a supervised learning approach. For this purpose, we consider the contribution of individual examples more comprehensively and, more importantly, maximize the contribution of a batch based on three criteria: informativeness, representativeness and diversity.</Paragraph>
    <Paragraph position="6"> First, we propose three scoring functions to quantify the informativeness of an example, which can be used to select the most uncertain examples.</Paragraph>
    <Paragraph position="7"> Second, a representativeness measure is further proposed to choose the examples that represent the majority. Third, we propose two diversity considerations (global and local) to avoid repetition among the examples of a batch. Finally, two strategies that combine the above three criteria are proposed to reach the maximum effectiveness of active learning for NER.</Paragraph>
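One way such a combination can work is sketched below: candidates are ranked by a weighted sum of informativeness and representativeness, and a similarity check against examples already in the batch enforces local diversity. The weighting scheme, threshold, and scoring functions here are illustrative assumptions, not the paper's exact formulation.

```python
def select_batch(candidates, info, rep, sim, batch_size, alpha=0.6, max_sim=0.8):
    """Greedy batch selection combining three criteria: rank by a weighted
    sum of informativeness (info) and representativeness (rep), then skip
    any candidate too similar to an example already in the batch (local
    diversity). alpha and max_sim are illustrative knobs."""
    ranked = sorted(candidates,
                    key=lambda x: alpha * info(x) + (1 - alpha) * rep(x),
                    reverse=True)
    batch = []
    for x in ranked:
        if all(sim(x, b) < max_sim for b in batch):
            batch.append(x)
        if len(batch) == batch_size:
            break
    return batch

# Toy usage on plain numbers: with alpha=1.0 the ranking is by info alone;
# 8 is skipped because it is too similar to the already-selected 9.
batch = select_batch([1, 2, 8, 9],
                     info=lambda x: x / 10,
                     rep=lambda x: 0.5,
                     sim=lambda a, b: 1 - abs(a - b) / 10,
                     batch_size=2, alpha=1.0)
```

In the NER setting, `sim` would be a similarity between sentences (e.g., over their feature vectors) rather than between numbers.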
    <Paragraph position="8"> We build our NER model using Support Vector Machines (SVM). The experiments show that our active learning methods achieve promising results on this NER task. The results on both MUC-6 and GENIA show that the amount of labeled training data can be reduced by at least 80% without degrading the quality of the named entity recognizer. The contributions come not only from the above measures but also from the two sample selection strategies, which effectively incorporate the informativeness, representativeness and diversity criteria. To our knowledge, this is the first work to consider all three criteria together for active learning. Furthermore, these measures and strategies can be easily adapted to other active learning tasks as well.</Paragraph>
  </Section>
</Paper>