<?xml version="1.0" standalone="yes"?>
<Paper uid="P05-3015">
  <Title>Syntax-based Semi-Supervised Named Entity Tagging</Title>
  <Section position="4" start_page="0" end_page="57" type="intro">
    <SectionTitle>
2 Previous Works and Our Approach
</SectionTitle>
    <Paragraph position="0"> Supervised NE tagging has been studied extensively over the past decade (Bikel et al. 1999, Baluja et al. 1999, Tjong Kim Sang and De Meulder 2003). Recently, there has been increasing interest in semi-supervised learning approaches.</Paragraph>
    <Paragraph position="1"> Most relevant to our study, Collins and Singer (1999) showed that an NE classifier can be developed by bootstrapping from a small number of labeled examples. To extract potentially useful training examples, they first parsed the sentences and looked for expressions that satisfied two constituency patterns (appositives and prepositional phrases). A small subset of these expressions was then manually labeled with the correct NE tags.</Paragraph>
    <Paragraph position="2"> The training examples were a combination of the labeled and unlabeled data. In their studies, Collins and Singer compared several learning models using this style of semi-supervised training. Their results were encouraging, and their studies raised additional questions. First, are there other appropriate syntactic extraction patterns in addition to appositives and prepositional phrases? Second, because the test data were extracted in the same manner as the training data in their experiments, the characteristics of the test cases were biased. In this paper, we examine how well a semi-supervised system can classify arbitrary named entities. In our empirical study, in addition to the constituency features proposed by Collins and Singer, we introduce a new set of dependency parse features to recognize and classify NEs. We evaluated the effects of these two sets of syntactic features on classification accuracy, both separately and in combination (the union of the two sets).</Paragraph>
    <Paragraph position="3"> Figure 1 gives a general overview of our system's architecture, which comprises two levels: an NE Recognizer and an NE Classifier.</Paragraph>
    <Paragraph position="4"> Sections 3 and 4 describe these two levels in detail, and Section 5 covers the results of our evaluation.</Paragraph>
  </Section>
</Paper>