<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-0110">
  <Title>Hybrid Models for Chinese Named Entity Recognition</Title>
  <Section position="4" start_page="0" end_page="111" type="intro">
    <SectionTitle>
2 Recognition of Chinese Named Entities Using SVM
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="72" type="sub_section">
      <Paragraph position="0"> Firstly, we segment and assign part-of-speech (POS) tags to words in the texts using a Chinese lexical analyzer. Secondly, we break segmented words into characters and assign each character its features. Lastly, a model based on SVM to identify Chinese named entities is set up by choosing a proper kernel function.</Paragraph>
      <Paragraph position="1"> In the following, we will exemplify the person names and location names to illustrate the identification process.</Paragraph>
    </Section>
    <Section position="2" start_page="72" end_page="111" type="sub_section">
      <SectionTitle>
2.1 Support Vector Machines
</SectionTitle>
      <Paragraph position="0"> Support Vector Machines first introduced by Vapnik (1996) are learning systems that use a hypothesis space of linear functions in a high dimensional feature space, trained with a learning algorithm from optimization theory that implements a learning bias derived from statistical theory. SVMs are based on the principle of structural risk minimization. Viewing the data as points in a high-dimensional feature space, the goal is to fit a hyperplane between the positive and negative examples so as to maximize the distance between the data points and the hyperplane. null Given training examples:</Paragraph>
      <Paragraph position="2"> x is a feature vector (n dimension) of the i-th sample.</Paragraph>
      <Paragraph position="3"> is the class (positive(+1) or negative(-1) class) label of the i-th sample. l is the number of the given training samples. SVMs find an &amp;quot;optimal&amp;quot; hyperplane: to separate the training data into two classes. The optimal hyperplane can be found by solving the following quadratic programming problem (we leave the details to Vapnik (1998)):</Paragraph>
      <Paragraph position="5"> (2) The function is called kernel function, is the mapping from primary input space to feature space. Given a test example, its label y is decided by the following function:</Paragraph>
      <Paragraph position="7"> Basically, SVMs are binary classifiers, and can be extended to multi-class classifiers in order to solve multi-class discrimination problems.</Paragraph>
      <Paragraph position="8"> There are two popular methods to extend a binary classification task to that of K classes: one class vs. all others and pairwise. Here, we employ the simple pairwise method. This idea is to build classifiers considering all pairs of classes, and final decision is given by their voting.</Paragraph>
      <Paragraph position="9"> 2/)1( [?]x KK</Paragraph>
    </Section>
    <Section position="3" start_page="111" end_page="111" type="sub_section">
      <SectionTitle>
2.2 Recognition of Chinese Person Names Based on SVM
</SectionTitle>
      <Paragraph position="0"> Based on SVM We use a SVM-based chunker, YamCha (Kudo and Masumoto, 2001), to extract Chinese person names from the Chinese lexical analyzer.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>