<?xml version="1.0" standalone="yes"?>
<Paper uid="W00-1210">
  <Title>A trainable method for extracting Chinese entity names and their relations</Title>
  <Section position="3" start_page="70" end_page="70" type="metho">
    <SectionTitle>
3. System Evaluation
</SectionTitle>
    <Paragraph position="0"> To test our method, we prepared a manually annotated corpus consisting of about 200 business news articles. All the entity names (about 500 person names and 300 organization names), noun phrases, and relations (i.e. employee-of, product-of, location-of) in the corpus were manually annotated. Ten pairs of training and testing sets were randomly selected from the corpus, with each set equal to half the size of the entire corpus. We ran our learning and extraction processes on all the data sets and calculated the mean recall and precision rates. The results are shown in Table 1.</Paragraph>
    <Paragraph position="1">  As can be seen, our performance in person name and organization name extraction is comparable to other systems [2,3], considering the relatively small size of the training corpus. Based on our survey, our work on extracting entity relations is unprecedented for Chinese, so we are unable to establish a benchmark. Nevertheless, the extraction of the employee-of relation looks quite good. Detailed analysis reveals that our method handles well some instances where co-reference resolution is needed, because we introduced cross-sentence features. The method did poorly on product-of relation extraction due to errors in noun phrase chunking. With a better NP chunking module, the performance can be improved.</Paragraph>
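    <Paragraph position="2"> The evaluation protocol described in this section (ten random half-size training and testing sets, with recall and precision averaged over the runs) can be illustrated with a minimal Python sketch. It is not the system's actual code. The names precision_recall, repeated_half_splits, train_and_extract, and gold_of are hypothetical, and the sketch assumes that each training set and its testing set are complementary halves of the corpus, which the text does not state explicitly.

```python
import random
from statistics import mean


def precision_recall(predicted, gold):
    """Return (precision, recall) for two sets of extracted items."""
    correct = len(predicted.intersection(gold))
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


def repeated_half_splits(documents, train_and_extract, gold_of, runs=10, seed=0):
    """Average precision and recall over random half/half splits.

    train_and_extract(train_docs, test_docs) returns the set of predicted items.
    gold_of(test_docs) returns the set of manually annotated items.
    Both callables are hypothetical stand-ins for the system under test.
    """
    rng = random.Random(seed)
    half = len(documents) // 2
    precisions, recalls = [], []
    for _ in range(runs):
        # Assumption: training and testing halves are disjoint.
        shuffled = list(documents)
        rng.shuffle(shuffled)
        train, test = shuffled[:half], shuffled[half:]
        p, r = precision_recall(train_and_extract(train, test), gold_of(test))
        precisions.append(p)
        recalls.append(r)
    return mean(precisions), mean(recalls)
```

    Fixing the seed makes the ten splits reproducible; reporting the spread across runs alongside the mean would show how sensitive the scores are to the choice of split.</Paragraph>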
  </Section>
</Paper>