File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/04/c04-1127_intro.xml

Size: 2,570 bytes

Last Modified: 2025-10-06 14:02:12

<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1127">
  <Title>Cross-lingual Information Extraction System Evaluation</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Research in information extraction (IE) and its related fields has led to a wide range of applications in many domains. Porting IE systems across domains, however, remains a serious challenge. This problem is being addressed through automatic knowledge acquisition methods, such as unsupervised learning of domain-specific lexicons (Lin et al., 2003) and extraction patterns (Yangarber, 2003), which require the user to provide only a small set of lexical items of the target classes or extraction patterns for the target domain.</Paragraph>
    <Paragraph position="1"> The idea of a self-customizing IE system emerged recently with the improvement of pattern acquisition techniques (Sudo et al., 2003b): the IE system customizes itself to the domain specified by the user's query.</Paragraph>
    <Paragraph position="2"> Furthermore, there is demand for access to information in languages other than the user's own. However, it is more challenging to provide an IE system in which the target language (here, English) differs from the source language (here, Japanese), that is, a cross-lingual information extraction (CLIE) system.</Paragraph>
    <Paragraph position="3"> In this research, we explore various methods for efficient automatic pattern acquisition for the CLIE system, including the translation of the entire source document set into the target language.</Paragraph>
    <Paragraph position="4"> To achieve efficiency, the resulting CLIE system should (1) provide a reasonable level of extraction performance (both accuracy and coverage) and (2) require little or no knowledge of the source language on the user's part. Today, basic linguistic tools are available for many major languages. We show how we can take advantage of the tools available for the source language to boost extraction performance. The rest of this paper is organized as follows.</Paragraph>
    <Paragraph position="5"> Sections 2 and 3 discuss the self-adaptive CLIE system we assess throughout the paper. In Section 4, we present the experimental results for entity detection. Section 5 discusses the translation problems that affect pattern acquisition, and Section 6 discusses related work. Finally, Section 7 concludes the paper and outlines future work.</Paragraph>
  </Section>
</Paper>