<?xml version="1.0" standalone="yes"?>
<Paper uid="C00-2101">
  <Title>Learning Semantic-Level Information Extraction Rules by Type-Oriented ILP</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Information extraction (IE) tasks in this paper involve the MUC-3 style IE. The input for the information extraction task is an empty template and a set of natural language texts that describe a restricted target domain, such as corporate mergers or terrorist attacks in South America. Templates have a record-like data structure with slots that have names, e.g., &quot;company name&quot; and &quot;merger date&quot;, and values. The output is a set of filled templates. IE tasks are highly domain-dependent, so rules and dictionaries for filling values in the template slots depend on the domain.</Paragraph>
    <Paragraph position="1"> It is a heavy burden for IE system developers that such systems depend on hand-made rules, which cannot be easily constructed and changed. For example, UMass/MUC-3 needed about 1,500 person-hours of highly skilled labor to build the IE rules and represent them as a dictionary (Lehnert, 1992). All the rules must be reconstructed from scratch when the target domain is changed.</Paragraph>
    <Paragraph position="2"> To cope with this problem, some pioneers have studied methods for learning information extraction rules (Riloff, 1996; Soderland et al., 1995; Kim et al., 1995; Huffman, 1996; Califf and Mooney, 1997). Along these lines, our approach is to apply an inductive logic programming (ILP) (Muggleton, 1991) system to the learning of IE rules, where information is extracted from semantic representations of news articles. The ILP system that we employed is a type-oriented ILP system, RHB+ (Sasaki and Haruno, 1997), which can efficiently and effectively handle type (or sort) information in training data.</Paragraph>
  </Section>
</Paper>