<?xml version="1.0" standalone="yes"?>
<Paper uid="W95-0112">
  <Title>Automatically Acquiring Conceptual Patterns Without an Annotated Corpus</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> Previous work on automated dictionary construction for information extraction has relied on annotated text corpora. However, annotating a corpus is time-consuming and difficult.</Paragraph>
    <Paragraph position="1"> We propose that conceptual patterns for information extraction can be acquired automatically using only a preclassified training corpus and no text annotations. We describe a system called AutoSlog-TS, which is a variation of our previous AutoSlog system, that runs exhaustively on an untagged text corpus. Text classification experiments in the MUC-4 terrorism domain show that the AutoSlog-TS dictionary performs comparably to a hand-crafted dictionary, and actually achieves higher precision on one test set. For text classification, AutoSlog-TS requires no manual effort beyond the preclassified training corpus. Additional experiments suggest how a dictionary produced by AutoSlog-TS can be filtered automatically for information extraction tasks. Some manual intervention is still required in this case, but AutoSlog-TS significantly reduces the amount of effort required to create an appropriate training corpus.</Paragraph>
  </Section>
</Paper>