<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-1061">
  <Title>Sydney, July 2006. ©2006 Association for Computational Linguistics. Segment-based Hidden Markov Models for Information Extraction</Title>
  <Section position="2" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> Hidden Markov models (HMMs) are powerful statistical models that have found successful applications in Information Extraction (IE). In current approaches to applying HMMs to IE, an HMM is used to model text at the document level. This modelling might cause undesired redundancy in extraction in the sense that more than one filler is identified and extracted.</Paragraph>
    <Paragraph position="1"> We propose to use HMMs to model text at the segment level, in which the extraction process consists of two steps: a segment retrieval step followed by an extraction step. In order to retrieve extraction-relevant segments from documents, we introduce a method to use HMMs to model and retrieve segments. Our experimental results show that the resulting segment HMM IE system not only achieves near-zero extraction redundancy, but also has better overall extraction performance than traditional document HMM IE systems.</Paragraph>
  </Section>
</Paper>