<?xml version="1.0" standalone="yes"?>
<Paper uid="P05-1044">
  <Title>Contrastive Estimation: Training Log-Linear Models on Unlabeled Data</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> Conditional random fields (Lafferty et al., 2001) are quite effective at sequence labeling tasks like shallow parsing (Sha and Pereira, 2003) and named-entity extraction (McCallum and Li, 2003). CRFs are log-linear, allowing the incorporation of arbitrary features into the model. To train on unlabeled data, we require unsupervised estimation methods for log-linear models; few exist. We describe a novel approach, contrastive estimation. We show that the new technique can be intuitively understood as exploiting implicit negative evidence and is computationally efficient. Applied to a sequence labeling problem (POS tagging given a tagging dictionary and unlabeled text), contrastive estimation outperforms EM (with the same feature set), is more robust to degradations of the dictionary, and can largely recover by modeling additional features.</Paragraph>
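    To make the objective concrete, below is a minimal brute-force sketch of contrastive estimation in Python. Everything in it is illustrative: the toy tag set, the hashed feature map, and the three-word corpus are hypothetical, and the adjacent-word transposition neighborhood is one of the neighborhoods the paper considers. Each observed sentence is made to compete only against its neighbors (the implicit negative evidence), with the hidden tag sequences summed out of both numerator and denominator; the paper computes these sums efficiently with lattice dynamic programming rather than by enumeration as here.

```python
import itertools

import numpy as np
from scipy.special import logsumexp

TAGS = ("N", "V")  # hypothetical toy tag set


def feats(x, y):
    # Hypothetical feature map: (word, tag) indicators hashed into a
    # small dense vector; a stand-in for the model's arbitrary features.
    v = np.zeros(8)
    for w, t in zip(x, y):
        v[hash((w, t)) % 8] += 1.0
    return v


def log_score(theta, x, y):
    # Unnormalized log-linear score: theta . f(x, y).
    return theta @ feats(x, y)


def transposition_neighborhood(x):
    # The sentence itself plus every sentence obtained by transposing
    # one pair of adjacent words (one of the paper's neighborhoods).
    out = [x]
    for i in range(len(x) - 1):
        xp = list(x)
        xp[i], xp[i + 1] = xp[i + 1], xp[i]
        out.append(tuple(xp))
    return out


def ce_objective(theta, corpus):
    # Contrastive log-likelihood: sum over sentences x of
    # log [ sum_y exp(score(x, y)) / sum_{x' in N(x)} sum_y exp(score(x', y)) ].
    total = 0.0
    for x in corpus:
        tag_seqs = list(itertools.product(TAGS, repeat=len(x)))
        num = logsumexp([log_score(theta, x, y) for y in tag_seqs])
        den = logsumexp([log_score(theta, xp, y)
                         for xp in transposition_neighborhood(x)
                         for y in tag_seqs])
        total += num - den
    return total


corpus = [("the", "dog", "barks")]
theta = np.zeros(8)
print(ce_objective(theta, corpus))  # zero weights are uniform: log(1/3) here
```

    Because the objective is a smooth function of theta, it can be maximized with any gradient-based optimizer; in this sketch a numerical gradient would suffice, whereas the paper differentiates the dynamic-programming formulation directly.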
  </Section>
</Paper>