<?xml version="1.0" standalone="yes"?>
<Paper uid="W98-1210">
  <Title>Finding Structure via Compression</Title>
  <Section position="2" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> A statistical language model may be used to segment a data sequence by thresholding its instantaneous entropy. In this paper we describe how this process works, and we apply it to the problem of discovering separator symbols in a text. Our results show that language models which bootstrap themselves with structure found in this way undergo a reduction in perplexity. We conclude that these techniques may be useful in the design of generic grammatical inference systems.</Paragraph>
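    <Paragraph position="1"> The segmentation step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "instantaneous entropy" means the Shannon entropy of the model's predictive distribution at each position, and it substitutes a simple maximum-likelihood character bigram model for the statistical language model. The function names (train_bigram, predictive_entropy, separator_candidates, segment), the toy text, and the threshold value are all hypothetical choices made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_bigram(text):
    """Count next-symbol frequencies for each one-symbol context."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts


def predictive_entropy(counts, context):
    """Entropy in bits of the next-symbol distribution that follows `context`."""
    following = counts.get(context)
    if not following:
        return 0.0  # unseen context: no estimate, treated as zero (an assumption)
    total = sum(following.values())
    return -sum((c / total) * math.log2(c / total) for c in following.values())


def separator_candidates(counts, threshold):
    """Symbols after which the model is uncertain enough to exceed the threshold."""
    return {s for s in counts if predictive_entropy(counts, s) > threshold}


def segment(text, counts, threshold):
    """Cut the sequence after every symbol whose predictive entropy exceeds the threshold."""
    pieces, start = [], 0
    for i, symbol in enumerate(text):
        if predictive_entropy(counts, symbol) > threshold:
            pieces.append(text[start:i + 1])
            start = i + 1
    if start != len(text):
        pieces.append(text[start:])  # keep any trailing material
    return pieces


if __name__ == "__main__":
    toy_text = "ab cb db ab cb db "  # hypothetical toy data
    model = train_bigram(toy_text)
    print(separator_candidates(model, 0.5))
    print(segment(toy_text, model, 0.5))
```

 In the toy text every symbol except the space is followed by exactly one possible successor, so only the space produces a predictive entropy above the threshold and is flagged as a separator candidate. A real system would use a stronger model and would need to choose the threshold empirically.</Paragraph>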
  </Section>
</Paper>