<?xml version="1.0" standalone="yes"?> <Paper uid="W98-1210"> <Title>Finding Structure via Compression</Title> <Section position="2" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> A statistical language model may be used to segment a data sequence by thresholding its instantaneous entropy. In this paper we describe how this process works, and we apply it to the problem of discovering separator symbols in a text. Our results show that language models which bootstrap themselves with structure found in this way undergo a reduction in perplexity. We conclude that these techniques may be useful in the design of generic grammatical inference systems.</Paragraph> </Section> </Paper>