File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/00/p00-1073_abstr.xml

Size: 767 bytes

Last Modified: 2025-10-06 13:41:44

<?xml version="1.0" standalone="yes"?>
<Paper uid="P00-1073">
  <Title>Distribution-Based Pruning of Backoff Language Models</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> We propose a distribution-based pruning method for n-gram backoff language models. Instead of the conventional approach of pruning n-grams that are infrequent in the training data, we prune n-grams that are likely to be infrequent in a new document. Our method is based on the n-gram distribution, i.e., the probability that an n-gram occurs in a new document. Experimental results show that our method achieves a 7-9% reduction in word perplexity relative to conventional cutoff methods.</Paragraph>
  </Section>
</Paper>