<?xml version="1.0" standalone="yes"?>
<Paper uid="P97-1056">
  <Title>Memory-Based Learning: Using Similarity for Smoothing</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Statistical approaches to disambiguation offer the advantage of making the most likely decision on the basis of available evidence. For this purpose a large number of probabilities have to be estimated from a training corpus. However, many possible conditioning events are not present in the training data, yielding zero Maximum Likelihood (ML) estimates. This motivates the need for smoothing methods, which re-estimate the probabilities of low-count events on the basis of more reliable estimates.</Paragraph>
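To make the zero-estimate problem concrete, here is a minimal sketch of the effect described above, using add-one (Laplace) smoothing as a stand-in for the smoothing methods the paragraph refers to; the vocabulary and counts are invented for illustration:

```python
from collections import Counter

def laplace_probs(counts, vocab, alpha=1.0):
    """Re-estimate probabilities so that unseen events receive non-zero mass.

    The ML estimate count(w)/total is 0 for any w unseen in training;
    adding alpha pseudo-counts redistributes mass to those events.
    """
    total = sum(counts.values())
    denom = total + alpha * len(vocab)
    return {w: (counts.get(w, 0) + alpha) / denom for w in vocab}

# Toy training corpus: "fish" never occurs, so its ML estimate is 0/3.
counts = Counter(["dog", "dog", "cat"])
vocab = ["dog", "cat", "fish"]
probs = laplace_probs(counts, vocab)
# After smoothing, probs["fish"] is non-zero and the distribution still sums to 1.
```

Any smoothing scheme (Good-Turing, back-off, interpolation) performs the same basic move: shaving probability mass off observed events and reallocating it to unobserved ones.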
    <Paragraph position="1"> Inductive generalization from observed to new data lies at the heart of machine-learning approaches to disambiguation. In Memory-Based Learning (MBL), induction is based on the use of similarity (Stanfill &amp; Waltz, 1986; Aha et al., 1991; Cardie, 1994; Daelemans, 1995). In this paper we describe how the use of similarity between patterns embodies a solution to the sparse data problem and how it relates to conventional smoothing methods. We illustrate the analysis by applying MBL to two tasks where the combination of information sources promises to bring improved performance: PP-attachment disambiguation and Part-of-Speech tagging.</Paragraph>
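The similarity-based induction described above can be sketched as nearest-neighbour classification over stored instances with a simple feature-overlap metric (in the spirit of IB1 of Aha et al., 1991); the PP-attachment-style feature tuples and labels below are invented for illustration, not taken from the paper's data:

```python
def overlap(a, b):
    """Similarity = number of feature positions on which two patterns agree."""
    return sum(1 for x, y in zip(a, b) if x == y)

def mbl_classify(memory, query, k=3):
    """Return the majority label among the k stored instances most similar to query.

    memory: list of (feature_tuple, label) pairs kept verbatim in memory.
    A new pattern need not match any stored pattern exactly: partial
    overlap still yields a decision, which is how similarity stands in
    for smoothing over unseen conditioning events.
    """
    ranked = sorted(memory, key=lambda ex: overlap(ex[0], query), reverse=True)
    top = [label for _, label in ranked[:k]]
    return max(set(top), key=top.count)

# Hypothetical PP-attachment instances: (verb, noun1, preposition, noun2) -> V/N.
memory = [
    (("eat", "pizza", "with", "fork"), "V"),
    (("eat", "pizza", "with", "anchovies"), "N"),
    (("see", "man", "with", "telescope"), "V"),
]
decision = mbl_classify(memory, ("eat", "pasta", "with", "spoon"))
```

The unseen query shares no exact match with memory, yet partial overlap with stored verb-attachment cases still produces a classification; this graceful degradation on sparse data is the point the paragraph makes.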
  </Section>
</Paper>