<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-1016">
  <Title>Resolving PP attachment Ambiguities with Memory-Based Learning</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> A central issue in natural language analysis is structural ambiguity resolution. A sentence is structurally ambiguous when it can be assigned more than one syntactic structure. The drosophila of structural ambiguity resolution is Prepositional Phrase (PP) attachment. Several sources of information can be used to resolve PP attachment ambiguity. Psycholinguistic theories have resulted in disambiguation strategies that use syntactic information only, i.e. structural properties of the parse tree are used to choose between different attachment sites. Two principles based on syntactic information are Minimal Attachment (MA) and Late Closure (LC) (Frazier, 1979). MA tries to construct the parse tree with the fewest nodes, whereas LC tries to attach new constituents as low in the parse tree as possible. These strategies always choose the same attachment regardless of the lexical content of the sentence, which results in a wrong attachment in one of the following sentences:

1. She eats pizza with a fork.
2. She eats pizza with anchovies.

In sentence 1, the PP &quot;with a fork&quot; is attached to the verb &quot;eats&quot; (high attachment). Sentence 2 differs only minimally from the first; here, the PP &quot;with anchovies&quot; does not attach to the verb but to the NP &quot;pizza&quot; (low attachment). In languages such as English and Dutch, which have very little overt case marking, syntactic information alone does not suffice to explain the difference in attachment sites between such sentences. The use of syntactic principles makes it necessary to re-analyse the sentence, using semantic or even pragmatic information, to reach the correct decision. In example sentences 1 and 2, the meaning of the head of the object of 'with' determines low or high attachment. Several semantic criteria have been worked out to resolve structural ambiguities. 
However, pinning down the semantic properties of all the words is laborious and expensive, and is only feasible in a very restricted domain. The modeling of pragmatic inference seems to be even more difficult in a computational system.</Paragraph>
    <Paragraph position="1"> Due to the difficulties with the modeling of semantic strategies for ambiguity resolution, an attractive alternative is to look at the statistics of word patterns in annotated corpora. In such a corpus, different kinds of information used to resolve attachment ambiguity are, implicitly, represented in co-occurrence regularities. Several statistical techniques can use this information in learning attachment ambiguity resolution.</Paragraph>
    <Paragraph position="2"> Hindle and Rooth (1993) were the first to show that a corpus-based approach to PP attachment ambiguity resolution can lead to good results. For sentences with a verb-noun attachment ambiguity, they measured the lexical association between the noun and the preposition, and between the verb and the preposition, in unambiguous sentences. Their method bases attachment decisions on the ratio and reliability of these association strengths. Note that Hindle and Rooth did not include information about the second noun and therefore could not distinguish between sentences 1 and 2. Their method is also difficult to extend to more elaborate combinations of information sources.</Paragraph>
    Jakub Zavrel, Walter Daelemans and Jorn Veenstra (1997) Resolving PP attachment Ambiguities with Memory-Based Learning. In T.M. Ellison (ed.) CoNLL97: Computational Natural Language Learning, ACL, 136-144. 1997 Association for Computational Linguistics.
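The lexical-association idea can be sketched as follows. This is a minimal illustration of a Hindle-and-Rooth-style score, not their exact estimator: the co-occurrence counts below are invented for the example, and attachment is decided by the sign of a smoothed log ratio of the preposition's probability given the verb versus given the noun.

```python
from math import log2

# Invented toy counts (NOT from the Hindle & Rooth corpus): how often
# a preposition was seen attached to a given verb or noun in
# unambiguous contexts.
VERB_PREP = {("eats", "with"): 60}
VERB_TOTAL = {"eats": 100}
NOUN_PREP = {("pizza", "with"): 5}
NOUN_TOTAL = {"pizza": 100}

def lexical_association(verb, noun, prep):
    """Log ratio of P(prep | verb) to P(prep | noun), with add-one
    smoothing; positive values favour verb (high) attachment."""
    p_v = (VERB_PREP.get((verb, prep), 0) + 1) / (VERB_TOTAL.get(verb, 0) + 2)
    p_n = (NOUN_PREP.get((noun, prep), 0) + 1) / (NOUN_TOTAL.get(noun, 0) + 2)
    return log2(p_v / p_n)

def attach(verb, noun, prep):
    return "verb" if lexical_association(verb, noun, prep) > 0 else "noun"
```

Note that the score depends only on the triple (verb, noun, preposition), which mirrors the limitation mentioned above: without the second noun, "with a fork" and "with anchovies" receive identical scores.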
    <Paragraph position="3"> More recently, a number of statistical methods better suited to larger numbers of features have been proposed for PP attachment. Brill and Resnik (1994) applied Error-Driven Transformation-Based Learning, Ratnaparkhi, Reynar and Roukos (1994) applied a Maximum Entropy model, Franz (1996) used a Loglinear model, and Collins and Brooks (1995) obtained good results using a Back-Off model.</Paragraph>
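Of these, the back-off model is easy to sketch. The following is a toy version in the spirit of Collins and Brooks (1995), not their implementation: the training tuples are invented, and the model answers a query from the most specific back-off level (quadruple, then triples, pairs, and finally the preposition alone, always retaining the preposition) at which any counts were observed.

```python
from collections import Counter

# Invented toy training data: (verb, noun1, prep, noun2) -> attachment.
TRAIN = [
    (("eats", "pizza", "with", "fork"), "V"),
    (("eats", "pizza", "with", "anchovies"), "N"),
    (("sees", "man", "with", "telescope"), "V"),
]

# Feature-index subsets per back-off level; every subset keeps the
# preposition (index 2), as in the original model.
LEVELS = [
    [(0, 1, 2, 3)],                     # full quadruple
    [(0, 1, 2), (0, 2, 3), (1, 2, 3)],  # triples
    [(0, 2), (1, 2), (2, 3)],           # pairs
    [(2,)],                             # preposition alone
]

def _tally(data, idxs):
    """Counts of verb attachments and totals, keyed by sub-tuple."""
    verb, total = Counter(), Counter()
    for quad, label in data:
        key = tuple(quad[i] for i in idxs)
        total[key] += 1
        if label == "V":
            verb[key] += 1
    return verb, total

def backoff_attach(quad, data=TRAIN):
    """Use the most specific level with any matching counts; predict
    verb attachment when the verb-attachment proportion is >= 0.5."""
    for level in LEVELS:
        v_sum = t_sum = 0
        for idxs in level:
            verb, total = _tally(data, idxs)
            key = tuple(quad[i] for i in idxs)
            v_sum += verb[key]
            t_sum += total[key]
        if t_sum > 0:
            return "V" if v_sum / t_sum >= 0.5 else "N"
    return "N"  # default when even the preposition is unseen
```

For an unseen quadruple such as ("drinks", "pasta", "with", "fork"), the model backs off until the (prep, noun2) pair ("with", "fork") matches a stored example and decides from that.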
    <Paragraph position="4"> In this paper, we examine whether Memory-Based Learning (MBL), a family of statistical methods from the field of Machine Learning, can improve on the performance of previous approaches. Memory-Based Learning is described in Section 2. In order to make a fair comparison, we evaluated our methods on the common benchmark dataset first used in Ratnaparkhi, Reynar, and Roukos (1994). In Section 3, the experiments with our method on this data are described. An important advantage of MBL is its use of similarity-based reasoning, which makes it suited to the use of various unconventional representations of word patterns (Section 2). Section 3 also compares two promising representational forms, Section 4 compares our method to previous work, and we conclude with Section 5.</Paragraph>
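The similarity-based reasoning behind MBL can be illustrated with a toy nearest-neighbour classifier over (verb, noun1, prep, noun2) patterns. This is a sketch of the general technique with a plain overlap metric and invented memory items, not the authors' exact implementation:

```python
# Invented memory of stored (verb, noun1, prep, noun2) patterns with
# their attachment class ("V" = verb/high, "N" = noun/low).
MEMORY = [
    (("eats", "pizza", "with", "fork"), "V"),
    (("eats", "pizza", "with", "anchovies"), "N"),
    (("sees", "man", "with", "telescope"), "V"),
]

def overlap(a, b):
    """Number of feature positions on which two patterns agree."""
    return sum(x == y for x, y in zip(a, b))

def mbl_attach(query, memory=MEMORY):
    """1-nearest-neighbour: return the class of the stored pattern
    most similar to the query under the overlap metric."""
    return max(memory, key=lambda item: overlap(query, item[0]))[1]
```

A query such as ("eats", "pasta", "with", "fork") matches the stored fork example on three of four positions and so inherits its verb attachment; no explicit rule or probability table is ever constructed.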
  </Section>
</Paper>