<?xml version="1.0" standalone="yes"?>
<Paper uid="P92-1028">
  <Title>CORPUS-BASED ACQUISITION OF RELATIVE PRONOUN DISAMBIGUATION HEURISTICS</Title>
  <Section position="11" start_page="220" end_page="222" type="concl">
    <SectionTitle>
5 CONCLUSIONS
</SectionTitle>
    <Paragraph position="0"> We have described an automated approach for the acquisition of relative pronoun disambiguation heuristics that duplicates the performance of hand-coded rules. Unfortunately, extending the technique for use with unrestricted texts may be difficult. The UMass/MUC-3 parser would clearly need additional mechanisms to handle the ensuing part of speech and word sense disambiguation problems. However, recent research in these areas indicates that automated approaches for these tasks may be feasible (see, for example, (Brown, Della Pietra, Della Pietra, &amp; Mercer, 1991) and (Hindle, 1983)). In addition, although our simple semantic feature set seems adequate for the current relative pronoun disambiguation task, it is doubtful that a single semantic feature set can be used across all domains and for all disambiguation tasks.9</Paragraph>
    <Paragraph position="1"> 7 Other parsing errors occurred throughout the training set, but only those instances where the antecedent was not recognized as a constituent (and the wh-word had an antecedent) were discarded. 8 Interestingly, in work on the automated classification of nouns, (Hindle, 1990) also noted problems with &quot;empty&quot; words that depend on their complements for meaning.</Paragraph>
    <Paragraph position="2"> In related work on pronoun disambiguation, Dagan and Itai (1991) successfully use statistical cooccurrence patterns to choose among the syntactically valid pronoun referents posed by the parser. Their approach is similar in that the statistical database depends on parser output.</Paragraph>
    <Paragraph position="3"> However, it differs in a variety of ways. First, human intervention is required not to specify the correct pronoun antecedent, but to check that the complete parse tree supplied by the parser for each training example is correct and to rule out potential examples that are inappropriate for their approach.</Paragraph>
    <Paragraph position="4"> More importantly, their method requires very large corpora of data.</Paragraph>
    <Paragraph position="5"> Our technique, on the other hand, requires few training examples because each training instance is not word-based, but created from higher-level parser output.10 Therefore, unlike other corpus-based techniques, our approach is practical for use with small to medium-sized corpora in relatively narrow domains. (Dagan and Itai (1991) mention the use of semantic feature-based cooccurrences as one way to make use of a smaller corpus.) In addition, because human intervention is required only to specify the antecedent during the training phase, creating disambiguation heuristics for a new domain requires little effort. Any NLP system that uses semantic features for describing nouns and has minimal syntactic parsing capabilities can generate the required training instances. The parser need only recognize noun phrases, verbs, and prepositional phrases because the disambiguation heuristics, not the parser, are responsible for recognizing the conjunctions and appositives that comprise a relative pronoun antecedent. Moreover, the success of the approach for structurally complex antecedents suggests that the technique may provide a general approach for the automated acquisition of disambiguation rules for other problems in natural language processing.</Paragraph>
    <Paragraph position="6"> 9 In recent work on the disambiguation of structurally, but not semantically, restricted phrases, however, a set of 16 predefined semantic categories sufficed (Ravin, 1990).</Paragraph>
    <Paragraph position="7"> 10 Although further work is needed to determine the optimal number of training examples, it is probably the case that many fewer than 170 instances were required even for the experiments described here.</Paragraph>
  </Section>
</Paper>