<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-0210">
  <Title>Investigating Complementary Methods for Verb Sense Pruning</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Much work in natural language processing is predicated on the notion that linguistic usage varies sufficiently across different situations of language use that systems can be tailored to a particular sub-language variety (Kittredge and Lehrberger, 1982).</Paragraph>
    <Paragraph position="1"> Biber (1993) presents evidence that a corpus restricted to one or two language registers would exclude &quot;much of the English language&quot; by narrowing the lexicon, verb tense and aspect, and syntactic complexity. Such observations inform the increasing trend towards analysis of homogeneous corpora to identify linguistic constraints for use in systems intended to understand or generate coherent discourse. Recent work in this vein includes identification of lexical constraints from textual tutorial dialogue (Moser and Moore, 1995), constraints on illocutionary act type from spoken task-oriented dialogue (Allen et al., 1995), prosodic constraints from spoken information-seeking monologues (Hirschberg and Nakatani, 1996), and constraints on referring expressions from spoken narrative monologue (Passonneau, 1996). Related work suggests that constraints of different types are interdependent (Biber, 1993; Passonneau and Litman, forthcoming), hence should be investigated together. Our ultimate goal is to develop methods to tag lexical semantic features in discourse corpora in order to enhance extraction of constraints of the sort just listed. Two types of investigations that would undoubtedly be enhanced are explorations of the interrelation of lexical cohesion and global discourse structure (Morris and Hirst, 1991; Hearst, 1994), and identification of lexicalization patterns for domain-specific concepts (Robin, 1994).</Paragraph>
    <Paragraph position="2"> In this paper, we propose a two-pronged approach to an initial step in lexical semantic tagging: pruning the search space for polysemous verbs. Rather than attempting to identify unique word senses, we aim for the more realistic goal of pruning sense information. We will then incrementally evaluate the utility of tagging corpora with pruned sense sets for different types of discourse. We begin with verbs on the hypothesis that verb sense distinctions correlate with syntactic properties of verbs (Levin, 1993). Our initial results indicate that domain-independent syntactic information reduces potential verb senses for multiply polysemous verbs (five or more WordNet senses) by more than 50%. In Section 2, we outline our first method, based on domain-independent lexical knowledge, presenting results from an analysis of thousands of verbs. In the subsequent section, we present our complementary method, a technique utilizing verb clusters automatically computed from corpus data. In the conclusion, we discuss how the combination of the two methods increases the performance of our system and enhances the robustness of the final results.</Paragraph>
  </Section>
</Paper>