<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-0202">
  <Title>Experience in WordNet Sense Tagging in the Wall Street Journal</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> This paper reports on our experience hand-tagging the senses of 25 of the most frequent verbs in 12,925 sentences of the Wall Street Journal Treebank corpus (Marcus et al. 1993). The purpose of this work is to support related work in automatic word-sense disambiguation.</Paragraph>
    <Paragraph position="1"> The verbs are tagged with respect to senses in WordNet (Miller 1990), which has become widely used, for example in corpus-annotation projects (Miller et al. 1994, Ng &amp; Hian 1996, and Grishman et al. 1994) and for performing disambiguation (Resnik 1995 and Leacock et al. 1993).</Paragraph>
    <Paragraph position="2"> The verbs to tag were chosen on the basis of how frequently they occur in the text, how wide their range of senses, and how distinguishable the senses are from one another.</Paragraph>
    <Paragraph position="3"> In related work, we have begun to tag nouns and adjectives as well. These are being chosen additionally on the basis of co-occurrence with the verbs already tagged, to support approaches such as that of Hirst (1987), in which word-sense ambiguities are resolved with respect to one another.</Paragraph>
    <Paragraph position="4"> Some of the chosen verbs can function as both main and auxiliary verbs, and some are often used in idioms. In this paper, we suggest consistently representing these as separate subclasses.</Paragraph>
    <Paragraph position="5"> We apply a preprocessor to the data, which automatically identifies some classes of verb occurrence with good accuracy. This facilitates manual annotation, because it is easier to fix a moderate number of errors than to tag the verbs completely from scratch. The preprocessor also performs other miscellaneous tasks to aid in tagging, such as separating out punctuation marks and contractions.</Paragraph>
    <Paragraph position="6"> At the end of the paper, we share some strategies from our coding instructions for recognizing idioms, and show some challenging ambiguities we found in the data.</Paragraph>
  </Section>
</Paper>