<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-4004">
  <Title>Valido: a Visual Tool for Validating Sense Annotations</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> The task of sense annotation consists in the assignment of the appropriate senses to words in context.</Paragraph>
    <Paragraph position="1"> For each word, the senses are chosen with respect to a sense inventory encoded by a reference dictionary. The free availability and, as a result, the massive adoption of WordNet (Fellbaum, 1998) largely contributed to its status as the de facto standard in the NLP community. Unfortunately, WordNet is a fine-grained resource, which encodes possibly subtle sense distinctions.</Paragraph>
    <Paragraph position="2"> Several studies report an inter-annotator agreement around 70% when using WordNet as a reference sense inventory. For instance, the agreement in the Open Mind Word Expert project (Chklovski and Mihalcea, 2002) was 67.3%. Such a low agreement is only in part due to the inexperience of sense annotators (e.g. volunteers on the web).</Paragraph>
    <Paragraph position="3"> Rather, it is largely due to the difficulty of making clear the real distinctions between close word senses in the WordNet inventory.</Paragraph>
    <Paragraph position="4"> Adjudicating sense choices, i.e. the task of validating word senses, is therefore critical in building a high-quality data set. The validation task can be defined as follows: let w be a word in a sentence, previously annotated by a set of annotators A = {a_1, a_2, ..., a_n} each providing a sense for w, and let S_A = {s_1, s_2, ..., s_m} ⊆ Senses(w) be the set of senses chosen for w by the annotators in A, where Senses(w) is the set of senses of w in the reference inventory (e.g. WordNet). A validator is asked to validate, that is to adjudicate a sense s ∈ Senses(w) for a word w over the others. Notice that s is a word sense for w in the sense inventory, but is not necessarily in S_A, although it is likely to be. Also note that the annotators in A can be either human or automatic, depending upon the purpose of the exercise.</Paragraph>
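    <Paragraph position="5"> The definition above can be illustrated with a minimal sketch. It is not the tool's implementation: the toy inventory SENSES, the sense labels, and the helper names chosen_senses and validate are all hypothetical, introduced only to show how the set of chosen senses, Senses(w), and the adjudicated sense s relate to one another.

```python
from collections import Counter

# Hypothetical toy inventory standing in for Senses(w); not real WordNet data.
SENSES = {"bank": {"bank#1", "bank#2", "bank#3"}}


def chosen_senses(word, annotations):
    """Return the set of senses chosen for `word` by the annotators in A."""
    s_a = set(annotations.values())
    # Every chosen sense must belong to Senses(w).
    if not s_a.issubset(SENSES[word]):
        raise ValueError("annotation outside the sense inventory")
    return s_a


def validate(word, annotations, adjudicated):
    """Record the validator's choice s, which must be in Senses(w)
    but is allowed to lie outside the set of chosen senses."""
    if adjudicated not in SENSES[word]:
        raise ValueError("adjudicated sense is not in the inventory")
    s_a = chosen_senses(word, annotations)
    return {
        "sense": adjudicated,
        "in_S_A": adjudicated in s_a,
        # Number of annotators who proposed the adjudicated sense.
        "votes": Counter(annotations.values())[adjudicated],
    }


# Three annotators (human or automatic) each provide one sense for "bank".
annotations = {"a1": "bank#1", "a2": "bank#2", "a3": "bank#1"}
result = validate("bank", annotations, "bank#3")
```

Here the validator adjudicates a sense that no annotator proposed, which the definition permits; the in_S_A flag simply records whether the adjudicated sense was among the annotators' choices.</Paragraph>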
  </Section>
</Paper>