<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-0806">
  <Title>Assessing System Agreement and Instance Difficulty in the Lexical Sample Tasks of SENSEVAL-2</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> This paper presents a post-mortem analysis of the English and Spanish lexical sample tasks of SENSEVAL-2. Two closely related questions are considered. First, to what extent did the competing systems agree? Did systems tend to be redundant and have success with many of the same test instances, or were they complementary and able to disambiguate different portions of the instance space? Second, how much did the difficulty of the test instances vary? Are there test instances that proved unusually difficult to disambiguate relative to other instances? We address the first question via a series of pairwise comparisons among the participating systems that measure their agreement via the kappa statistic. We also introduce a simple measure of the degree to which systems are complementary, called optimal combination. We analyze the second question by rating the difficulty of test instances relative to the number of systems that were able to disambiguate them correctly.</Paragraph>
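The measures above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: it assumes each system is represented as a list of booleans, one per test instance, where True means the system disambiguated that instance correctly. `kappa` is standard two-rater Cohen's kappa over these binary vectors; `optimal_combination` and `difficulty` follow the informal definitions in the paragraph.

```python
def kappa(a, b):
    """Cohen's kappa between two binary correctness vectors."""
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n   # observed agreement
    pa, pb = sum(a) / n, sum(b) / n              # marginal rates
    pe = pa * pb + (1 - pa) * (1 - pb)           # chance agreement
    return (po - pe) / (1 - pe) if pe < 1 else 1.0

def optimal_combination(systems):
    """Fraction of instances that at least one system got right."""
    return sum(any(col) for col in zip(*systems)) / len(systems[0])

def difficulty(systems):
    """Per-instance count of systems answering correctly (lower = harder)."""
    return [sum(col) for col in zip(*systems)]
```

For example, two systems that each answer half the instances correctly but on different instances score kappa near zero while their optimal combination exceeds either individual score, which is exactly the redundant-versus-complementary distinction the paragraph draws.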
    <Paragraph position="1"> Nearly all systems that received official scores in the Spanish and English lexical sample tasks of SENSEVAL-2 are included in this study. There are 23 systems included from the English lexical sample task and eight from the Spanish. Table 1 lists the systems and shows the number of test instances that each disambiguated correctly, both by part of speech and in total.</Paragraph>
  </Section>
</Paper>