<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-1421">
  <Title>Shared-task Evaluations in HLT: Lessons for NLG</Title>
  <Section position="2" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> While natural language generation (NLG) has a strong evaluation tradition, in particular in user-based and task-oriented evaluation, it has never evaluated different approaches and techniques by comparing their performance on the same tasks (shared-task evaluation, STE). NLG is characterised by a lack of consolidation of results, and by isolation from the rest of NLP where STE is now standard. It is, moreover, a shrinking field (state-of-the-art MT and summarisation no longer perform generation as a subtask) which lacks the kind of funding and participation that natural language understanding (NLU) has attracted.</Paragraph>
    <Paragraph position="1"> Evidence from other NLP fields shows that STE campaigns (STECs) can lead to rapid technological progress and substantially increased participation. The past year has seen a groundswell of interest in comparative evaluation among NLG researchers, the first comparative results are being reported (Belz and Reiter, 2006), and the move towards some form of comparative evaluation seems inevitable. In this paper we look at how two decades of NLP STECs might help us decide how best to make this move.</Paragraph>
  </Section>
</Paper>