<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-2304">
  <Title>The NICE Fairy-tale Game System</Title>
  <Section position="8" start_page="1" end_page="1" type="concl">
    <SectionTitle>
6 Evaluation issues
</SectionTitle>
    <Paragraph position="0"> Task-oriented spoken dialogue systems are usually evaluated in terms of objective and subjective features. Objective criteria include the technical robustness and core functionality of the system components as well as system performance measures such as task completion rate. Subjective usability evaluations estimate features like naturalness and quality of the interactions, as well as user satisfaction reported in post-experimental interviews. However, many of these measures are simply not relevant for entertainment-type applications, where user satisfaction increases rather than decreases with task completion time. It can even be difficult to define what the completion of the task would be. In practice, computer games are usually evaluated by professional game reviewers and by the users in terms of number of copies sold.</Paragraph>
    <Paragraph position="1"> In the evaluation of the NICE fairy-tale game, sales figures cannot be used, and several of the traditional objective measures are less relevant because of the domain. Instead, subjective measures involving features like &quot;narrative progression&quot;, &quot;character believability&quot;, and &quot;entertainment value&quot; will be used. They will be obtained off-line, by interviewing the users after their interactions and asking them to fill out questionnaires. Users will be asked how they perceived the quality of the actual interaction, as well as the personality of the fairy-tale characters. Expert evaluators, who will be able to replay the user interactions and inspect the system logs, will also be employed. Examples of evaluation questions to the experts include: &quot;Do the characters display meaningful roles and believable personalities that contribute to the story?&quot;, &quot;Do they succeed in signaling their level of understanding?&quot;, and &quot;To what extent is the user able to affect the plot?&quot; In order to be able to replay the user interactions with the fairy-tale system, all communication between the system modules is logged with time stamps. This will be a valuable tool both in the iterative system development and for system evaluations. At present, we are in the process of collecting data with the introductory game scenario. The data collected will be used to develop the subsequent scenarios in the fairy-tale game.</Paragraph>
  </Section>
</Paper>