<?xml version="1.0" standalone="yes"?>
<Paper uid="H91-1062">
  <Title>A Proposal for Incremental Dialogue</Title>
  <Section position="2" start_page="0" end_page="319" type="intro">
    <SectionTitle>
INTRODUCTION
</SectionTitle>
    <Paragraph position="0"> There is no single dialogue problem. By its nature, dialogue processing is composed of many different capabilities matched to many different aspects of the problem. It is reasonable to expect that dialogue evaluation methodologies should be multifaceted to reflect this richness of structure.</Paragraph>
    <Paragraph position="1"> Ideally, each new addition to the set of evaluation methodologies should test a different aspect of dialogue processing, and should be harder than the methodologies that came before it. We present two suggestions: one which extends the common evaluation procedure in order to test one new aspect of dialogues, and one which modifies the scoring metric.</Paragraph>
    <Paragraph position="2"> Differences:  1. Conversation is cooperative, but a game is competitive. 2. In chess, the goal is clear (checkmate), but in a conversational dialogue, the goal is less clear.</Paragraph>
    <Paragraph position="3"> 3. In a chess game, any state can be completely and concisely represented by a single board position; in a dialogue it is not known what comprises a state, nor how to represent it.</Paragraph>
    <Paragraph position="4"> Like the game tree for chess, the human/computer dialogue tree is enormous, as indicated in figure 1. There are usually hundreds or thousands of alternatives the human may produce. The number of responses the system can make is much smaller; some responses may be clearly wrong, but seldom is there a single &quot;right&quot; or &quot;best&quot; response (just as there is seldom a single such move in chess). Even when striving for the same goal, two different people are very likely to choose very different paths.</Paragraph>
    <Section position="1" start_page="0" end_page="319" type="sub_section">
      <SectionTitle>
An Analogy with Chess
</SectionTitle>
      <Paragraph position="0"> We as a community have been thinking about dialogue evaluation in terms of whether the systems we are building give the &quot;right&quot; answer (the one the wizard gave, or the one agreed upon by the Principles of Interpretation) at every step. We have been trying to come up with a methodology to measure whether our systems can reproduce the wizard's answers at each step of a lengthy dialogue. But is this a reasonable approach? Participating in a dialogue, whether between two humans or between a human and a machine, bears a striking resemblance to playing a complex game such as chess.</Paragraph>
      <Paragraph position="1"> Similarities:  1. Each involves precise turn-taking.</Paragraph>
      <Paragraph position="2"> 2. There is an extremely large tree of possible &quot;next moves&quot; (the tree for human dialogue, even in a limited domain, is much larger than that for chess).</Paragraph>
      <Paragraph position="3"> 3. Multiple paths through the tree can lead to the same results.  As a community, we have been thinking about dialogue evaluation in terms of whether the system gives the &quot;right&quot; answer at every step (the one the wizard gave at the same point in the same dialogue). The major problem with this type of thinking is that it encourages us to characterize a move that does not mimic the expert's (or an answer that does not exactly match the wizard's) as wrong, when it may not be wrong at all, but just different.</Paragraph>
    </Section>
  </Section>
</Paper>