<?xml version="1.0" standalone="yes"?>
<Paper uid="E06-2009">
  <Title>An ISU Dialogue System Exhibiting Reinforcement Learning of Dialogue Policies: Generic Slot-filling in the TALK In-car System</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> The in-car system described below has been constructed primarily to collect data for Reinforcement Learning (RL) approaches to multimodal dialogue management, and also to test and further develop learnt dialogue strategies in a realistic application scenario. For these reasons we have built a system which: contains an interface to a dialogue strategy learner module; covers a realistic domain of useful &quot;in-car&quot; conversation and a wide range of dialogue phenomena (e.g. confirmation, initiative, clarification, information presentation); and can be used to complete measurable tasks (i.e. there is a measure of successful and unsuccessful dialogues usable as a reward signal for Reinforcement Learning).</Paragraph>
    <Paragraph position="1"> In this demonstration we will exhibit the software system that we have developed to meet these requirements. First we describe the domain in which the dialogue system operates (an &quot;in-car&quot; information system). Then we describe the major components of the system and give examples of their use. We then discuss the important features of the system with respect to the dialogue phenomena that they support.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
1.1 A System Exhibiting Reinforcement Learning
</SectionTitle>
      <Paragraph position="0"> The central motivation for building this dialogue system is as a platform for Reinforcement Learning (RL) experiments. The system exhibits RL in two ways. First, it can be run in online learning mode with real users. Here the RL agent is able to learn from successful and unsuccessful dialogues with real users. Learning will be much slower than with simulated users, but can start from an already learnt policy and slowly improve upon it.</Paragraph>
      <Paragraph position="1"> Second, it can be run using an already learnt policy (e.g. the one reported in (Henderson et al., 2005; Lemon et al., 2005), learnt from COMMUNICATOR data (Georgila et al., 2005)).</Paragraph>
      <Paragraph position="2"> This mode can be used to test the learnt policies in interactions with real users.</Paragraph>
      <Paragraph position="3"> Please see (Henderson et al., 2005) for an explanation of the techniques developed for Reinforcement Learning with ISU dialogue systems.</Paragraph>
    </Section>
  </Section>
</Paper>