<?xml version="1.0" standalone="yes"?>
<Paper uid="N01-1028">
<Title>Learning optimal dialogue management rules by using reinforcement learning and inductive logic programming</Title>
<Section position="3" start_page="0" end_page="0" type="intro">
<SectionTitle> 1 Introduction </SectionTitle>
<Paragraph position="0"> As dialogue systems become ubiquitous, dialogue management strategies are receiving more and more attention. They define the system's behavior and largely determine how well or poorly it is perceived by users. Generic methodologies exist for developing and testing management strategies. Many of these take a user-centric approach based on Wizard of Oz studies and iterative design (Bernsen et al., 1998).</Paragraph>
<Paragraph position="1"> However, there are still no precise guidelines about when to use specific techniques such as mixed initiative. Reinforcement learning has been used in several recent approaches to search for the optimal dialogue management strategy in specific dialogue situations (Levin and Pieraccini, 1997; Litman et al., 2000; Singh et al., 2000; Walker, 2000). In these approaches, a dialogue is seen as a walk through a series of states, from an initial state when the dialogue begins to a terminal state when it ends.</Paragraph>
<Paragraph position="2"> The actions of the dialogue manager, as well as those of the user, influence the transitions between states. Each transition is associated with a reward, which expresses how good or bad it was to make that transition. A dialogue strategy can then be seen as a Markov Decision Process (Levin et al., 1998). Reinforcement learning can be used in this framework to search for an optimal strategy, i.e., a strategy that maximizes the expected sum of rewards over the training dialogues. The main idea behind reinforcement learning is to explore the space of possible dialogues and select the strategy which optimizes the expected rewards (Mitchell, 1997, ch. 13).
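This state/action/reward framing can be made concrete with a minimal Q-learning sketch over a hypothetical two-slot form-filling dialogue. The state names, actions, reward values, and simulated user below are invented for illustration and are not taken from the paper; they merely instantiate the MDP view of dialogue described above.

```python
import random

random.seed(0)

# Toy slot-filling dialogue MDP (illustrative assumption, not the paper's system).
# States record which slots are filled; "done" is the terminal state.
STATES = ["none", "slot1", "slot2", "both", "done"]
ACTIONS = ["ask_slot1", "ask_slot2", "confirm"]

def step(state, action):
    """Simulated environment: returns (next_state, reward) for a transition."""
    if action == "ask_slot1":
        if state in ("none", "slot2"):
            return ("slot1" if state == "none" else "both", -1)
        return (state, -2)  # redundant question: extra penalty
    if action == "ask_slot2":
        if state in ("none", "slot1"):
            return ("slot2" if state == "none" else "both", -1)
        return (state, -2)
    # "confirm" ends the dialogue; it only succeeds once both slots are filled.
    return ("done", 10 if state == "both" else -5)

def q_learn(episodes=5000, alpha=0.1, gamma=0.95, eps=0.2):
    """Tabular Q-learning: explore dialogues, then return the greedy strategy."""
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        s = "none"
        while s != "done":
            # epsilon-greedy exploration of the space of possible dialogues
            a = (random.choice(ACTIONS) if random.random() < eps
                 else max(ACTIONS, key=lambda a: q[(s, a)]))
            s2, r = step(s, a)
            best_next = 0.0 if s2 == "done" else max(q[(s2, a2)] for a2 in ACTIONS)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = s2
    # The learned strategy is a state -> action decision table.
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES if s != "done"}

policy = q_learn()
print(policy)  # state -> best action; e.g. the "both" state maps to "confirm"
```

Note that the output of learning is exactly the kind of flat state-action table the paper discusses: correct, but opaque, which is what motivates using ILP to compress it into readable rules.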
Once the optimal strategy has been found, it can be implemented in the final system. Reinforcement learning is state-based: it finds out which action to take next given the current state. This makes the strategy relatively hard to explain and limits its potential re-use in other dialogue situations. It is quite difficult to find out whether generic lessons can be learned from the optimal strategy. In this paper we use inductive logic programming (ILP) to learn sets of rules that generalize the optimal strategy. We show that these rules can be simpler to interpret than the decision tables produced by reinforcement learning, and that they can help in modifying and re-using the strategies. This is important because human dialogue designers are usually ultimately in charge of writing and changing the strategies.</Paragraph>
<Paragraph position="3"> The paper is organized as follows. We first describe in section 2 a simple dialogue system which we use as a running example throughout the paper. In section 3, we present our method and results on using ILP to generalize the optimal dialogue management strategy found by reinforcement learning. We also investigate the use of the rules learned during the search for the optimal strategy. We show that, in some cases, the number of dialogues needed to obtain the optimal strategy can be dramatically reduced.</Paragraph>
<Paragraph position="4"> Section 4 presents our current results on this aspect. We compare our approach to other current work in section 5 and conclude the paper in section 6.</Paragraph>
</Section>
</Paper>