<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-2219">
<Title>Learning Optimal Dialogue Strategies: A Case Study of a Spoken Dialogue Agent for Email</Title>
<Section position="2" start_page="0" end_page="1345" type="intro">
<SectionTitle> 1 Introduction </SectionTitle>
<Paragraph position="0"> This paper describes a novel method by which a dialogue agent can learn to choose an optimal dialogue strategy. The main problem for dialogue agents is deciding what information to communicate to a hearer and how and when to communicate it. For example, consider one of the strategy choices faced by a spoken dialogue agent that accesses email by phone. When multiple messages match the user's query, e.g. Read my messages from Kim, an email agent must choose among multiple response strategies. The agent might choose the Read-First strategy in D1:
(D1) A: In the messages from Kim, there's 1 message about &quot;Interviewing Antonio&quot; and 1 message about &quot;Meeting Today.&quot; The first message is titled, &quot;Interviewing Antonio.&quot; It says, &quot;I'd like to interview him. I could also go along to lunch. Kim.&quot;</Paragraph>
<Paragraph position="1"> D1 involves summarizing all the messages from Kim, and then taking the initiative to read the first one. Alternate strategies are the Read-Summary-Only strategy in D2, where the agent provides information that allows users to refine their selection criteria, and the Read-Choice-Prompt strategy in D3, where the agent explicitly tells the user what to say in order to refine the selection:
(D2) A: In the messages from Kim, there's 1 message about &quot;Interviewing Antonio&quot; and 1 message about &quot;Meeting Today.&quot;
(D3) A: In the messages from Kim, there's 1 message about &quot;Interviewing Antonio&quot; and 1 message about &quot;Meeting Today.&quot; To hear the messages, say, &quot;Interviewing Antonio&quot; or &quot;Meeting.&quot;
Decision-theoretic planning can be applied to the problem of choosing among strategies, by associating a utility U with each strategy (action) choice and by positing that agents should adhere to the principle of Maximum Expected Utility: an optimal action is one that maximizes the expected utility of outcome states.</Paragraph>
<Paragraph position="2"> An agent acts optimally by choosing a strategy a in state Si that maximizes U(Si). But how are the utility values U(Si) for each dialogue state Si derived? Several reinforcement learning algorithms based on dynamic programming specify a way to calculate U(Si) in terms of the utility of a successor state Sj (Bellman, 1957; Watkins, 1989; Sutton, 1991; Barto et al., 1995). Thus if we know the utility for the final state of the dialogue, we can calculate the utilities for all the earlier states. However, until recently there has been no way of determining a performance function for assigning a utility to the final state of a dialogue.</Paragraph>
<Paragraph position="3"> This paper presents a method based on dynamic programming by which dialogue agents can learn to optimize their choice of dialogue strategies. We draw on the recently proposed PARADISE evaluation framework (Walker et al., 1997) to identify the important performance factors and to provide a performance function for calculating the utility of the final state of a dialogue. We illustrate our method with a dialogue agent named ELVIS (EmaiL Voice Interactive System), which supports access to email over the phone. We test alternate strategies for agent initiative, for reading messages, and for summarizing email folders.
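To make the utility computation described above concrete, the following Python sketch illustrates one way the two steps could be realized: a PARADISE-style performance function assigns a utility to the final state of each dialogue, and the expected utility of each (state, strategy) choice is then estimated from a corpus of dialogues so that the utility-maximizing strategy can be selected. This is a minimal sketch under stated assumptions, not the paper's implementation; the function names, state labels, cost measures, weights, and toy corpus values are all illustrative.

from collections import defaultdict
from statistics import mean

# PARADISE-style performance: weighted task success minus weighted dialogue
# costs. alpha and the cost weights are illustrative assumptions.
def final_state_utility(task_success, costs, alpha=1.0, weights=None):
    weights = weights or {"elapsed_time": 0.01, "system_turns": 0.1}
    return alpha * task_success - sum(weights[c] * v for c, v in costs.items())

# Estimate U(state, strategy) as the mean final-state utility of the dialogues
# in which that strategy was chosen in that state.
def estimate_utilities(corpus):
    outcomes = defaultdict(list)
    for dialogue in corpus:
        u_final = final_state_utility(dialogue["task_success"], dialogue["costs"])
        for state, strategy in dialogue["choices"]:
            outcomes[(state, strategy)].append(u_final)
    return {key: mean(values) for key, values in outcomes.items()}

# An optimal agent chooses the strategy with the highest estimated utility
# in the current state.
def optimal_strategy(utilities, state):
    candidates = {s: u for (st, s), u in utilities.items() if st == state}
    return max(candidates, key=candidates.get)

# Toy corpus of two dialogues, each recording the strategy chosen when
# multiple messages matched the user's query (all values are made up).
corpus = [
    {"choices": [("multiple-matches", "Read-First")],
     "task_success": 0.6, "costs": {"elapsed_time": 90.0, "system_turns": 8}},
    {"choices": [("multiple-matches", "Read-Choice-Prompt")],
     "task_success": 0.9, "costs": {"elapsed_time": 60.0, "system_turns": 5}},
]

print(optimal_strategy(estimate_utilities(corpus), "multiple-matches"))
# -> Read-Choice-Prompt

On this toy data, Read-Choice-Prompt receives the higher estimated utility and would therefore be selected in the multiple-matches state.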
We report results from modeling a corpus of 232 spoken dialogues in which ELVIS conversed with human users to carry out a set of email tasks.</Paragraph>
</Section>
</Paper>