<?xml version="1.0" standalone="yes"?>
<Paper uid="J97-1006">
<Title>Smith and Gordon, Human-Computer Dialogue [Figure: Computer Response Selection Algorithm. Inputs: Current Computer Goal, Current User Focus, Dialog Mode. Output: Selected Task Goal]</Title>
<Section position="2" start_page="0" end_page="142" type="abstr">
<SectionTitle> 1. Modeling Human-Computer Dialogue </SectionTitle>
<Paragraph position="0"> It is generally acknowledged that developing a successful computational model of interactive natural language (NL) dialogue requires extensive analysis of sample dialogues. Previous work has included analyses of (1) human-human dialogues in relevant task domains; (2) Wizard-of-Oz dialogues, in which a human (the Wizard) simulates the role of the computer as a way of testing an initial model; and (3) human-computer dialogues based on initial implementations of computational models. Each of these dialogue types has advantages as a model for system building, in terms of the relevance of the data to the final model. However, each also has particular disadvantages when researchers attempt to generalize from the findings of previous work.</Paragraph>
<Paragraph position="1"> For example, much analysis of human-human interactions has been done, such as Walker and Whittaker's (1990) analysis of mixed initiative in dialogue, or Oviatt and Cohen's (1991) comparison of interactive and non-interactive spoken modalities.</Paragraph>
<Paragraph position="2"> Analyses of human-human dialogues are a good basis for an initial task model and a lexicon, but it is difficult to determine which aspects of these analyses will generalize to human-computer dialogues and which ones will not. Fraser and Gilbert (1991) note that &quot;although it is certainly better to rely on analyses of human-human interactions than to rely on intuitions alone, the fact remains that human-human interactions are not the same as human-computer interactions and it would be surprising if they followed precisely the same rules&quot; (p. 81). In addition, Oviatt and Cohen (1991) say that &quot;... to model discourse accurately for interactive systems further research clearly will be needed on the extent to which human-computer speech differs from that between humans. At present, there is no well developed model of human-machine communication ...&quot; (p. 323). The researchers' dilemma is nicely summarized by Fraser and Gilbert: &quot;The designer is caught in a vicious circle: it is necessary to know the characteristics of dialogues between people and automata in order to be able to build the system, but it is impossible to know what such dialogues would be like until such a system has been built&quot; (p. 81).</Paragraph>
<Paragraph position="3"> Wizard-of-Oz (WOZ) dialogues result from an experimental technique that is one way of addressing this dilemma. In this methodology, human subjects are told they are interacting with a computer when they are really interacting with another human (the Wizard), who simulates the performance of the computer system. In some simulations (e.g., Whittaker and Stenton [1989]), the Wizard simulates the entire system, while in other cases (e.g., Dahlbäck, Jönsson, and Ahrenberg [1993]), the Wizard makes use of partially implemented systems to assist in responding.
Consequently, many initial models can be prototyped and tested before implementation, and researchers need not have a fully developed natural language interface. As other researchers have noted (Whittaker and Stenton 1989; Dahlbäck, Jönsson, and Ahrenberg 1993; Fraser and Gilbert 1991), when the WOZ simulations are convincing, the data obtained are a more accurate predictor of actual human-computer interaction than human-human dialogues are, because speakers adapt to the perceived characteristics of their conversational partners. Consequently, WOZ studies can provide an indication of the types of adaptations that humans will make in human-computer interaction. WOZ studies such as the ones cited above have been particularly useful in obtaining data on discourse structure and contextual references. The WOZ study of Moody (1988) on the effects of restricted vocabulary on interactive spoken dialogue provided the data that influenced the development of our own system.</Paragraph>
<Paragraph position="4"> While much knowledge can be gained from WOZ studies, they are not an adequate means of studying all elements of human-computer natural language dialogue. A simulation is feasible as long as humans can use their own problem-solving skills in carrying it out, but when it requires mimicking a proposed algorithm, the WOZ technique becomes impractical. For example, it is difficult to simulate and test the computer's error recovery strategies for speech recognition or natural language understanding errors, because the natural language understanding of the computer is only a simulation. If we wish to test an actual computational model for natural language processing, its complexity demands the construction of a computer program to execute it. Furthermore, an important feature of dialogue that is difficult to simulate via the WOZ paradigm is initiative. Depending on the interaction environment, dialogue initiative may reside with the computer or with the user, or may change during the interaction. Lacking any formal model of initiative, a Wizard would find it very difficult to simulate accurately, and consistently from subject to subject, the response patterns a computerized conversational participant would produce in a mixed-initiative dialogue in a nontrivial domain.</Paragraph>
<Paragraph position="5"> Unfortunately, we can also have difficulties generalizing from analyses of human-computer dialogues, because parameters of the particular system with which the dialogues were collected may have significantly affected the resulting dialogues. For example, if a particular system is always run with a particular speech recognizer, it may be difficult to determine what the outcome would have been with a better speech recognizer. Similarly, most human-computer dialogues are collected from systems with a particular dialogue model. Since it is well known that users adapt to the system, it will be unclear how the results from a particular set of human-computer dialogues generalize to a model of interaction based on a different dialogue model.</Paragraph>
<Paragraph position="6"> This paper reports work that attempts to address both of these dilemmas through the analysis of human-computer dialogues collected in an environment in which aspects of the system are parameterizable. We have built an integrated dialogue-processing system, the Circuit Fix-It Shop, which is parameterized for a key system behavior: initiative.
We have tested the system in 141 dialogues totaling 2,840 user utterances while varying the level of system initiative. The paper discusses our model of initiative and presents quantitative results from an analysis of our corpus on the effect of the computer's level of initiative on aspects of human-computer dialogue structure such as (1) the classification of utterances into subdialogues, (2) the frequency of user-initiated subdialogue transitions, (3) the regularity of subdialogue transitions, (4) the frequency of linguistic control shifts, and (5) the frequency of user-initiated error corrections. The results indicate that there are differences in user behavior and dialogue structure as a function of the computer's level of initiative. Furthermore, they provide evidence that a spoken natural language dialogue system must be capable of varying its level of initiative in order to facilitate effective interaction with users of varying levels of expertise and experience.</Paragraph>
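[Editorial sketch] The figure referenced in the running head describes a response selection step that maps a current computer goal, the current user focus, and a dialog mode to a selected task goal. The following is a minimal, hypothetical sketch of what such an initiative-parameterized step might look like; the mode names, types, and precedence rules are assumptions for illustration only, not the paper's actual algorithm.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class DialogMode(Enum):
    """Hypothetical initiative levels; placeholder names, not the
    paper's actual mode inventory."""
    COMPUTER_INITIATIVE = auto()   # computer directs the task
    MIXED_INITIATIVE = auto()      # either party may take control
    USER_INITIATIVE = auto()       # user directs the task


@dataclass
class TaskGoal:
    """A goal in the task model (e.g., a repair step)."""
    description: str


def select_task_goal(current_computer_goal: TaskGoal,
                     current_user_focus: Optional[TaskGoal],
                     mode: DialogMode) -> TaskGoal:
    """Sketch of the response selection step from the figure:
    (computer goal, user focus, dialog mode) -> selected task goal.
    The precedence rules below are illustrative assumptions."""
    if current_user_focus is None:
        # No competing user focus: pursue the computer's own goal.
        return current_computer_goal
    if mode is DialogMode.COMPUTER_INITIATIVE:
        # Computer holds the initiative: its plan takes precedence.
        return current_computer_goal
    # User or mixed initiative: follow the user's signaled focus.
    return current_user_focus
```

Similarly, corpus measures like those enumerated above (subdialogue transitions and who initiated them) can be computed from utterances annotated with subdialogue labels. A toy sketch with made-up labels, under the simplifying assumption that a transition is "initiated" by the speaker of the first utterance in the new subdialogue:

```python
from collections import Counter
from typing import List, Tuple

# Each utterance is a (speaker, subdialogue_label) pair; the labels are
# invented for illustration, not the paper's annotation scheme.
Utterance = Tuple[str, str]


def count_subdialogue_transitions(dialogue: List[Utterance]) -> Counter:
    """Count subdialogue transitions, attributed to the speaker of the
    first utterance in the new subdialogue."""
    transitions: Counter = Counter()
    for (_, prev_label), (speaker, label) in zip(dialogue, dialogue[1:]):
        if label != prev_label:
            transitions[f"{speaker}-initiated"] += 1
    return transitions


# Toy dialogue with made-up subdialogue labels:
dialogue = [
    ("computer", "goal-statement"),
    ("user", "goal-statement"),
    ("user", "locate-switch"),
    ("computer", "locate-switch"),
    ("user", "error-correction"),
]
print(count_subdialogue_transitions(dialogue))
# Counter({'user-initiated': 2})
```

</Section>
</Paper>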