<?xml version="1.0" standalone="yes"?> <Paper uid="N04-3009"> <Title>Spoken Dialogue for Simulation Control and Conversational Tutoring</Title> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> 1 Technical Content </SectionTitle> <Paragraph position="0"> This demonstration shows a flexible tutoring system for studying the effects of different tutoring strategies enhanced by a spoken language interface.</Paragraph> <Paragraph position="1"> The hypothesis is that spoken language increases the effectiveness of automated tutoring. Our focus is on the SCoT-DC spoken language tutor for Navy damage control; however, because SCoT-DC performs reflective tutoring on DC-Train simulator sessions, we have also developed a speech interface for the existing DC-Train damage control simulator, to promote ease of use as well as consistency of interface.</Paragraph> <Paragraph position="2"> Our tutor is developed within the Architecture for Conversational Intelligence (Lemon et al. 2001). We use the Open Agent Architecture (Martin et al. 1999) for communication between agents based on the Nuance speech recognizer, the Gemini natural language system (Dowding et al. 1993), and Festival speech synthesis. Our tutor adds its own dialogue manager agent, for general principles of conversational intelligence, and a tutor agent, which uses tutoring strategies and tactics to plan out an appropriate review and react to the student's answers to questions and desired topics.</Paragraph> <Paragraph position="3"> The SCoT-DC tutor, in Socratic style, asks questions rather than giving explanations. The tutor has a repertoire of hinting tactics to deploy in response to student answers to questions, and identifies and discusses repeated mistakes. The student is able to ask &quot;why&quot; questions after certain tutor explanations, and to alter the tutorial plan by requesting that the tutor skip discussion of certain topics. 
In DC-Train, the system uses several windows to provide information graphically, in addition to the spoken messages. In SCoT-DC, the Ship Display from DC-Train is used for both multimodal input and output.</Paragraph> <Paragraph position="4"> Both DC-Train and SCoT-DC use the same overall Gemini grammar, with distinct top-level grammars producing appropriate subsets for each application. Our Gemini grammar currently has 166 grammar rules and 811 distinct words. In a Nuance language model compiled from the Gemini grammar (Moore 1998), different top-level grammars are used in SCoT-DC to enhance speech recognition based on expected answers.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Performance Assessment </SectionTitle> <Paragraph position="0"> Experiments to assess the effectiveness of SCoT-DC tutoring are underway in March 2004, with 15 subjects currently scheduled. In July 2003, students in the Repair Locker Head class at the Navy Fleet Training Center in San Diego ran 12 sessions with DC-Train. Sessions ranged from 1 to 65 user utterances, with an average of 21. The average utterance length was 7 words. In speech recognition, about 22% of utterances were rejected, and the sentences with a recognition hypothesis had a word error rate of 27%. The transcribed data, combined with developer test run data, gave us 327 unique out-of-grammar sentences. Of these, we found 79 examples where the automatic Nuance endpointing cut off an utterance too early, and 20 examples of disfluent speech. 118 sentences were determined to be potentially useful phrasings to add to the grammar, while 73 sentences were found to lie outside the scope of the application.</Paragraph> <Paragraph position="1"> To address these issues, we have added new phrasings to the grammar. 
We also intend to use Nuance's Listen &amp; Learn offline grammar adaptation tool, to give higher probabilities to likely sentences while retaining broad grammar-based coverage. We may also adjust endpointing time, based on partial speech recognition hypotheses, to give extra time to the kinds of sentences that typically occur with more internal pauses. Disfluencies may decrease as users become more familiar with DC-Train and SCoT-DC during the comparatively longer use expected from each user in a typical tutoring session. The graphical interface for the DC-Train simulator is shown in Figure 1.</Paragraph> <Paragraph position="2"> Each window on the screen is modeled on a source of information available to a real-life DCA on a ship, including a detailed drawing of the several hundred compartments on the ship, a record of all communications to and from the DCA, a hazard detection panel showing the locations of alarms which have occurred, and a panel showing the firemain, i.e. the pipes carrying water throughout the ship, and the valves and pumps controlling the flow of the water. The window depicting heads represents the other personnel in the same room as the DCA, who are available to receive and transmit messages.</Paragraph> <Paragraph position="3"> While in the original version of DC-Train the DCA's orders and communications to other personnel on the ship took place through a menu system, this demo presents the newer spoken dialogue interface. Spoken commands take the form of actual Navy commands, thus enabling the Navy student to train in the same manner as they would perform these duties through radio communications on a ship.</Paragraph> <Paragraph position="4"> The user clicks a button to begin speaking, and the speech is recognized by Nuance, using a grammar-based language model automatically derived from the Gemini grammar used for parsing and interpretation of the commands. 
A dialogue manager then maps the Gemini logical forms into DC-Train commands. To allow the student to monitor the success of the speech recognizer, the text of the utterance is displayed. Responses from the simulated personnel are spoken by Festival speech synthesis, and also displayed as text on the screen. Most spoken interactions with DC-Train involve the student DCA giving single commands without any use of dialogue structure; however, the system will query the student for missing required parameters of commands, such as the repair team that is to perform the action, or the number of the pump to start on the firemain. If the student does not respond to these queries, the system will provide the context of the command missing the parameter as part of a more informative request. The student retains the ability to issue other commands at this time, and need not respond to the system if there is a more pressing crisis elsewhere.</Paragraph> <Paragraph position="5"> At the end of a DC-Train session, the student can then receive customized feedback and tutoring from SCoT-DC, based on a record of the student's actions compared to what an expert DCA would have done at each point according to rules accounting for the state of the simulation. The goal of the tutorial interaction is to identify and remediate any gaps in the student's understanding of damage control doctrine, and to improve the student's performance in issuing the correct commands without hesitation.</Paragraph> <Paragraph position="6"> The graphical interface to the SCoT-DC tutor is shown in Figure 2.</Paragraph> <Paragraph position="7"> from DC-Train, seen in Figure 3, one to give an overall view of the ship and one to zoom in on affected compartments, with color indicating the type of crisis in a compartment and the state of damage control there. The student can click on a compartment in the Ship Display as a way of indicating that compartment to the system. 
The automated tutor and the student communicate through speech, while the lower window displays the text of both sides of the interaction, and permits the user to scroll back through the entire tutorial session. As in DC-Train, the student clicks to begin speaking, then Nuance speech recognition provides a string of words to be interpreted by a Gemini grammar. Also as in DC-Train, responses from the tutor are synthesized by Festival, although the tutor speaks with a more natural voice provided by FestVox limited domain synthesis, in which large units of the tutor's utterances may be taken from prompts recorded for this application.</Paragraph> <Paragraph position="8"> Interpretation of the Gemini interpreted forms is handled by a more complex dialogue manager in SCoT-DC than in DC-Train, with a structured representation of the dialogue, which is used to guide the system's use of discourse markers, among other things. The dialogue is mainly driven by the tutor agent's strategies, though the student can request to move on to future topics without completing the current discussion, and also ask a &quot;Why&quot; question after some explanations.</Paragraph> <Paragraph position="9"> Tutorial strategies generally guide the overall path of the conversation, such as choosing which crises to discuss based on the errors made by the student. Tutorial tactics apply at a lower level throughout the dialogue. For example, when a student gives an incorrect answer, the tutor will give a general hint and repose the question. If the student answers incorrectly a second time, the tutor will give a more specific hint and ask the question again. 
If the student fails a third time, the tutor will give the correct answer and proceed.</Paragraph> <Paragraph position="10"> Running a full DC-Train scenario takes 20-40 minutes, and has the flavor of the following excerpt: [buzzing alarm goes off, it is a fire alarm] boundaries on primary forward 78, primary aft 126, secondary forward 42, secondary aft 174, above 1, below 2.</Paragraph> <Paragraph position="11"> A reflective dialogue with the tutor will take around 10 minutes. The following gives a sample of the kind of tutorial interaction.</Paragraph> <Paragraph position="12"> Tutor: Hello, we are about to review your session from earlier today.</Paragraph> <Paragraph position="13"> and electrically isolate the compartment.</Paragraph> <Paragraph position="14"> A video clip of an older version of the SCoT-DC system is available at http://www-csli.stanford.edu/semlab/muri/November2002Demo.html</Paragraph> </Section> </Paper>