<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-2301">
  <Title>Usability and Acceptability Studies of Conversational Virtual Human Technology</Title>
  <Section position="8" start_page="0" end_page="0" type="concl">
    <SectionTitle>
6 Conclusions and Lessons Learned
</SectionTitle>
    <Paragraph position="0"> In this paper we describe usability and acceptance data derived from a number of studies using a number of different RVHT applications. No data suggest that our applications are completely accessible yet to these users, but the data in aggregate suggest we are moving in the right direction.</Paragraph>
    <Paragraph position="1"> The different studies involved various user groups, from experts (medical clinicians) to novices (field and telephone survey interviewers) to &amp;quot;common folk&amp;quot; (exhibit visitors) in greatly different domains. A common finding was for our participants to suggest additional potential audiences, also ranging from novice to expert.</Paragraph>
    <Paragraph position="2"> Further, the majority of participants said they enjoyed using the applications - and/or were observed to be engaged with the virtual characters - despite technical obstacles, prototype-stage content, and conspicuous presence of the investigators.</Paragraph>
    <Paragraph position="3"> Some specific lessons learned include: * It is critical in applications to be able to detect and respond appropriately to &amp;quot;bad&amp;quot; or inappropriate input. In all our applications, users (often but not always intentionally) spoke utterances that were outside the range of what was expected in the context of the dialog. This occurred most frequently in the tradeshow exhibit application where users would try to test the limits of the system. But we even found that in the training applications that users would often express frustration by cursing or otherwise verbally mistreating the virtual character.</Paragraph>
    <Paragraph position="4"> * Without explicit prompting by the virtual character, users often seemed lost as to what to say next. We found that explicit statements or questions by the virtual character helped to supply the user with the necessary context. This also helped to prune the language processing space. . In the ExhibitAR domain, a subset of possible relevant questions was always present on the screen.</Paragraph>
    <Paragraph position="5"> * Because of shortcomings in speech recognition technology, we found that typed input was often needed to overcome the limitations of large grammars. This was particularly true in the more open-ended pediatric trainer. We also found typed input to be invaluable in development stage even in applications that were ultimately going to be speech-driven. The typed inputs in development helped us to derive grammars that we could later use to improve the speech recognizer. null * Our greatest difficulties in understanding the system occurred when the user replied with very complex compound sentences, multiple sentence, and even paragraph long utterances. This phenomenon led us to set user expectations in the training environment prior to their using the system. null * Anecdotally we found that pre-recorded speech was much more acceptable than any currently available speech synthesizer. This effect seemed to be less noticeable the longer the user spoke with the system. We would like to conduct a study comparing the use of the two technologies.</Paragraph>
    <Paragraph position="6"> * Ultimately, because of the limitations in language understanding, the user would adapt to environment, adjusting the manner in which they spoke.</Paragraph>
    <Paragraph position="7"> We are encouraged by results so far, but feel it is important to continue to investigate more robust and effective RVHT models and more efficient means of creating the models, to better understand user preferences and acceptance of RVHT, and to determine how best to use RVHT in combination with other approaches to produce cost-effective training, assessment, and other applications. We propose several areas of active research: null * Usability and acceptability studies across different populations. Are there differences in acceptance of virtual characters across boundaries of age, gender, education level, and cultural divides? null * Usability and acceptability studies with varied input modes. What are the tradeoffs between using a typed natural language interface versus a spoken interface? We found that a typed interface improved the computer's ability to comprehend the user which leads to more cohesive dialog. On the other hand, a typed interface reduces the naturalness of the dialog, the believability of the character, and the usability of the system.</Paragraph>
    <Paragraph position="8"> * Usability and acceptability studies with varied degrees of visual realism. How realistic do virtual characters have to be in order to receive high ratings of acceptability by users? What is the contrast in user impressions between video of actual humans versus more cartoon-like animated characters? * Usability and acceptability studies with multi-modal input. Currently our systems make no attempt to use the user's vocal affect, facial expressions, eye movement, body gesture, or other physiological input (such as galvanic skin response) in interpreting the user's emotional state and intentions. We would like to introduce these elements into our systems to assess whether such input can create more realistic characters.</Paragraph>
  </Section>
class="xml-element"></Paper>