<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-3218">
  <Title>Mining Spoken Dialogue Corpora for System Evaluation and Modeling</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> We are interested in the problem of modeling and evaluating spoken language systems in the context of human-machine dialogs. Spoken dialog corpora allow for a multidimensional analysis of speech recognition and language understanding models of dialog systems. Therefore, language models can be directly trained based either on the dialog history or its equivalence class (or cluster). In this paper we propose an algorithm to mine dialog traces which exhibit similar patterns and are identified by the same class. For this purpose we apply data clustering methods to large human-machine spoken dialogue corpora. The resulting clusters can be used for system evaluation and language modeling. By clustering dialog traces we expect to learn about the behavior of the system with regard to not only the automation rate but also the nature of the interaction (e.g. easy vs. difficult dialogs). The equivalence classes can also be used to automatically adapt the language model, the understanding module and the dialogue strategy to better fit the kind of interaction detected. This paper investigates different ways of encoding dialogues into multi-dimensional structures and different clustering methods. Preliminary results are given for cluster interpretation and dynamic model adaptation using the clusters obtained.</Paragraph>
  </Section>
</Paper>