<?xml version="1.0" standalone="yes"?>
<Paper uid="A00-2028">
  <Title>Learning to Predict Problematic Situations in a Spoken Dialogue System: Experiments with How May I Help You?</Title>
  <Section position="9" start_page="215" end_page="216" type="concl">
    <SectionTitle>
5 Discussion and Future Work
</SectionTitle>
    <Paragraph position="0"> In summary, our results show that: (1) All feature sets significantly improve over the baseline; (2) Using automatic features from the whole dialogue, we can identify problematic dialogues 23% better than the baseline; (3) Just the first exchange provides sig- null nificantly better prediction (8%) than the baseline; (4) The second exchange provides an additional significant (7%) improvement, (5) A classifier based on task-independent automatic features performs with less than 1% degradation in error rate relative to the automatic features. Even with current accuracy rates, the improved ability to predict problematic dialogues means that it may be possible to field the system without human agent oversight, and we expect to be able to improve these results.</Paragraph>
    <Paragraph position="1"> The research reported here is the first that we know of to automatically analyze a corpus of logs from a spoken dialogue system for the purpose of learning to predict problematic situations. Our work builds on earlier research on learning to identify dialogues in which the user experienced poor speech recognizer performance (Litman et al., 1999). However, that work was based on a much smaller set of experimental dialogues where the notion of a good or bad dialogue was automatically approximated rather than being labelled by humans. In addition, because that work was based on features synthesized over the entire dialogues, the hypotheses that were learned could not be used for prediction during runtime.</Paragraph>
    <Paragraph position="2"> We are exploring several ways to improve the performance of and test the problematic dialogue predictor. First, we noted above the extent to which the hand-labelled feature rsuccess improves classifier performance. In other work we report results from training an rsuccess classifier on a per-utterance level (Walker et al., 2000), where we show that we can achieve 85% accuracy using only fully automatic features. In future work we intend to use the (noisy) output from this classifier as input to our problematic dialogue classifier with the hope of improving the performance of the fully automatic feature sets.</Paragraph>
    <Paragraph position="3"> In addition, since it is more important to minimize errors in predicting PROBLEMATIC dialogues than errors in predicting TASKSUCCESS dialogues, we intend to experiment with RIPPER'S loss ratio parameter, which instructs RIPPER to achieve high accuracy for the PROBLEMATIC class, while potentially reducing overall accuracy. Finally, we plan to integrate the learned rulesets into the HMIHY dialogue system to improve the system's overall performance.</Paragraph>
  </Section>
class="xml-element"></Paper>