<?xml version="1.0" standalone="yes"?>
<Paper uid="W00-0306">
  <Title>Stochastic Language Generation for Spoken Dialogue Systems</Title>
  <Section position="5" start_page="30" end_page="30" type="concl">
    <SectionTitle>
4 Conclusion
</SectionTitle>
    <Paragraph position="0"> We have presented a new approach to language generation for spoken dialogue systems. For content planning, we built a simple bigram model of attributes, and found that, in our first implementation, it performs as well as a heuristic of old vs. new information. For surface realization, we used an n-gram language model to stochastically generate each utterance and found that the stochastic system performs at least as well as the template-based system.</Paragraph>
    <Paragraph position="1"> Our stochastic generation system has several advantages. One of these, an important issue for spoken dialogue systems, is response time.</Paragraph>
    <Paragraph position="2"> With stochastic surface realization, the average generation time for the longest utterance class (10-20 words long) is about 200 milliseconds, which is much faster than any rule-based system. Another advantage is that by using a corpus-based approach, we are directly mimicking the language of a real domain expert, rather than attempting to model it by rule.</Paragraph>
    <Paragraph position="3"> Corpus collection is usually the first step in building a dialogue system, so we are leveraging that effort rather than creating more work. This also means that adapting this approach to new domains, and even new languages, will be relatively simple.</Paragraph>
    <Paragraph position="4"> The approach we present does require some amount of knowledge engineering, though this appears to overlap with work needed for other parts of the dialogue system. First, defining the class of utterance and the attribute-value pairs requires care. Second, tagging the human-human corpus with the right classes and attributes requires effort. However, we believe the tagging effort is much less difficult than knowledge acquisition for most rule-based systems or even template-based systems. Finally, what may sound right for a human speaker may sound awkward for a computer, but we believe that mimicking a human, especially a domain expert, is the best we can do, at least for now.</Paragraph>
  </Section>
</Paper>