<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-0702">
  <Title>Topic Detection Based on Dialogue History</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> In recent years, speech-to-speech translation systems have been developed that integrate three components: speech recognition, machine translation, and speech synthesis (Watanabe et al., 2000). However, these systems cannot guarantee accurate translation because the individual components do not always produce correct results. To overcome this limitation, we proposed a method that uses parallel text based translation to support free-style sentence translation. In addition, we built a prototype automatic interpretation system for Japanese overseas travelers (Ikeda et al., 2002). With this system, the user searches the registered parallel text for an appropriate sentence in the source language, using an utterance, a scene, and a situation as search criteria, and then uses the corresponding target language sentence as the translation.</Paragraph>
    <Paragraph position="1"> Although parallel text based translation provides guaranteed translation results, it poses two problems when the user searches for a sentence. One is the difficulty of finding an appropriate sentence from the user's short utterances, which are common in travel conversation.</Paragraph>
    <Paragraph position="2"> Short phrases provide only a few keywords, so the search results are too broad. Specifying the exact scene and action helps narrow the results, but having to select the right option from the vast range of scene and action categories may frustrate the user.</Paragraph>
    <Paragraph position="3"> The other problem is the existence of nonadaptive sentences that may be inappropriate in some scenes. Users usually select sentences according to the scene and can therefore exclude inapplicable sentences, but new users may accidentally select nonadaptive sentences by failing to specify a scene.</Paragraph>
    <Paragraph position="4"> Here, we propose a method to detect a topic for each utterance. We define a topic as corresponding to a scene, that is, a place or a situation in which the user converses. The proposed method is based on the k-nearest neighbor method, which we improve for dialogue utterances by clustering the training data and using the dialogue history. We use the detected topic to specify a scene condition in parallel text based translation, and thereby solve the two problems described above.</Paragraph>
    <Paragraph position="5"> Detecting topics also helps improve the accuracy of the automatic interpretation system by disambiguating polysemy. Some words should be translated into different words depending on the scene and context. Topic detection can enhance speech recognition accuracy by selecting the correct word dictionary and resources, which are organized according to the topic.</Paragraph>
    <Paragraph position="6"> Proceedings of the Workshop on Speech-to-Speech Translation: Algorithms and Systems, Philadelphia, July 2002, pp. 9-14. Association for Computational Linguistics.</Paragraph>
    <Paragraph position="7"> The remainder of this paper is organized as follows. Section 2 describes the constraints in detecting a topic from dialogue utterances.</Paragraph>
    <Paragraph position="8"> Section 3 describes our topic detection algorithm to overcome these constraints. Section 4 explains how we evaluated our method on a travel conversation corpus, and Section 5 presents the evaluation results. Section 6 discusses the effect of our method by comparing the results on typical dialogue data with those on real-situation dialogue data. We conclude in Section 7 with some final remarks and a mention of future work.</Paragraph>
  </Section>
</Paper>