<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-2051">
  <Title>Spontaneous Speech Understanding for Robust Multi-Modal Human-Robot Communication</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Over the past years interest in mobile robot applications has increased. One aim is to allow for intuitive interaction with a personal robot which is based on the idea that people want to communicate in a natural way (Breazeal et al., 2004)(Dautenhahn, 2004). Although often people use speech as the main modality, they tend to revert to additional modalities such as gestures and mimics in face-to-face situations. Also, they refer to objects 1This work has been supported by the European Union within the 'Cognitive Robot Companion' (COGNIRON) project (FP6-IST-002020) and by the German Research Foundation within the Graduate Program 'Task Oriented Communication'.</Paragraph>
    <Paragraph position="1"> in the physical environment. Furthermore, speech, gestures and information of the environment are used in combination in instructions for the robot.</Paragraph>
    <Paragraph position="2"> When participants perceive a shared environment and act in it we call this communication situated (Milde et al., 1997). In addition to these features that are characteristic for situated communication, situated dialog systems have to deal with several problems caused by spontaneous speech phenomena like ellipses, indirect speech acts or incomplete sentences. Large pauses or breaks occur inside an utterance and people tend to correct themselves. Utterances often do not follow a standard grammar as written text.</Paragraph>
    <Paragraph position="3"> Service robots have not only to be able to cope with this special kind of communication but they also have to cope with noise that is produced by their own actuators or the environment. Speech recognition in such scenarios is a complex and difcult task, leading to severe degradations of the recognition performance. The goal of this paper is to present a framework for human-robot interaction (HRI) that enables robust interpretation of utterances under the speci c conditions in HRI.</Paragraph>
  </Section>
class="xml-element"></Paper>