<?xml version="1.0" standalone="yes"?> <Paper uid="I05-4006"> <Title>Construction of Structurally Annotated Spoken Dialogue Corpus</Title> <Section position="2" start_page="0" end_page="40" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> As speech processing technologies improve, there is a demand for spoken dialogue systems that respond appropriately to a user's spontaneous utterances and carry out a dialogue cooperatively.</Paragraph> <Paragraph position="1"> For a cooperative spoken dialogue system, it is important to understand the intentions behind a user's utterances, the purpose of the dialogue, and its achievement state (Litman, 1990). Several approaches to this problem have been proposed so far. In one of them, the system represents its knowledge of the dialogue as a frame and conducts the dialogue according to that frame (Goddeau, 1996; Niimi, 2001; Oku, 2004). However, it is difficult to design a frame that fully defines the content of a dialogue. In addition, the dialogue style tends to be strongly affected by the frame.</Paragraph> <Paragraph position="2"> In this paper, we describe the construction of a structurally annotated spoken dialogue corpus.</Paragraph> <Paragraph position="3"> By processing the corpus statistically, we can acquire dialogue-structural rules automatically. We expect that a system can determine the state of the dialogue by building the dialogue structure incrementally.</Paragraph> <Paragraph position="4"> We use the CIAIR in-car spoken dialogue corpus (Kawaguchi, 2004; Kawaguchi, 2005) and describe the dialogue structure as a binary tree.</Paragraph> <Paragraph position="5"> The tree expresses the purpose of partial dialogues and the relations between utterances or partial dialogues.
The speaker's intention tags were already provided in the transcription of the corpus.</Paragraph> <Paragraph position="6"> We annotated 789 dialogues consisting of 8150 utterances. Because the dialogue-structural rules can be represented as a context-free grammar, we were able to use an existing natural language processing technique to reduce the annotation burden.</Paragraph> <Paragraph position="7"> Section 2 explains the CIAIR in-car spoken dialogue corpus and the speaker's intention tags. Sections 3 and 4 discuss the design policy of the structurally annotated spoken dialogue corpus and its construction. Section 5 evaluates the corpus.</Paragraph> </Section> </Paper>