<?xml version="1.0" standalone="yes"?> <Paper uid="W00-1005"> <Title>Identifying Prosodic Indicators of Dialogue Structure: Some Methodological and Theoretical Considerations</Title> <Section position="1" start_page="0" end_page="37" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> This paper presents an empirical analysis of prosodic phenomena (intonation and timing) in 'common ground units' (Nakatani & Traum 1999). The analysis is used to address questions of the role of prosody in dialogue while taking into account the complexities of multi-speaker discourse. We address some methodological concerns of how best to carry out a study of this kind as well as our theoretical questions about the formal identification of dialogue structures at levels higher than the micro-level of 'dialogue act', or 'move'.</Paragraph> <Paragraph position="1"> Introduction This paper reports some of the results from our research into the relationship between prosodic structure and discourse structure in dialogue. One of our particular interests is how to identify and analyse relevant prosodic parameters in multi-speaker discourse. We have been examining the kinds of dialogue structure frameworks that best account for patterns of prosodic phenomena; and conversely, the types of dialogue structure that exhibit prosodic regularities. 
Our research domain is human-human natural dialogue settings, but our questions are equally relevant to researchers working on systems for more naturalistic human-computer interfaces, as well as those developing better automated systems for annotating large speech corpora.</Paragraph> <Paragraph position="2"> The main methodological considerations associated with our current work are: a) how natural dialogues can be reliably annotated to allow independent comparisons and correlations of prosodic and structural features, b) the identification and classification of units of dialogue that reflect the 'joint action' feature of interactive discourse (ie. that both participants in a dialogue contribute to dialogue structure (Clark 1992, 1996), an aspect of dialogue that fundamentally differentiates it from monologic discourse). These issues will be addressed in this paper using data from a corpus of naturally produced spoken dialogue taken from the Australian Map Task corpus (Millar et al 1994). Here we have focussed on the process of grounding, the assignment of utterances to 'common ground units' (CGUs - Nakatani & Traum 1999) and the internal structure of these units, as a means of illustrating some of the problems that both methodological issues raise. We show how some of these problems might be overcome by focussing on sequences of initiating and responding (typically grounding) contributions within CGUs, as a site for prosodic analysis, rather than on the boundaries of the units as a whole (cf. Stirling et al 2000a). This approach thus preserves the notion that one can identify 'chunks' of dialogue in which particular types of information are acknowledged as being in the common ground of both participants, while remaining true to the dynamic nature of the grounding negotiation.</Paragraph> <Paragraph position="3"> 1. Background 1.1. Prosody and Discourse Structure Most empirical work examining prosody in discourse has focussed on its function in monologue (eg. 
Swerts 1997, Nakatani et al 1995, Hirschberg & Nakatani 1996). These studies have found that a range of acoustic parameters associated with prosody, such as final lengthening and type of boundary tone, are good indicators of the boundaries between different discourse units at micro and macro levels of discourse structure.</Paragraph> <Paragraph position="4"> More recently, there has been an interest in examining how prosody may be used in dialogue to signal discourse structure in that domain. Shriberg et al (1998) showed that various prosodic cues (duration, F0, pause length and speaking rate) were relevant for the automatic classification of dialogue 'acts'. Stirling et al (2000b) similarly showed strong correspondences between the boundaries of dialogue acts and prosodic phenomena such as pitch reset and intonational phrase boundaries (represented as ToBI 'break indices'). But dialogue acts are the 'parts' of dialogue most akin to structural elements of monologic discourse, since each 'act' can be analysed as an independent utterance by a single speaker.</Paragraph> <Paragraph position="5"> Higher levels of dialogue structure necessarily involve some interactive 'chunk' of the discourse to which both participants in the dialogue contribute some speech.</Paragraph> <Paragraph position="6"> So while it is clear that prosody serves to delimit dialogue acts, and to some extent distinguish between them (eg. Shriberg et al 1998, Koiso et al 1998, Stirling et al 2000b), the question remains whether prosody is also a reliable indicator of dialogue structure at higher levels (analogous with the higher levels of monologic discourse structure described in Swerts (1997), Nakatani et al. 
(1995), and others) and to what uses prosodic phenomena are put in the context of higher levels of dialogue structure.</Paragraph> <Section position="1" start_page="0" end_page="37" type="sub_section"> <SectionTitle> 1.2 Grounding and Common Ground Units </SectionTitle> <Paragraph position="0"> Grounding is the process by which information contributed by participants in interaction is taken to have entered the 'common ground', or mutual knowledge of the participants (Clark & Schaefer 1989, Clark 1996, Traum 1994). The process of grounding requires that one participant contributes something to the discourse (minimally, a dialogue act), and that the other participant makes some indication that the contribution has been heard and accepted as a contribution (though not necessarily understood). This 'indication' may be a verbal acknowledgment (or some other kind of verbal response) or it may be some kind of non-verbal communicative act (like head nods, facial expression and other gestures).</Paragraph> <Paragraph position="1"> Traum (1998) and Nakatani & Traum (1999) have recently proposed taking grounding as the basic principle behind the structuring of dialogue at levels higher than the dialogue act. Minimal units of acknowledged common ground have been considered as the building blocks of higher level dialogue structures based on intentional or informational content (eg. 
'Common Ground Units', or 'CGUs' Nakatani & Traum (1999)).</Paragraph> <Paragraph position="2"> CGUs, which represent grounding at the 'illocutionary level' (Clark 1996), have been proposed as a meso-level dialogue structure, at roughly the same level that dialogue games (Carletta et al, 1997) or adjacency pairs (eg.</Paragraph> <Paragraph position="3"> Sinclair & Coulthard 1975) occupy in their dialogue structure frameworks.</Paragraph> <Paragraph position="4"> The appeal of taking units based on grounding as the level of dialogue structure above the microlevel of 'act' (as argued in Nakatani & Traum 1999) lies in its prioritization of mutual understanding as a central component of dialogue, regardless of the type of initiation and response. In the 'CGU' framework, some responses themselves get grounded so that the result is a complex configuration of overlapping and embedded units of information entered into the common ground of the participants. This approach thus acknowledges the importance of the contributions by both participants in the grounding process. It highlights the 'joint action' aspect of dialogic communication.</Paragraph> <Paragraph position="5"> Evaluation of the coding of CGUs in dialogue by the Discourse Resource Initiative (Core et al 1999) showed a low degree of intercoder reliability, especially for those coding the HCRC Map Task corpus (Anderson et al 1991). Some of the inconsistencies across coders were attributed to &quot;intonation and timing&quot; (p. 61), as well as to difficulties in coding different types of acknowledgments. 
Some proposals for the classification of acknowledgments were made (Core et al 1999) and Stirling et al (2000a) have noted some parameters along which CGUs might be further classified.</Paragraph> <Paragraph position="6"> The ways that this classification has been refined and utilised for the current paper are described in section 2 (methods), where we also describe our system of annotation for both prosodic and CGU properties of the dialogues, and explain our method for utilising this annotation for the current analysis. In section 3 we present some results of our investigation of prosody and grounding in the light of the methodological issues listed above. These results will be used in section 4 to address the following theoretical questions, as well as to set the agenda for future research into prosodic correlates of dialogue structure: a) can prosody be used as a heuristic for identifying dialogue structure above the level of the 'dialogue act'? b) does the process of 'grounding' have any formal basis, prosodic or otherwise, independent of the identification of dialogue acts?</Paragraph> </Section> </Section> </Paper>