<?xml version="1.0" standalone="yes"?>
<Paper uid="W93-0216">
  <Title>Empirical Evidence for Intention-based Discourse Segmentation</Title>
  <Section position="2" start_page="0" end_page="60" type="metho">
    <SectionTitle>
2 The Study
</SectionTitle>
    <Paragraph position="0"> Our corpus consists of 20 narrative monologues about the same movie (about 14,000 words total), taken from Chafe's &quot;Pear Stories&quot; [Chafe, 1980]. Seven subjects per narrative were presented with a verbatim transcript (footnote 1), such that each line of the transcript corresponded to one prosodic phrase (sentence-final or phrase-final contour; see Chafe [1980] for details). Subjects were instructed to identify sequential chunks (footnote 2), each representing a single intention. Subjects were also instructed to describe the speaker intention for each discourse segment. Intention was explained in common sense terms and by example. Subjects were restricted to placing boundaries between prosodic phrases. An excerpt from the instructions is shown in Figure 1.</Paragraph>
    <Paragraph position="1"> Figure 2 illustrates a portion of an intention-based segmentation produced by 7 subjects. Distinct subjects are indicated by letters of the alphabet. Prosodic phrases are numbered sequentially; the (Footnote 1: We eliminated visually distracting material such as pause locations and durations.)</Paragraph>
    <Paragraph position="2"> Footnote 2: Grosz and Hirschberg [1992] previously conducted an empirical study of hierarchical, intention-based segmentation. We have looked at a simpler linear intention-based segmentation task. Our pilot study as well as the work of Rotondo [1984] indicated that more complex segmentation tasks were too cumbersome given our average narrative length. Hearst [1993] also examines linear segmentation, based on a notion of topic change.</Paragraph>
    <Paragraph position="3">  You should think of each movie narration as resulting from many decisions made by the speaker about what to do next. You will be asked to evaluate what the speaker was doing at each point ... Read through the transcript and draw a horizontal line across the page between complete text lines (utterances) where you think the speaker started doing something new.</Paragraph>
    <Paragraph position="4"> In the wide left hand margin, say in abbreviated form what the speaker is doing ... [Here] is an example of how to proceed. You are free to use any criteria in deciding what the narrator of your transcript is doing.</Paragraph>
    <Paragraph position="5"> speaker recommends movie: Well it's really a great movie, really beautiful scenery.</Paragraph>
    <Paragraph position="6"> You should see it, I recommend it, I really do.</Paragraph>
    <Paragraph position="7"> The first part of the movie just sets up  first field of the phrase number indicates sentence-final contour, and the second indicates phrase-final contour. At each potential boundary site, i.e., between each pair of prosodic phrases, the number and identity of subjects who classified the site as a boundary are indicated. The segmentation shown in Figure 2 contains 1 boundary proposed by all 7 subjects, 1 boundary proposed by 5 subjects, and  speaker describes lack thereof]; Describes that it is a silent movie with only nature sounds; Speaker describes sound techniques used in movie; Explain that there is no speaking in movie</Paragraph>
  </Section>
  <Section position="3" start_page="60" end_page="61" type="metho">
    <SectionTitle>
3 Discourse Segment Boundaries
</SectionTitle>
    <Paragraph position="0"> In [Passonneau and Litman, 1993], we show that our subjects agree with one another at levels that are statistically significant, thus demonstrating the reliability of intention as a segmentation criterion. Percent agreement is defined in [Gale et al., 1992] as the ratio of observed agreements with the majority opinion to possible agreements with the majority opinion. We use percent agreement to measure the ability of subjects to agree with one another on whether there is a segment boundary between two adjacent prosodic phrases. We find that the average agreement across the 20 narratives on the status of all potential boundary locations is 89% (with a range from 82% to 92%). We then use Cochran's test [Cochran, 1950] to determine if these levels of agreement are statistically significant.</Paragraph>
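The percent agreement measure defined above can be illustrated with a minimal sketch. It assumes one reading of the definition: at each potential boundary site, the observed agreements are the number of subjects who side with the majority at that site, and the possible agreements are the number of subjects. The function name, the data layout, and the toy judgments are hypothetical and are not taken from the study.

```python
def percent_agreement(judgments):
    """judgments[subject][site] is 1 if that subject placed a boundary
    at that site and 0 otherwise (hypothetical layout)."""
    observed = 0
    possible = 0
    for site_votes in zip(*judgments):      # one tuple of votes per site
        yes = sum(site_votes)               # subjects marking a boundary
        no = len(site_votes) - yes          # subjects marking no boundary
        observed += max(yes, no)            # votes that match the majority
        possible += len(site_votes)         # every subject could match it
    return observed / possible


# Made-up judgments: 7 subjects (rows) by 4 potential boundary sites.
toy = [
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
]
print(round(percent_agreement(toy), 3))  # 25 of 28 votes match the majority
```

Under this reading, sites where nobody marks a boundary count as full agreement, which is one reason a chance-corrected test is applied afterwards.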
    <Paragraph position="1"> Cochran's test compares the observed number of subjects placing a boundary at every potential site with the number predicted by a random distribution; it is assumed that the total number of boundaries assigned by any one subject is given by that subject's actual performance. The greater the difference from randomness, the more unlikely is the observed distribution. For the 20 narratives, the probabilities of the observed boundary distributions ranged from p = .1 x 10^-6 to p &lt; .6 x 10^-9, all very highly significant.</Paragraph>
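As a rough illustration of the statistic behind this test, the sketch below computes Cochran's Q in its standard textbook form. It assumes that potential boundary sites play the role of treatments and subjects the role of blocks, which matches the statement that each subject's total number of boundaries is held fixed. The function name and toy matrix are hypothetical. Under the null hypothesis Q is approximately chi-square distributed with (number of sites - 1) degrees of freedom; the p-value lookup is left out so that the example needs only the standard library.

```python
def cochran_q(judgments):
    """Return (Q, degrees_of_freedom) for a 0/1 matrix in which
    judgments[subject][site] is 1 when the subject marked a boundary."""
    n_sites = len(judgments[0])
    subject_totals = [sum(row) for row in judgments]        # block totals
    site_totals = [sum(col) for col in zip(*judgments)]     # treatment totals
    grand_total = sum(subject_totals)

    numerator = (n_sites - 1) * (
        n_sites * sum(c * c for c in site_totals) - grand_total ** 2
    )
    denominator = n_sites * grand_total - sum(r * r for r in subject_totals)
    if denominator == 0:
        # Happens only when every subject marks all sites or none of them.
        raise ValueError("Cochran's Q is undefined for this matrix")
    return numerator / denominator, n_sites - 1


# Made-up judgments: 3 subjects (rows) by 4 sites (columns).
toy_matrix = [
    [1, 0, 0, 1],
    [1, 0, 0, 0],
    [1, 0, 1, 1],
]
q, df = cochran_q(toy_matrix)
print(q, df)
```

A larger Q means the site totals depart further from what equal boundary rates across sites would predict.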
    <Paragraph position="2"> We also show why we consider boundaries agreed upon by a majority of subjects to be empirically validated. By partitioning Cochran's statistic, we find the threshold for significance across all subjects and all narratives to be when at least 4 of 7 subjects agree. Using this threshold, we can derive a single discourse segmentation for each narrative. For the excerpt in Figure 2, this gives two empirically validated boundaries, represented as ordered pairs of prosodic phrases: (14.1,14.2) and (15.4,16.1).</Paragraph>
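The derivation of a single segmentation from the 4-of-7 threshold can be sketched as follows. The threshold value comes from the text; the function names, phrase labels, and per-site counts are hypothetical and do not reproduce the data of Figure 2.

```python
def validated_boundaries(phrases, counts, threshold=4):
    """phrases: prosodic phrase labels in order.
    counts[i]: number of subjects who placed a boundary between
    phrases[i] and phrases[i + 1].
    Returns boundaries as ordered pairs of adjacent phrases."""
    if len(counts) != len(phrases) - 1:
        raise ValueError("need exactly one count per adjacent phrase pair")
    return [
        (phrases[i], phrases[i + 1])
        for i, count in enumerate(counts)
        if count >= threshold
    ]


def segments(phrases, counts, threshold=4):
    """Split the phrase sequence at validated boundaries and return
    each segment as a (first_phrase, last_phrase) pair."""
    result = []
    start = 0
    for i, count in enumerate(counts):
        if count >= threshold:
            result.append((phrases[start], phrases[i]))
            start = i + 1
    result.append((phrases[start], phrases[-1]))
    return result


# Made-up phrase labels and boundary counts for illustration only.
toy_phrases = ["1.1", "1.2", "2.1", "2.2", "3.1"]
toy_counts = [0, 5, 1, 7]
print(validated_boundaries(toy_phrases, toy_counts))
print(segments(toy_phrases, toy_counts))
```

Representing boundaries as ordered pairs and segments as first and last phrase mirrors the notation used in the text, such as (14.1,14.2) and [14.2,15.4].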
  </Section>
  <Section position="4" start_page="61" end_page="61" type="metho">
    <SectionTitle>
4 Discourse Segment Intention
</SectionTitle>
    <Paragraph position="0"> The 5 prosodic phrases from 14.2 through 15.4 constitute a segment, which we represent as [14.2,15.4].</Paragraph>
    <Paragraph position="1"> From Figure 2, we see that 5 different subjects identified exactly this segment. Inspection of subjects' descriptions of speaker intention shows that in such cases of agreement on segments, subjects identify essentially the same intentional structure. Subject g, who began the segment with 13.1, characterized the intention as &quot;when the boy falls no one else cares about him&quot;. Subject g thus not only identifies a different segment boundary, but also a different overall purpose.</Paragraph>
    <Paragraph position="2"> Presumably, the 5 subjects depicted in Figure 3 abstract from the fact that the three full clauses in the segment all refer to auditory characteristics, as signaled by the lexical items conversation in the first clause, sounds in the second, and say in the third. Here we have generalized further from the subjects' annotations to note that each asserts something about the movie's auditory character.</Paragraph>
    <Paragraph position="3"> In general, for segments delimited by high agreement boundaries, a single formulation of speaker purpose can be generalized from the data provided by the 7 subjects. We believe that such data provides evidence that, when asked to, subjects perform the same kinds of abstraction across related utterances described in [Polanyi, 1988] and elsewhere.</Paragraph>
  </Section>
  <Section position="5" start_page="61" end_page="62" type="metho">
    <SectionTitle>
5 Discourse Segment Pops
</SectionTitle>
    <Paragraph position="0"> In Figure 2, one of the main characters of the movie (a boy on a bicycle) is referred to in phrases 13.1 and 14.1. The speaker suspends her description of the boy's activities throughout the next segment ([14.2,15.4]), but resumes reference to the boy in the first utterance of the following segment (16.1) using a third person definite pronoun subject (he). This illustrates a specific class of discourse pops, in which a segment resumes an earlier suspended segment. In particular, this class of resumption segments begins with an utterance in which a third person definite pronoun refers to an entity that is not in the focus space [Grosz and Sidner, 1986] associated with the intervening segment. Processing a clause that signals this type of discourse pop involves several tasks. Resolving the pronoun in the initial clause of the resumption segment requires shifting the attentional state [Grosz and Sidner, 1986], since the active focus space (corresponding to the intervening segment) does not contain a representation of the referent. This shift depends on recognizing the termination of the intervening segment, and a continuation relation between the resumption and suspended segments, so that the entities in focus in the suspended segment are again in focus for the resumption segment.</Paragraph>
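One way to picture the attentional shift described above is a stack of focus spaces in the sense of [Grosz and Sidner, 1986]. The sketch below is a simplification and not the authors' procedure: the class name FocusStack, its methods, and the entity sets are hypothetical, and the intended referent of the pronoun is assumed to be known already.

```python
class FocusStack:
    """A toy stack of focus spaces; the last entry is the active space."""

    def __init__(self):
        self.spaces = []  # list of (segment_label, set_of_entities)

    def push(self, label, entities):
        """Open a new focus space for a segment."""
        self.spaces.append((label, set(entities)))

    def resolve(self, referent):
        """Find the nearest focus space containing the referent, pop every
        space above it, and return (label_of_that_space, popped_labels)."""
        for depth in range(len(self.spaces) - 1, -1, -1):
            label, entities = self.spaces[depth]
            if referent in entities:
                popped = [lab for lab, _ in self.spaces[depth + 1:]]
                del self.spaces[depth + 1:]   # terminate intervening segments
                return label, popped
        raise LookupError("referent is not in any open focus space")


# Illustrative entity sets only; they are not coded data from the study.
stack = FocusStack()
stack.push("suspended segment", {"boy", "bicycle"})
stack.push("intervening segment", {"movie", "sounds"})
label, popped = stack.resolve("boy")
print(label, popped)
```

In this toy run the intervening space is popped because it does not contain the referent, which leaves the suspended segment's entities in focus again for the resumption.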
    <Paragraph position="1"> There are 8 discourse pops of the type in Figure 2 in the 10 narratives that we have coded for referential relations (coding described in [Passonneau, 1993]). These 8 examples exhibit various structural and semantic relations to the presumed discourse model. For example, they contain intervening segments that provide more detail, provide general background, or are digressions. In 7 cases, the resumption segment begins with a word that can function as a cue word (footnote 3), but in 4 cases the cue word is and, a word whose discourse usage is hard to distinguish and which provides very (Footnote 3: We assume that different uses of cue phrases can be discriminated; cf. [Hirschberg and Litman, 1993].)</Paragraph>
    <Paragraph position="2">  little semantic information. The cue words in the remaining 3 cases are so, all right and then, none of which clearly signal attentional change [Grosz and Sidner, 1986; Hirschberg and Litman, 1993].</Paragraph>
    <Paragraph position="3"> Non-lexical signals (e.g., uh, tsk, false starts) precede the initial clause of the resumption segment in 5 cases, and relatively long pauses (&gt; 1.5 sec.) in 6 (cf. [Hirschberg and Grosz, 1992]). This suggests that in spontaneous oral discourse, instead of giving explicit indicators of the structural and semantic relations among segments, speakers provide non-lexical and pausal cues to breaks in segmental structure, relying on the hearer to infer the abstract structural and semantic relations.</Paragraph>
    <Paragraph position="4"> For example, the semantics of the predication fall over at the onset of the resumption segment is arguably very dissimilar to the predications occurring in the intervening segment, possibly supporting the inference that the new clause is unrelated to the intervening segment. In contrast, the two clauses which end the suspended segment (at 14.1) and begin the resumption segment (at 16.1) are semantically identical and structurally parallel, supporting the inference that 16.1 resumes the segment containing 14.1.</Paragraph>
  </Section>
</Paper>