<?xml version="1.0" standalone="yes"?>
<Paper uid="H93-1068">
  <Title>PERCEIVED PROSODIC BOUNDARIES AND THEIR PHONETIC CORRELATES</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
1. INTRODUCTION
</SectionTitle>
    <Paragraph position="0"> Any two successive words may vary in their syntactic or semantic cohesiveness. Cohesiveness is likely to be stronger if the two words are part of the same linguistic constituent; conversely, a constituent boundary between two words decreases their degree of cohesiveness. For example, in the sentence "(the man) (is sitting) (in the chair)", any two words separated by a bracket boundary are structurally farther apart than two words within the same pair of brackets. Speakers are capable of making the juncture between constituents audible by prosodic means: they may produce appropriate cues in terms of pause, pitch and duration parameters. Listeners, in turn, can use these cues to segment the incoming flow of speech into word sequences that may be treated as a whole, which facilitates comprehension. In certain cases, prosodic demarcation may help to resolve structural ambiguity, for instance in utterances of the type "The girl saw the man with the telescope", in which the prepositional phrase modifies either the verb or its direct object [1, 2]. But even in utterances that contain no surface syntactic homonymy, prosodic boundaries may delineate coherent word groups and lend support to the listener's hypotheses about syntactic-semantic structure, as in "the beautiful girl / with brown eyes / told her story / to the psychiatrist" [3].</Paragraph>
    <Paragraph position="1"> This paper presents results of research that, starting from the observation that speakers do provide prosodic boundary cues, addresses two main questions: (a) Can listeners assign a value of Perceived Boundary Strength (PBS) to word boundaries? (b) If so, what is the relationship between PBS and different (combinations of) suprasegmental features? The answers to these questions may lead to a better model of the prosodic resources a speaker can draw on to highlight the syntactic-semantic structure of an utterance. Such insight may, in turn, contribute to improved prosody in speech synthesis, by making it sound more natural and, more importantly, by making it linguistically more transparent and therefore easier to comprehend. This research may also shed light on how a listener makes use of the demarcative information encoded in prosodic features. In that respect, it is relevant to (knowledge-based) automatic speech recognition, where the inclusion of prosodic information may support the syntactic-semantic parse of the input, especially if the input contains structural ambiguities.</Paragraph>
    <Paragraph position="2"> This line of research is in agreement with the growing interest in the communicative function of prosody, which may contain information not only about utterance-internal phrasing, as already suggested above [4], but also about the topical organization of discourse in monologues and dialogues [5] or about speaker-dependent features such as emotional state [6].</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="342" type="metho">
    <SectionTitle>
2. EXPERIMENTAL APPROACH
</SectionTitle>
    <Paragraph position="0"> In this section we present part of the results obtained in an experiment that aimed to answer the two questions mentioned in the introduction. To this effect we collected appropriate speech material, for which we asked listeners to score the PBS of each word boundary. Subsequently, the material was subjected to various phonetic analyses, the results of which were then correlated with the PBS values. Finally, the predictions of an algorithm that assigns prosodic structure to unmarked text were verified against the PBS values.</Paragraph>
    <Paragraph position="1"> Figure 1: (A) PBS values and (B) results of the phonetic analyses for one of the test utterances of the professional speaker. [Figure not reproduced; recoverable panel labels: PBS, delexical, text, pitch contour, transcription, declination line, declination reset.]</Paragraph>
    <Section position="1" start_page="341" end_page="341" type="sub_section">
      <SectionTitle>
2.1. Speech Material
</SectionTitle>
      <Paragraph position="0"> A set of twenty Dutch sentences was constructed, which differed sufficiently in length and complexity to warrant the occurrence of prosodic boundaries of varying strengths. These sentences contained a total of 175 word boundaries. The set was read out by three native speakers: two males, one of whom was a professional speaker, and one female. To evaluate the possible influence of syntactic and semantic information, all 20 utterances spoken by the professional speaker and 3 of the utterances spoken by each of the other two speakers were processed in such a way that the content of the utterances was rendered unintelligible, while the prosodic features were kept intact. In this way, a so-called 'delexicalized' version of the test material was created in addition to the 'normal' version.</Paragraph>
    </Section>
    <Section position="2" start_page="341" end_page="341" type="sub_section">
      <SectionTitle>
2.2. PBS Assignment
</SectionTitle>
      <Paragraph position="0"> In a number of successive sessions, nineteen listeners were presented with the 3 x 20 = 60 utterances in the normal version and the (1 x 20) + (2 x 3) = 26 utterances in the delexicalized version. They were asked to indicate, on a 10-point scale, how strong they felt the juncture at each word boundary to be. The mean of the nineteen scores per word boundary was taken as a measure of the perceived boundary strength (PBS) of that word boundary.</Paragraph>
      <Paragraph position="1"> Thus, the PBS was obtained for each word boundary as produced by each of the speakers in each test version.</Paragraph>
      <Paragraph position="2"> An example of the PBS values obtained for one of the test utterances is shown in Figure 1a. As can be seen, listeners appear to be quite capable of distinguishing a range of PBS values, in both the normal and the delexicalized condition.</Paragraph>
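      <Paragraph position="3"> The averaging step described above can be sketched as follows. This is only an illustrative sketch, not the authors' procedure as implemented: the function name, the coding of the 10-point scale as the integers 1 to 10, and the example scores are assumptions introduced here. Only the number of listeners, the utterance counts and the use of the mean come from the text.

```python
N_LISTENERS = 19              # from the text: nineteen listeners
VALID_SCORES = range(1, 11)   # assumption: the 10-point scale is coded 1..10

# Utterance counts as given in the text.
normal_utterances = 3 * 20               # three speakers, twenty sentences each
delexicalized_utterances = 1 * 20 + 2 * 3  # all 20 of one speaker, 3 of each other speaker


def perceived_boundary_strength(scores):
    """Return the PBS of one word boundary: the mean of the listeners' scores."""
    if len(scores) != N_LISTENERS:
        raise ValueError("expected exactly one score per listener")
    if any(s not in VALID_SCORES for s in scores):
        raise ValueError("score outside the assumed 1..10 scale")
    return sum(scores) / len(scores)


# Hypothetical scores for a single word boundary (not data from the experiment).
example_scores = [2] * 10 + [1] * 9
example_pbs = perceived_boundary_strength(example_scores)
```

Applying the function once per word boundary, per speaker and per test version would yield one PBS value for each combination, as the text describes.
      </Paragraph>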
    </Section>
    <Section position="3" start_page="341" end_page="342" type="sub_section">
      <SectionTitle>
2.3. Phonetic Analysis
</SectionTitle>
      <Paragraph position="0"> The acoustic-phonetic analysis of the material concentrated on the speakers' use of pauses and intonation to highlight word boundaries. For each word boundary in the 60 utterances it was determined 1) whether there was a pause and, if so, of what length; 2) whether there was a melodic discontinuity across the boundary and, if so, of what type; and 3) whether the boundary was associated with a declination reset.</Paragraph>
      <Paragraph position="1"> The location and length of pauses were determined by straightforward inspection of the waveforms. Melodic transcriptions of the 60 utterances were obtained by a combination of pitch measurement, pitch stylization and independent perceptual evaluation by experts. Following the typology outlined in 't Hart et al. ([7], p. 81), four types of melodic discontinuity were distinguished in the way the speakers marked the word boundaries: deg10, 1E, 12, IA2'.</Paragraph>
      <Paragraph position="2"> Figure 1b presents a survey of the results of the phonetic analyses for one of the test utterances.</Paragraph>
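      <Paragraph position="3"> The three observations made per word boundary, and their comparison with the PBS, might be organized along the following lines. This is a hedged sketch under stated assumptions: the record layout, the field names, the placeholder melodic-type labels, and the choice of a Pearson coefficient are all introduced here for illustration, since the text says only that the analysis results were correlated with the PBS values. The numbers are invented and are not results from the experiment.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Optional


@dataclass
class BoundaryRecord:
    """One word boundary with its phonetic observations (hypothetical layout)."""
    pause_ms: float               # pause length; 0.0 when no pause was observed
    melodic_type: Optional[str]   # placeholder label, None when no discontinuity
    declination_reset: bool       # whether a declination reset was found
    pbs: float                    # mean listener score for this boundary


def pearson(xs, ys):
    """Pearson correlation coefficient (an assumed choice of statistic)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented illustrative records, not data from the experiment.
records = [
    BoundaryRecord(0.0, None, False, 1.5),
    BoundaryRecord(0.0, "type_a", False, 3.0),
    BoundaryRecord(200.0, "type_b", True, 8.0),
    BoundaryRecord(50.0, None, False, 4.0),
]

pause_vs_pbs = pearson(
    [rec.pause_ms for rec in records],
    [rec.pbs for rec in records],
)
```

The same function could be applied to any other numerically coded feature, for instance the declination reset coded as 0 or 1; categorical features such as the melodic type would need a different kind of comparison.
      </Paragraph>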
    </Section>
  </Section>
</Paper>