File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/90/p90-1003_intro.xml

Size: 6,052 bytes

Last Modified: 2025-10-06 14:04:54

<?xml version="1.0" standalone="yes"?>
<Paper uid="P90-1003">
  <Title>PROSODY, SYNTAX AND PARSING</Title>
  <Section position="3" start_page="0" end_page="17" type="intro">
    <SectionTitle>
2 Corpus
</SectionTitle>
    <Paragraph position="0"> For our corpus of sentences we selected a subset of a corpus developed previously (see Price et al. 1989) for investigating the perceptual role of prosodic information in disambiguating sentences. A set of 35 phonetically ambiguous sentence pairs of differing syntactic structure was recorded by professional FM radio news announcers. By phonetically ambiguous sentences, we mean sentences that consist of the same string of phones, i.e., that suprasegmental rather than segmental information is the basis for the distinction between members of the pairs. Members of the pairs were read in disambiguating contexts on days separated by a period of several weeks to avoid exaggeration of the contrast. In the earlier study listeners viewed the two contexts while hearing one member of the pair, and were asked to select the appropriate context for the sentence. The results showed that listeners can, in general, reliably separate phonetically and syntactically ambiguous sentences on the basis of prosody. The original study investigated seven types of structural ambiguity. The present study used a subset of the sentence pairs which contained prepositional phrase attachment ambiguities, or particle/preposition ambiguities (see Appendix).</Paragraph>
    <Paragraph position="1"> If naive listeners can reliably separate phonetically and structurally ambiguous pairs, what is the basis for this separation? In related work on the perception of prosodic information, trained phoneticians labeled the same sentences with an integer between zero and five inclusive between every two words. These numbers, 'prosodic break indices,' encode the degree of prosodic decoupling of neighboring words: the larger the number, the more of a gap or break between the words. We found that we could label such break indices with good agreement within and across labelers.</Paragraph>
    <Paragraph position="2"> In addition, we found that these indices quite often disambiguated the sentence pairs, as illustrated below.
      * Marge 0 would 1 never 2 deal 0 in 2 any 0 guys
      * Marge 1 would 0 never 0 deal 3 in 0 any 0 guise
    The break indices between 'deal' and 'in' provide a clear indication in this case whether the verb is 'deal-in' or just 'deal.' The larger of the two indices, 3, indicates that in that sentence, 'in' is not tightly coupled with 'deal' and hence is not likely to be a particle.</Paragraph>
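    <Paragraph> The comparison in the example above can be sketched in a few lines of Python. This is only an illustration of reading the labels, not a procedure described in the paper; the helper names parse_labeled and most_divergent_boundary are hypothetical.

```python
def parse_labeled(text):
    """Split 'word index word index ... word' into words and break indices."""
    tokens = text.split()
    words = tokens[0::2]                      # every other token is a word
    breaks = [int(t) for t in tokens[1::2]]   # indices sit between the words
    return words, breaks


def most_divergent_boundary(breaks_a, breaks_b):
    """Return the position of the boundary where the two labelings differ most."""
    diffs = [abs(x - y) for x, y in zip(breaks_a, breaks_b)]
    return max(range(len(diffs)), key=diffs.__getitem__)


a_words, a_breaks = parse_labeled("Marge 0 would 1 never 2 deal 0 in 2 any 0 guys")
b_words, b_breaks = parse_labeled("Marge 1 would 0 never 0 deal 3 in 0 any 0 guise")

k = most_divergent_boundary(a_breaks, b_breaks)
# Boundary k lies between word k and word k + 1.
print(a_words[k], a_breaks[k], b_breaks[k], a_words[k + 1])
```

    For this pair the largest difference falls at the boundary between 'deal' and 'in', the boundary singled out in the discussion above.</Paragraph>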
    <Paragraph position="3"> So far we had established that naive listeners and trained listeners appear to be able to separate such ambiguous sentence pairs on the basis of prosodic information. If we could extract such information automatically, perhaps we could make it available to a parser. We found a clue in an effort to assess the phonetic ambiguity of the sentence pairs. We used SRI's DECIPHER speech recognition system (Weintraub et al. 1989), constrained to recognize the correct string of words, to automatically label and time-align the sentences used in the earlier referenced study. The DECIPHER system is particularly well suited to this task because it can model and use very bushy pronunciation networks, accounting for much more detail in pronunciation than other systems. This extra detail makes it better able to time-align the sentences and is a stricter test of phonetic ambiguity. With these labels and alignments we verified that the sentences were, by this measure as well as by the earlier perceptual verification, truly ambiguous phonetically. This meant that the information separating the members of the pairs was not in the segmental information, but in the suprasegmental information: duration, pitch and pausing. As a byproduct of the labeling and time alignment, we noticed that the durations of the phones could be used to separate members of the pairs. This was easy to see in phonetically ambiguous sentence pairs: normally the structure of duration patterns is obscured by intrinsic duration of phones and the contextual effects of neighboring phones. In the phonetically ambiguous pairs, there was no need to account for these effects in order to see the striking pattern in duration differences. If a human looking at the duration patterns could reliably separate the members of the pairs, there was hope for creating an algorithm to perform the task automatically.
This task could not take advantage of such pairs, but would have to face the problem of intrinsic phone duration.</Paragraph>
    <Paragraph position="4"> Word break indices were generated automatically by normalizing phone duration according to estimated mean and variance, and combining the average normalized duration factors of the final syllable coda consonants with a pause factor. Let $\tilde{d}_i = (d_i - \mu_j)/\sigma_j$ be the normalized duration of the $i$th phoneme in the coda, where $\mu_j$ and $\sigma_j$ are the mean and standard deviation of duration for phone $j$. $d_p$ is the duration (in ms) of the pause following the word, if any. A set of word break indices are computed for all the words in a sentence as follows:</Paragraph>
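    <Paragraph> The display equation that belongs here is missing from this copy of the text. The following reconstruction is an assumption, pieced together from the verbal description (the average of the normalized coda durations combined with a pause factor) and from the term d_p/70, the set A, and the index n mentioned in the surrounding text; the original typesetting may differ.

```latex
n = \frac{1}{|\mathcal{A}|} \sum_{i \in \mathcal{A}} \tilde{d}_i + \frac{d_p}{70},
\qquad
\tilde{d}_i = \frac{d_i - \mu_j}{\sigma_j}
```

    Here $\mathcal{A}$ is the set of phones in the final-syllable coda of the word, and $d_p$ is taken to be zero when no pause follows the word.</Paragraph>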
    <Paragraph position="6"> The term $d_p/70$ was actually hard-limited at 4, so as not to give pauses too much weight. The set $\mathcal{A}$ includes all coda consonants, but not the vowel nucleus unless the syllable ends in a vowel. Although the vowel nucleus provides some boundary cues, the lengthening associated with prominence can be confounded with boundary lengthening and the algorithm was slightly more reliable without using vowel nucleus information. These indices $n$ are normalized over the sentence, assuming known sentence boundaries, to range from zero to five (the scale used for the initial perceptual labeling). The correlation coefficient between the hand-labeled break indices and the automatically generated break indices was very good: 0.85.</Paragraph>
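    <Paragraph> A minimal Python sketch of the computation described above may help make the steps concrete. It is not the authors' implementation. The phone statistics in PHONE_STATS are invented for illustration, the function names are hypothetical, and the sentence-level normalization is assumed to be a linear min-max rescaling, which the text does not specify.

```python
from statistics import mean

# Hypothetical duration statistics in ms: phone -> (mean, standard deviation).
PHONE_STATS = {"l": (60.0, 20.0), "n": (55.0, 18.0), "z": (80.0, 25.0)}


def raw_break_index(coda, pause_ms, stats=PHONE_STATS):
    """Average normalized coda duration plus a pause term hard-limited at 4.

    coda is a list of (phone, duration_ms) pairs for the set A of the word.
    """
    if not coda:
        raise ValueError("the coda set A must contain at least one phone")
    normalized = [(dur - stats[ph][0]) / stats[ph][1] for ph, dur in coda]
    pause_term = min(pause_ms / 70.0, 4.0)
    return mean(normalized) + pause_term


def rescale_to_0_5(raw):
    """Assumed normalization: map the indices of one sentence linearly onto 0..5."""
    lo, hi = min(raw), max(raw)
    if hi == lo:
        return [0.0 for _ in raw]
    return [5.0 * (r - lo) / (hi - lo) for r in raw]


def pearson(xs, ys):
    """Correlation coefficient between two equal-length lists of indices."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Invented measurements for three word boundaries in one sentence.
sentence = [
    ([("l", 50.0)], 0.0),
    ([("n", 73.0), ("z", 105.0)], 35.0),
    ([("l", 100.0)], 210.0),
]
raw = [raw_break_index(coda, pause) for coda, pause in sentence]
print(rescale_to_0_5(raw))
```

    The pearson helper corresponds to the kind of comparison reported above between hand-labeled and automatically generated indices; reproducing the reported value would require the original data.</Paragraph>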
  </Section>
</Paper>