<?xml version="1.0" standalone="yes"?>
<Paper uid="P90-1003">
  <Title>PROSODY, SYNTAX AND PARSING</Title>
  <Section position="4" start_page="17" end_page="18" type="metho">
    <SectionTitle>
3 Incorporating Prosody Into A Grammar
</SectionTitle>
    <Paragraph position="0"> Thus far, we have shown that naive and trained listeners can rely on suprasegmental information to separate ambiguous sentences, and that we can automatically extract information that correlates well with the perceptual labels. It remains to be shown how such information can be used by a parser.</Paragraph>
    <Paragraph position="1"> To do so, we modified an existing and in fact reasonably large grammar. The parser we use is the Core Language Engine developed at SRI in Cambridge (Alshawi et al. 1988).</Paragraph>
    <Paragraph position="2"> Much of the modification of the grammar is done automatically. The first step is to systematically change all rules of the form A → B C into the form A → B Link C, where Link is a new grammatical category, that of the prosodic break indices. Similarly, all rules with more than two right-hand-side elements need to have link nodes interleaved at every juncture: e.g., a rule A → B C D is changed into A → B Link1 C Link2 D.</Paragraph>
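The rewriting step just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: rules are represented as hypothetical (left-hand side, right-hand side list) pairs, and the function name interleave_links is ours, not part of the Core Language Engine.

```python
from typing import List, Tuple

# Hypothetical rule representation: (left-hand side, right-hand side symbols).
Rule = Tuple[str, List[str]]


def interleave_links(rule: Rule) -> Rule:
    """Insert a Link category at every juncture of the right-hand side."""
    lhs, rhs = rule
    if len(rhs) in (0, 1):
        # No juncture, so nothing to insert.
        return lhs, list(rhs)
    # Number the links only when there is more than one juncture,
    # mirroring "B Link C" versus "B Link1 C Link2 D" in the text.
    numbered = len(rhs) > 2
    new_rhs = [rhs[0]]
    for i, symbol in enumerate(rhs[1:], start=1):
        new_rhs.append(f"Link{i}" if numbered else "Link")
        new_rhs.append(symbol)
    return lhs, new_rhs


print(interleave_links(("A", ["B", "C"])))       # ('A', ['B', 'Link', 'C'])
print(interleave_links(("A", ["B", "C", "D"])))  # ('A', ['B', 'Link1', 'C', 'Link2', 'D'])
```

Unary and empty rules have no juncture and are returned unchanged in this sketch.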
    <Paragraph position="3"> Next, allowance must be made for empty nodes. It is common practice to have rules of the form NP → ε and PP → ε in order to handle wh-movement and relative clauses. These rules necessitate the incorporation into the modified grammar of a rule Link → ε. Otherwise, a sentence such as a wh-question will not parse, because an empty node introduced by the grammar will either not be preceded by a link or not be followed by one.</Paragraph>
    <Paragraph position="4"> The introduction of empty links needs to be constrained so as not to introduce spurious parses. If the only place the empty NP, PP, etc. could fit into the sentence is at the end, then the only place the empty Link can go is right before it, so no extra ambiguity is introduced. However, if an empty wh-phrase could be posited somewhere other than the end of the sentence, then there is ambiguity as to whether it is preceded or followed by the empty link. For instance, for the sentence &quot;What did you see _ on Saturday?&quot; the parser would find two possibilities, one in which the empty link precedes the empty NP and one in which it follows it. Hence the grammar must be made to automatically rule out half of these possibilities. This can be done by constraining every empty link to be followed immediately by an empty wh-phrase, or by a constituent containing an empty wh-phrase on its left branch. It is fairly straightforward to incorporate this into the routine that automatically modifies the grammar. The rule that introduces empty links gives them a feature-value pair: empty_link=y.</Paragraph>
    <Paragraph position="5"> The rules that introduce other empty constituents are modified to add to the constituent the feature-value pair: trace_on_left_branch=y. The links zero through five are given the feature-value pair empty_link=n.</Paragraph>
    <Paragraph position="6"> The default value for trace_on_left_branch is set to n so that all words in the lexicon have that value.</Paragraph>
    <Paragraph position="7"> Rules of the form A_0 → A_1 Link_1 ... A_n are modified to ensure that A_0 and A_1 have the same value for the feature trace_on_left_branch.</Paragraph>
    <Paragraph position="9"/>
    <Paragraph position="11"> [Table 1: number of parses and parse times, with and without the use of prosodic information.]</Paragraph>
    <Paragraph position="12"> Additionally, if Link_i has empty_link=y then A_{i+1} must have trace_on_left_branch=y. These modifications, incorporated into the grammar-modifying routine, suffice to eliminate the spurious ambiguity.</Paragraph>
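The two feature constraints can be illustrated with a small check over a single rule application. This is a minimal sketch, assuming constituents and links are represented as plain feature dictionaries; the function name rule_instance_ok and this representation are hypothetical and are not how the Core Language Engine encodes features.

```python
from typing import Dict, List

# Hypothetical representation: feature name -> "y" or "n".
Features = Dict[str, str]


def rule_instance_ok(parent: Features, children: List[Features]) -> bool:
    """Check one application of A_0 -> A_1 Link_1 ... A_n.

    `children` alternates constituents and links: A_1, Link_1, A_2, ..., A_n.
    """
    constituents = children[0::2]
    links = children[1::2]
    # A_0 and A_1 must agree on trace_on_left_branch.
    if parent["trace_on_left_branch"] != constituents[0]["trace_on_left_branch"]:
        return False
    # An empty Link_i must be followed by an A_{i+1} that has a trace
    # on its left branch.
    for link, following in zip(links, constituents[1:]):
        if link["empty_link"] == "y" and following["trace_on_left_branch"] != "y":
            return False
    return True


word = {"trace_on_left_branch": "n"}    # default for lexical items
trace = {"trace_on_left_branch": "y"}   # empty NP, PP, etc.
empty_link = {"empty_link": "y"}
break_link = {"empty_link": "n"}        # any of the links zero through five

# Empty link immediately followed by a trace: allowed.
print(rule_instance_ok(word, [word, empty_link, trace]))   # True
# Empty link followed by an ordinary constituent: ruled out.
print(rule_instance_ok(trace, [trace, empty_link, word]))  # False
```

Under these assumptions, only the analysis in which the empty link immediately precedes the empty constituent survives, which is the effect the text describes.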
  </Section>
  <Section position="5" start_page="18" end_page="19" type="metho">
    <SectionTitle>
4 Setting Grammar Parameters
</SectionTitle>
    <Paragraph position="0"> Running the grammar through our procedure to make the changes described above results in a grammar that gets the same number of parses for a sentence with links as the old grammar would have produced for the corresponding sentence without links. To make use of the prosodic information, we still need to make one more important change to the grammar: we must specify how the grammar uses this information. This is a vast area of research. The present study shows the feasibility of one particular approach. In this initial endeavor, we made the most conservative changes imaginable after examining the break indices on a set of sentences. We changed the rule N → N Link PP so that the value of the link must be between 0 and 2 inclusive (on a scale of 0-5) for the rule to apply. We made essentially the same change to the rule for the verb-plus-particle construction, VP → V Link PP, except that the value of the link must, in this case, be either 0 or 1.</Paragraph>
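The two parameter settings can be written down as a small lookup table. The following is a sketch only: the table MAX_BREAK and the function rule_applies are hypothetical names, and treating every other rule as unconstrained is our reading of "the most conservative changes", not a detail confirmed beyond the two rules named in the text.

```python
# Hypothetical table of the two thresholds described in the text;
# break indices range from 0 to 5.
MAX_BREAK = {
    ("N", ("N", "Link", "PP")): 2,   # noun followed by a PP
    ("VP", ("V", "Link", "PP")): 1,  # verb plus particle
}


def rule_applies(lhs, rhs, break_index):
    """Return True if the rule may apply across a link with this break index."""
    if break_index not in range(6):
        raise ValueError("break index must be an integer from 0 to 5")
    # Assumption: rules without an entry accept any break index.
    limit = MAX_BREAK.get((lhs, tuple(rhs)), 5)
    return limit >= break_index


print(rule_applies("N", ["N", "Link", "PP"], 2))   # True
print(rule_applies("N", ["N", "Link", "PP"], 3))   # False
print(rule_applies("VP", ["V", "Link", "PP"], 2))  # False
```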
    <Paragraph position="1"> After setting these two parameters, we parsed each of the sentences in our corpus of 14 sentences and compared the number of parses to the number obtained without the benefit of prosodic information. For half of the sentences, i.e., for one member of each of the sentence pairs, the number of parses remained the same. For the other members of the pairs, the number of parses was reduced, in many cases from two parses to one.</Paragraph>
    <Paragraph position="2"> The actual sentences and labels are in the appendix. The incorporation of prosody resulted in a reduction of about 25% in the number of parses found, as shown in Table 1. Parse times increased by about 37%.</Paragraph>
    <Paragraph position="3"> In the study by Price et al., the sentences with more major breaks were more reliably identified by the listeners. This is exactly what happens when we put these sentences through our parser too. The large prosodic gap between a noun and a following preposition, or between a verb and a following preposition provides exactly the type of information that our grammar can easily make use of to rule out some readings. Conversely, a small prosodic gap does not provide a reliable way to tell which two constituents combine. This coincides with Steedman's (1989) observation that syntactic units do not tend to bridge major prosodic breaks.</Paragraph>
    <Paragraph position="4"> We can construe the large break between two words, for example a verb and a preposition/particle, as indicating that the two do not combine to form a new slightly larger constituent in which they are sisters of each other. We cannot say that no two constituents may combine when they are separated by a large gap, only that the two smallest possible constituents, i.e., the two words, may not combine.</Paragraph>
    <Paragraph position="5"> To do the converse with small gaps and larger phrases simply does not work. There are cases where there is a small gap between two phrases that are joined together. For example there can be a small gap between the subject NP of a sentence and the main VP, yet we do not want to say that the two words on either side of the juncture must form a constituent, e.g., the head noun and auxiliary verb.</Paragraph>
    <Paragraph position="6"> The fact that parse times increase is due to the way in which prosodic information is incorporated into the text. The parser does a certain amount of work for each word, and the effect of adding break indices to the sentence is essentially to double the number of words that the parser must process. We expect that this overhead will constitute a less significant percentage of the parse time as the input sentences become more complex. We also hope to be able to reduce this overhead with a better understanding of the use of prosodic information and how it interacts with the parsing of spoken language.</Paragraph>
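The source of the overhead can be seen by building the parser input explicitly. This sketch assumes one break index between each pair of adjacent words; the function name interleave_breaks and the break values in the example are hypothetical and are not taken from our labeled corpus.

```python
def interleave_breaks(words, breaks):
    """Merge words with the break index that follows each word but the last."""
    if len(breaks) != len(words) - 1:
        raise ValueError("expected one break index between each pair of words")
    tokens = []
    for word, brk in zip(words, breaks):
        tokens.extend([word, brk])
    tokens.append(words[-1])
    return tokens


words = ["What", "did", "you", "see", "on", "Saturday"]
breaks = [1, 0, 0, 2, 1]  # hypothetical break indices, for illustration only
tokens = interleave_breaks(words, breaks)
print(tokens)
print(len(words), len(tokens))  # 6 11
```

A sentence of n words becomes 2n - 1 input items, which is why the per-word work of the parser roughly doubles.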
  </Section>
  <Section position="6" start_page="19" end_page="19" type="metho">
    <SectionTitle>
5 Corroboration From Other Data
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="19" end_page="19" type="sub_section">
      <Paragraph position="0"> After devising our strategy, changing the grammar and lexicon, running our corpus through the parser, and tabulating our results, we looked at some new data that we had not considered before, to get an idea of how well our methods would carry over. The new corpus is from a recording of a short radio news broadcast. This time the break indices were put into the transcript by hand. There were twenty-two places in the text where our attachment strategy would apply. In eighteen of those, our strategy, or a very slight modification of it, would work properly, ruling out some incorrect parses without preventing the correct parse from being found. In the remaining four cases, there seem to be other factors at work that we hope to be able to incorporate into our system in the future. For instance, it has been mentioned in other work that the length of a prosodic phrase, as measured by the number of words or syllables it contains, may affect the location of prosodic boundaries.</Paragraph>
      <Paragraph position="1"> We are encouraged by the fact that our strategy seems to work well in eighteen out of twenty-two cases on the news broadcast corpus.</Paragraph>
    </Section>
  </Section>
</Paper>