<?xml version="1.0" standalone="yes"?>
<Paper uid="W96-0204">
  <Title>Modeling Conversational Speech for Speech Recognition</Title>
  <Section position="3" start_page="34" end_page="37" type="intro">
    <SectionTitle>
2 Annotations of Switchboard
</SectionTitle>
    <Paragraph position="0"> There were three major kinds of annotations done as part of the dysfluency annotation of Switchboard: sentence boundaries, restarts, and non-sentence elements. The assumption underlying the dysfluency annotation was that, once it was complete, sentences could be separated, restarts &quot;folded&quot;, and non-sentence elements removed, and the result would be a reasonably grammatical sentence (that is, grammatical for conversational speech, not necessarily compliant with your third grade English teacher). Some sentences may still not be &quot;complete&quot;: they may be replies or acknowledgments, or they may be interrupted by either the speaker herself or the other conversant and never completed.</Paragraph>
    <Paragraph position="1"> As mentioned earlier, much of the choice of what to annotate and the details of the notation are based on the work of Shriberg (1994). The main difference is that our work is not as detailed as Shriberg's, since we were not planning as fine-grained an analysis, and it covers significantly more data (Shriberg annotated 40,000 words, whereas this effort annotated 1.4 million words). In the next three sections, we describe these types of annotations and provide some examples.</Paragraph>
    <Section position="1" start_page="34" end_page="35" type="sub_section">
      <SectionTitle>
2.1 Sentences
</SectionTitle>
      <Paragraph position="0"> In written text, the definition of a sentence is clear and marked in the text itself by capitalization and punctuation.</Paragraph>
      <Paragraph position="1"> For conversational speech, the most natural division would appear to be the turn, when one speaker stops speaking and another starts. However, when we look at the data, we see that participants often interrupt and talk over one another, so even separating turns is not so simple. Within a turn a participant may ramble on and on, making the utterance too long for a speech recognizer to handle.</Paragraph>
      <Paragraph position="2"> In annotating Switchboard, we chose to divide turns into &quot;sentences&quot;, each consisting of a single independent clause. When two independent clauses are connected by a conjunction, they are divided, with the conjunction marked as described in §2.3.4.</Paragraph>
      <Paragraph position="3"> Sentence units are followed by &quot;/&quot;, indicating a sentence boundary, as shown in Example 1. A sentence is considered to begin either at the beginning of a turn or after the completion of a preceding sentence. Any dysfluencies between the end of the previous sentence and the beginning of the current one are considered part of the current sentence.</Paragraph>
      <Paragraph position="4"> In Example 2, there are essentially two sentences. The first, by speaker A, spans two turns: &quot;we did get the wedge cut out by building some kind of a cradle for it&quot;. The other, by speaker B, is &quot;A cradle for it&quot;. &quot;You know&quot; at the end of a sentence (Example 3) is considered part of the current sentence, as described in more detail in §2.3.1.</Paragraph>
      <Paragraph position="5"> Ex 1: A: You interested in woodworking? /
      Ex 2: A: we did get the wedge cut out by building some kind of --
            B: A cradle for it. /
            A: -- a cradle for it. /
      Ex 3: B: I painted, about eight different, colors, you know. / the crayons that are sticking up, it will be the headboard -- /
      Each sequence of words consisting of only continuers or assessments (expressions such as &quot;uh-huh&quot;, &quot;right&quot;, &quot;yeah&quot;, &quot;oh really&quot;) is also coded as a sentence, as in Examples 4 and 5.</Paragraph>
      <Paragraph position="7"> Sentences that do not end normally are treated as incomplete sentences and are marked with &quot;-/&quot;. In some cases the speaker stops a sentence and starts over (in contrast with restarts, where just a few words are repeated, as described in §2.2). In other cases, the other participant in the conversation interrupts the speaker and the speaker never finishes the sentence (in contrast with cases such as Example 2 above, where the first speaker finishes the sentence in the next turn, after or during the interruption).</Paragraph>
      <Paragraph position="8"> Ex 1: B: what I've seen of this kind before is you have the, -/ if you're looking at adding on you have, -/
      Ex 2: A: Perhaps things that we didn't think of before and just concentrated on the lawmaking or the results that would be seen in public works or bills that are passed or, et cetera like that -- -/
      Ex 3: B: -- it was very unfortunate thing that occurred there / it's, -/
            A: Where do you live? /
            B: we live in Utah. /</Paragraph>
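      The boundary markers described in this section lend themselves to mechanical splitting. As a minimal sketch (the function name split_sentences and the returned pair layout are assumptions made here for illustration, not part of the original annotation tooling), a turn can be cut at each "/" or "-/" marker and every unit flagged as complete or incomplete:

```python
import re

# A sentence unit is the text up to the next boundary marker:
# "-/" (incomplete sentence) or "/" (complete sentence).
UNIT = re.compile(r"(.*?)(-/|/)")


def split_sentences(turn: str) -> list:
    """Return (text, is_complete) pairs for one annotated turn.

    Hypothetical helper: text after the last marker (a sentence that
    continues in a later turn) is ignored in this sketch.
    """
    units = []
    for match in UNIT.finditer(turn):
        text = match.group(1).strip()
        if text:
            units.append((text, match.group(2) == "/"))
    return units


turn = "it was very unfortunate thing that occurred there / it's, -/"
for text, complete in split_sentences(turn):
    print("complete  " if complete else "incomplete", "|", text)
```

      Sentences that span turns, such as Example 2 in the first set of examples, would need the turns of one speaker to be joined before splitting; that step is left out here.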
    </Section>
    <Section position="2" start_page="35" end_page="35" type="sub_section">
      <SectionTitle>
2.2 Restarts
</SectionTitle>
      <Paragraph position="0"> Restarts are considered to have the following form in Shriberg's work and elsewhere. The initial part is the reparandum (RM), which is the part that the speaker is going to repair. The interruption point (IP) marks the end of the reparandum and is followed by an optional interregnum (IM), which includes editing phrases, such as a filled pause or editing terms. Finally, the repair (RR) is what the speaker intends to replace the RM with.</Paragraph>
      <Paragraph position="1"> Show me flights from Boston on uh from Denver on Monday</Paragraph>
      <Paragraph position="3"> In order to simplify the notation, the restart notation we developed marks only the boundaries of the entire restart (RM to RR) with square brackets and the interruption point with a &quot;+&quot;. Partial words are not marked specially either (though in the transcripts they end in a &quot;-&quot;); they appear directly to the left of the interruption point. In contrast with Shriberg's work, no internal structure of the restart is included (e.g. which words are repeated, substituted or deleted).</Paragraph>
      <Paragraph position="4"> Show me flights [from Boston on + {F uh } from Denver on ] Monday
      A restart is &quot;repaired&quot; by deleting the material between the open bracket and the interruption point (+). (Note that fillers such as &quot;uh&quot; in the above example are deleted as a separate process in cleaning the text; we discuss them in §2.3.) Some examples of restarts and repairs are given below. Note that in Example 1 it is not always clear how much should appear in the repair: &quot;in the book&quot; could also have been included. However, to reduce the variation in annotation, annotators were instructed to keep the repair as short as possible. In Example 2, a restart has been marked across a turn, with the RM and IM in one turn and the RR in the next turn.</Paragraph>
      <Paragraph position="5"> Ex 1: A: [ it, + the instructions ] in the book I had said use a coping saw but there's no coping saw big enough [ to, + for ] a fourteen inch wide watermelon /
      A: -- pine? /
      B: It's, ] plywood face I guess.</Paragraph>
      <Paragraph position="6"> In the second restart in Example 3, it is not clear what the RR for the RM &quot;and&quot; is. In cases where there does not appear to be a suitable replacement for the restart, annotators were instructed to place the &quot;]&quot; as close to the IP as possible. One rule of thumb for marking restarts without repairs is that they always occur at the beginning or in the middle of the sentence, the sentence continues after the restart, and the restart usually comprises one to three function words.</Paragraph>
      <Paragraph position="7"> Ex 3: B: [ I got, + uh it got ] delayed for a little bit [ and, + ] because of work /
      Multiple restarts are handled as embedded and are repaired from left to right. Some examples of complex restarts are shown below. In Example 2, the left-to-right annotation has not been strictly observed, since &quot;[ Ber-, + Bermuda ]&quot; appears as a restart with repair within the repair of another restart.</Paragraph>
      <Paragraph position="8"> A: Dress shorts. /
      B: they're like black corduroy [ Ber-, + Bermuda ]] shorts.</Paragraph>
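      The repair rule described in this section (delete the material between the open bracket and the interruption point) can be sketched in a few lines. The function name repair_restarts is hypothetical, and two choices are assumptions of this sketch rather than anything stated in the text: the brackets and the "+" are dropped from the output, and a bracketed token with no "+" (for example "[laughter]") is left untouched. A stack handles embedded restarts:

```python
def repair_restarts(annotated: str) -> str:
    """Fold restarts: drop the text between '[' and '+', keep the repair."""
    # stack[0] collects top-level text; each later frame is one open bracket.
    stack = [{"chars": [], "has_ip": False}]
    for ch in annotated:
        top = stack[-1]
        if ch == "[":
            stack.append({"chars": [], "has_ip": False})
        elif ch == "+" and len(stack) != 1:
            # Interruption point: everything so far in this bracket is the
            # reparandum, so discard it and start collecting the repair.
            top["chars"] = []
            top["has_ip"] = True
        elif ch == "]" and len(stack) != 1:
            frame = stack.pop()
            text = "".join(frame["chars"])
            if not frame["has_ip"]:
                # No '+' seen: not a restart (e.g. "[laughter]"), keep as is.
                text = "[" + text + "]"
            stack[-1]["chars"].append(text)
        else:
            top["chars"].append(ch)
    # Flush brackets that were never closed so truncated input is not lost.
    while len(stack) != 1:
        frame = stack.pop()
        stack[-1]["chars"].append("[" + "".join(frame["chars"]))
    # Collapse the runs of spaces left behind by the deletions.
    return " ".join("".join(stack[0]["chars"]).split())


print(repair_restarts(
    "Show me flights [from Boston on + {F uh } from Denver on ] Monday"
))
```

      Fillers such as {F uh } survive this step on purpose, since the text treats their removal as a separate cleaning pass.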
    </Section>
    <Section position="3" start_page="35" end_page="37" type="sub_section">
      <SectionTitle>
2.3 Non-Sentence Elements
</SectionTitle>
      <Paragraph position="0"> Non-sentence elements are words or phrases which are inserted in an utterance, disrupting the flow of the sentence.</Paragraph>
      <Paragraph position="1"> They are simple units with no internal structure and no interruption point. There are five types of non-sentence elements: filled pause {F }, editing term {E }, discourse marker {P }, conjunction {C }, and aside {A }.</Paragraph>
      <Paragraph position="3"> Filled pauses have unrestricted distribution and no semantic content. A few examples of fillers are &quot;uh&quot;, &quot;um&quot;, and &quot;huh&quot;. There are also rarer filled pauses, such as &quot;eh&quot; and &quot;oops&quot;. &quot;Oh&quot; is treated as a filled pause if it appears along with other words, for example &quot;oh yeah&quot; or &quot;oh really&quot;, as in Example 1. If it appears by itself as a reply, as in Example 3, &quot;oh&quot; is treated as a regular word.</Paragraph>
      <Paragraph position="4"> Ex 1: B: {F Oh }, yeah. / Uh-huh. /
      Ex 2: B: Actually, [ I, + {F uh, } I ] guess I am/[laughter]. {F um }, it just seems kind of funny that this is a topic of discussion. / {F uh }, I do, {F uh }, some, {F uh }, woodworking myself/[noise]. {F uh }, in fact, I'm in the middle of a project right now making a bed for my son. /
      Ex 3: B: Oh/
      Editing terms are usually restricted to occur between the restart and the repair and have some semantic content (e.g. &quot;I mean&quot;, &quot;sorry&quot;, &quot;excuse me&quot;), as shown in Example 1, though it is possible for editing terms to occur outside the RR.</Paragraph>
      <Paragraph position="5"> Ex 1: A: {F Oh, } yeah, / {F uh, } the whole thing was small and, [you, + {E I mean, } you] actually put it on [laughter], !
      Discourse markers have a wider distribution than explicit editing phrases but are unlike filled pauses in that they are lexical items (e.g. &quot;well&quot;, &quot;you know&quot;, &quot;like&quot;). &quot;You know&quot; is the most frequent discourse marker and is used very frequently by some speakers, as shown in Example 3. Other terms, such as &quot;so&quot; and &quot;actually&quot;, can also serve as discourse markers, as in Example 2; however, &quot;so&quot; can also be a coordinating or a subordinating conjunction, as discussed in §2.3.4. In Example 4, the discourse marker is within the RR of a restart.</Paragraph>
      <Paragraph position="6"> Ex 1: B: {P Well }, we have a cat who's also about four years old. /
      Ex 2: A: he comes back. / {P So } [ he, + he's ] pretty good about taking to commands /
      Ex 3: B: Yeah, / with, {P you know, } me being at home and just having the one income, {P you know, } you don't have, this lot of extra money [ to, + to ] do a lot of, {P you know, } extra things. /
      Ex 4: B: [ We take, + {P you know, } whenever we take ] them to Showbiz or -/ they think it's wonderful just to go to McDonalds, /
      Coordinating conjunctions occur at the inter-sentential level and generally include &quot;and&quot;, &quot;but&quot; and &quot;because&quot;. In some cases two words together constitute a conjunction, for example &quot;and then&quot;, as in Example 2. Most of the conjunctions that appear between two full clauses are marked as coordinating conjunctions. The rule of thumb is &quot;split sentences whenever possible&quot;, except when the two sentences, if split, would be grammatically incorrect (for example, when the second sentence in the split would lack a subject because the subject is in the earlier sentence).
      Ex 1: A: Yeah, / {C and } we got him when he was about eight weeks old / {C and } he's pretty okay, /
      Ex 2: B: {C and then } I painted, {F uh }, about eight different, {F uh }, colors, /
      Example 3 shows &quot;so&quot; as a coordinating conjunction. Note that in Example 4, the second &quot;and&quot; is NOT treated as a coordinating conjunction, as the two sentences it conjoins (&quot;I call him&quot; and &quot;he comes back&quot;) are both short and both appear to be modified by the initial &quot;if&quot; clause.
      Ex 3: B: {P Well }, {F uh }, we just moved recently [laughter] / {C so } now we're in the, {F uh }, Dallas area /
      Ex 4: A: he's pretty good. / He stays out of the street / {C and, } {F uh }, if I catch him I call him and he comes back. / {P So } [ he, + he's ] pretty good about taking to commands /
      This is a category for &quot;asides&quot; that interrupt the flow of the sentence. Interjections are rare and are considered only when the corresponding sequence of words interrupts the fluent flow of the sentence AND the sentence later picks up from where it left off. The examples below illustrate this.
      Ex 1: B: I, {F uh }, talked about how a lot of the problems they have to [ come, + overcome ] [ to, + {F uh, } {A it's a very complex, {F uh, } situation } to ] go into space. /
      Ex 2: A: {P So } we built a cradle for it / {C and } [ we got th, + {A once it was turned, } we got ] [ one s-, + one ] cutout on the table saw, on the radial saw, /</Paragraph>
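      Removing the non-sentence elements described in this section is largely a matter of matching braces, including the nested case where a filled pause sits inside an aside. The following is a minimal sketch under stated assumptions: the name strip_elements and its drop parameter are hypothetical, elements whose type letter is not in drop are unwrapped rather than deleted, and leftover punctuation (such as a comma that followed a deleted filler) is not cleaned up.

```python
import re

# Tokens: an opening brace with its one-letter type, a closing brace,
# or a run of ordinary text.
TOKEN = re.compile(r"\{\s*([A-Z])|\}|[^{}]+")


def strip_elements(annotated: str, drop=frozenset("FEPCA")) -> str:
    """Delete {X ...} elements whose type letter X is in `drop`."""
    frames = [[]]  # frames[0] holds top-level text, one list per open brace
    kinds = []     # type letter of each currently open element
    for match in TOKEN.finditer(annotated):
        tok = match.group(0)
        if tok.startswith("{"):
            kinds.append(match.group(1))
            frames.append([])
        elif tok == "}" and kinds:
            kind = kinds.pop()
            inner = "".join(frames.pop())
            if kind not in drop:
                # Keep the words but lose the braces and the type letter.
                frames[-1].append(inner)
        elif tok != "}":
            frames[-1].append(tok)
        # A stray "}" with nothing open is silently dropped.
    # Text of any element left unclosed (truncated input) is kept.
    return " ".join("".join("".join(f) for f in frames).split())


print(strip_elements("{P So } we built a cradle for it /"))
print(strip_elements("{C and } he's pretty okay, /", drop=frozenset("F")))
```

      Passing a smaller drop set, as in the second call, keeps conjunctions in place while still removing filled pauses, which may be useful when the sentence units are to be rejoined later.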
    </Section>
  </Section>
</Paper>