<?xml version="1.0" standalone="yes"?>
<Paper uid="C88-2118">
  <Title>Parsing Noisy Sentences</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1. Introduction
</SectionTitle>
    <Paragraph position="0"> There have been a few attempts to integrate a speech recognition device with a natural language understanding system. Hayes et al. /Hayes86/ adopted the technique of caseframe instantiation to parse a continuously spoken English sentence in the form of a word lattice (a set of word candidates hypothesized by a speech recognition module) and produce a frame representation of the utterance. Poesio and Rullent /Poesio 1987/ suggested a modified implementation of caseframe parsing to parse a word lattice in Italian. Lee et al. /Lee 1987/ developed a prototype Chinese (Mandarin) dictation machine which takes a syllable lattice (a set of syllables, such as [guo-2] and [tieng-1], hypothesized by a speech recognition module) and produces a Chinese character sequence which is both syntactically and semantically sound.</Paragraph>
    <Paragraph position="1"> In this paper, we try to parse a Japanese utterance in the form of a sequence of phonemes.1 Our speech recognition device, which is a high-speed speaker-independent system developed by Matsushita Research Institute /Morii 1985/, /Hiraoka 1986/, takes a continuous speech utterance, for example &quot;megaitai&quot; (&quot;I have a pain in my eye.&quot;), from a microphone and produces a noisy phoneme sequence such as &quot;ebaitaai.&quot;2 The speech recognition device does not have any syntactic or semantic knowledge. More input/output examples of the speech device are presented in Figure 1-1. Footnotes: 1. Phonemes (e.g. /g/, /e/, /s/, etc.) are even lower level units than syllables. 2. We distinguish noisy from ill-formed. The former is due to recognition device errors, while the latter is due to human users.</Paragraph>
    <Paragraph position="2"> Note that the speech recognition device produces a phoneme sequence, not a phoneme lattice; there are no other phoneme candidates available as alternates. We must make the best guess based solely on the phoneme sequence generated by the speech device. Errors caused by the speech device can be classified into three groups: * Altered Phonemes -- Phonemes recognized incorrectly.</Paragraph>
    <Paragraph position="3"> The second phoneme /b/ in &quot;ebaitaai&quot; is an altered phoneme, for example.</Paragraph>
    <Paragraph position="4"> * Missing Phonemes -- Phonemes which are actually spoken but not recognized by the device. The first phoneme /m/ in &quot;megaitai&quot;, for example, is a missing phoneme. * Extra Phonemes -- Phonemes recognized by the device which are not actually spoken. The penultimate phoneme /a/ in &quot;ebaitaai&quot;, for example, is an extra phoneme. To cope with these problems, we need: * A very efficient parsing algorithm, as our task requires much more search than conventional typed sentence parsing; and * A good scoring scheme, to select the most likely sentence out of multiple candidates.</Paragraph>
    <Paragraph position="5"> In sections 2 and 3, we describe the parsing algorithm and the scoring scheme, respectively.</Paragraph>
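    <Paragraph position="6"> As an illustration of the three error classes introduced earlier (altered, missing, and extra phonemes), the following minimal sketch aligns a spoken phoneme string with a recognized one and labels each difference. It is not the parsing algorithm or scoring scheme of this paper: it uses a standard edit-distance alignment, assumes that one character stands for one phoneme, and the function name classify_errors is hypothetical.

```python
def classify_errors(spoken, recognized):
    """Align two phoneme strings and label each difference.

    Illustrative assumption: one character stands for one phoneme.
    Returns a list of (kind, spoken_phoneme, recognized_phoneme) tuples.
    """
    n, m = len(spoken), len(recognized)
    # cost[i][j] = minimum number of edits turning spoken[:i] into recognized[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            differs = int(spoken[i - 1] != recognized[j - 1])
            cost[i][j] = min(
                cost[i - 1][j - 1] + differs,  # match or altered phoneme
                cost[i - 1][j] + 1,            # missing phoneme
                cost[i][j - 1] + 1,            # extra phoneme
            )

    # Walk back from the end to recover one minimum-cost alignment.
    errors = []
    i, j = n, m
    while i != 0 or j != 0:
        diagonal = i != 0 and j != 0
        differs = diagonal and spoken[i - 1] != recognized[j - 1]
        if diagonal and cost[i][j] == cost[i - 1][j - 1] + int(differs):
            if differs:
                errors.append(("altered", spoken[i - 1], recognized[j - 1]))
            i, j = i - 1, j - 1
        elif i != 0 and cost[i][j] == cost[i - 1][j] + 1:
            errors.append(("missing", spoken[i - 1], None))
            i -= 1
        else:
            errors.append(("extra", None, recognized[j - 1]))
            j -= 1
    errors.reverse()
    return errors


print(classify_errors("megaitai", "ebaitaai"))
# [('missing', 'm', None), ('altered', 'g', 'b'), ('extra', None, 'a')]
```

In the real task the spoken string is unknown, so an alignment of this kind could only compare the noisy input against hypothesized sentences, which is why an efficient search and a scoring scheme are needed.</Paragraph>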
  </Section>
</Paper>