File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/88/c88-1046_concl.xml
Size: 2,617 bytes
Last Modified: 2025-10-06 13:56:15
<?xml version="1.0" standalone="yes"?> <Paper uid="C88-1046"> <Title>Word Boundary Identification from Phoneme Sequence Constraints in Automatic Continuous Speech Recognition</Title> <Section position="9" start_page="229" end_page="230" type="concl"> <SectionTitle> 6. Discussion </SectionTitle> <Paragraph position="0"> --- 3.5%.</Paragraph> <Paragraph position="1"> This study has shown that around 45% of all word boundaries can be correctly identified from a knowledge of three-phoneme sequences that occur across word boundaries but not word-internally, together with a knowledge of one- and two-phoneme words and of all two-phoneme sequences that can begin and end words. The result is based on hand-transcriptions, which can be considered analogous to the phonemic strings that would be extracted automatically from the acoustic speech signal if the recogniser made no errors in this derivation.</Paragraph> <Paragraph position="2"> A current area of investigation is to identify the set of phoneme sequences which occur neither across a word boundary nor word-internally. Such phoneme sequences can be easily obtained from the data sets discussed in this paper, and they would enable errors to be detected in the acoustic-phonetic stage of processing in a continuous speech recogniser. Some examples of these sequences are given in (37): (37) /1 z ng/, /aa dh l/, /e w n/. For example, /e w n/ must be illegal since it occurs neither word-internally nor across word boundaries (both /e # w n/ and /e w # n/ must be ruled out on the grounds that there are no words which end in /e/ or /e w/). The incorporation of this kind of knowledge would enable an error to be detected if such a sequence were derived automatically after the acoustic-phonetic stage of processing.</Paragraph> <Paragraph position="3"> 8.
No~s used in this paper is shown below: The CSTR Machine Readable Phonemic Alphabet for RP
/p/ pea      /f/ fan       /l/ lee
/b/ bead     /v/ van       /r/ road
/t/ tea      /th/ think    /w/ win
/d/ ~uy      /dh/ then     /y/ you
/k/ key      /s/ sing      /m/ man
/g/ gay      /z/ zoo       /n/ name
/ch/ chew    /sh/ shoe     /ng/ sing
/jh/ judge   /zh/ measure  /h/ hat
/ii/ we      /o/ hot       /ei/ stay
/i/ hit      /oo/ saw      /ai/ sigh
/e/ head     /u/ could     /oi/ toy
/a/ had      /uu/ who      /au/ now
/aa/ hard    /@/ the       /ou/ go
/i@/ here    /u@/ sure     /e@/ there
/@@/ first
This research was supported by SERC grant number GR/D29628 and is part of an Alvey funded project in continuous speech recognition. Our thanks to John Laver and Briony Williams for many helpful comments in the preparation of this manuscript.</Paragraph> </Section></Paper>
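The error-detection idea in the Discussion (flagging a three-phoneme sequence that occurs neither word-internally nor across a word boundary) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the five-word toy lexicon and its transcriptions are assumptions made for the example, and the extra case in which the middle phoneme is itself a one-phoneme word is an assumption added for completeness.

```python
from typing import Iterable, Sequence, Tuple

Phonemes = Tuple[str, ...]


def build_tables(lexicon: Iterable[Sequence[str]]):
    """Collect the lookup sets needed to judge a three-phoneme sequence."""
    words = {tuple(w) for w in lexicon}
    # Every three-phoneme sequence that occurs inside some word.
    internal = {w[i:i + 3] for w in words for i in range(len(w) - 2)}
    # One- and two-phoneme sequences that can begin or end a word.
    starts = {w[:n] for w in words for n in (1, 2) if len(w) >= n}
    ends = {w[-n:] for w in words for n in (1, 2) if len(w) >= n}
    return words, internal, starts, ends


def is_possible(tri: Phonemes, tables) -> bool:
    """True if `tri` can occur word-internally or across a word boundary."""
    words, internal, starts, ends = tables
    a, b, c = tri
    return (
        tuple(tri) in internal
        or ((a,) in ends and (b, c) in starts)   # a # b c
        or ((a, b) in ends and (c,) in starts)   # a b # c
        # Assumption: b may also be a one-phoneme word (a # b # c).
        or ((a,) in ends and (b,) in words and (c,) in starts)
    )


# Hypothetical toy lexicon (the, man, win, head, go), transcribed by hand
# in the alphabet listed above purely for illustration.
TOY_LEXICON = [
    ("dh", "@"),
    ("m", "a", "n"),
    ("w", "i", "n"),
    ("h", "e", "d"),
    ("g", "ou"),
]

tables = build_tables(TOY_LEXICON)
print(is_possible(("e", "w", "n"), tables))   # no word ends in /e/ or /e w/
print(is_possible(("@", "m", "a"), tables))   # "the # man"
```

With a full lexicon in place of the toy one, any sequence for which `is_possible` returns false would be a candidate for signalling an error in the acoustic-phonetic stage.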