<?xml version="1.0" standalone="yes"?> <Paper uid="H90-1049"> <Title>Word Recognition Using Dynamic Programming Neural Networks&quot;, by Sakoe, Isotani, and Yoshida (Readings in Speech Recognition, edited by Alex Waibel &amp; Kai-Fu Lee). Other work includes &quot;Merging Multilayer Perceptrons and Hidden Markov Models: Some Experiments in Continuous Speech</Title> <Section position="9" start_page="246" end_page="248" type="concl"> <SectionTitle> ANN RECOGNITION </SectionTitle> <Paragraph position="0"> The time-normalized data for each word from the utterance is fed as input into each of the neural nets. If the best-path word is shorter than a given neural net's input, additional data is taken from the rest of the best path. Silence is always skipped. If the end of the utterance is reached before enough data is collected, nulls are input to the neural net. For recognition, the individual neural nets are connected together, and the output that is most &quot;on&quot; indicates which word was recognized.</Paragraph> <Paragraph position="1"> The system was tested on the TI Connected Digits Database. Six male speakers from two different dialects were used for training. Three male speakers, MKR, MRD, and MIN, were taken from the Little Rock, AR, dialect. The other three male speakers, MBN, MBH, and MIB, were taken from the Rochester, NY, dialect.</Paragraph> <Paragraph position="2"> The current modifications to Sphinx only produce pointers to the best candidate words during recognition. There are three classes of errors: insertions, deletions, and substitutions. When the HMM scored correctly, the ANN was tested and agreed with it 100% of the time. Of the three classes of errors, only substitution errors have been tested with the ANN. On a set of 385 utterances from the Rochester male speakers, Sphinx made nine substitution errors. 
The ANNs corrected four of the nine errors.</Paragraph> <Paragraph position="3"> CONCLUSIONS: A larger set of data needs to be tested before any strong conclusions can be drawn.</Paragraph> <Paragraph position="4"> The initial 44% reduction in the substitution error rate (four of nine errors corrected) of an already highly tuned system is promising.</Paragraph> </Section></Paper>
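<!--
Editorial sketch, not part of the original paper. The recognition step and the
reported error reduction can be illustrated roughly as follows. Everything here
is an assumption made for illustration: the paper does not say how a null input
is encoded (0.0 is used below), each frame is simplified to a single float, and
the function names pad_with_nulls, pick_word, and relative_error_reduction are
hypothetical.

```python
from typing import Dict, List, Sequence

NULL_FRAME = 0.0  # assumption: a "null" input is encoded as 0.0


def pad_with_nulls(frames: Sequence[float], input_size: int) -> List[float]:
    """Truncate or null-pad a frame sequence to a net's fixed input size."""
    padded = list(frames[:input_size])
    # Fill the remaining input slots with nulls when the data runs out.
    padded.extend([NULL_FRAME] * (input_size - len(padded)))
    return padded


def pick_word(outputs: Dict[str, float]) -> str:
    """Return the word whose net output is highest (the most "on" output)."""
    return max(outputs, key=outputs.get)


def relative_error_reduction(errors_before: int, errors_corrected: int) -> float:
    """Fraction of the original errors that were corrected."""
    return errors_corrected / errors_before
```

With the figures reported in the text, relative_error_reduction(9, 4) is 4/9,
or roughly 44%, which matches the reduction quoted in the conclusions.
-->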