File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/93/h93-1023_concl.xml
Size: 1,749 bytes
Last Modified: 2025-10-06 13:57:03
<?xml version="1.0" standalone="yes"?> <Paper uid="H93-1023"> <Title>Topic and Speaker Identification via Large Vocabulary Continuous Speech Recognition</Title> <Section position="6" start_page="123" end_page="123" type="concl"> <SectionTitle> 5. CONCLUSIONS </SectionTitle> <Paragraph position="0"> As the Switchboard testing demonstrates, message identification via large vocabulary continuous speech recognition is a successful strategy even in challenging speech environments. Although the quality of the recognition as measured by word accuracy rates was very low for this task - only 22% of the words were correctly transcribed - the recognizer was still able to extract sufficient information to reliably identify speech messages. This supports our belief in the advantages of using articulatory and language model context.</Paragraph> <Paragraph position="1"> We were surprised not to find a more pronounced benefit from using large numbers of keywords for the topic identification task. Our prior experience had indicated that there were small but significant gains as the number of keywords grew and, although such a pattern is perhaps suggested by the results in Table 2, the gains (beyond those in the recalibration estimates) are too small to be considered significant. It is possible that with better modelling of keyword frequencies or by introducing acoustic distinctiveness as a keyword selection criterion, such improvements might be realized.</Paragraph> <Paragraph position="2"> Given the strong performance of both of our identification systems, we also look forward to exploring how much we can restrict the amount of training and testing material and still maintain the quality of our results.</Paragraph> </Section> </Paper>