<?xml version="1.0" standalone="yes"?> <Paper uid="H89-1011"> <Title>Speaker Adaptation from Limited Training in the BBN BYBLOS Speech Recognition System</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 1. Introduction </SectionTitle> <Paragraph position="0"> Soon after a speech recognition system begins operation, small amounts of new speech data become available to the system as spoken utterances are successfully transcribed to text. This data is of potentially great value to the system because it contains detailed information on the current state of the speaker and the environment. The purpose of rapid speaker adaptation is to utilize such small samples of speech to improve the recognition performance of the system.</Paragraph> <Paragraph position="1"> Speaker adaptation offers other benefits as well. For applications that cannot tolerate the initial training expense of high-performance speaker-dependent models, adaptation can trade off peak performance for rapid training of the system. For typical experimental systems being investigated today on a 1000-word continuous speech task domain, speaker-dependent training uses 30 minutes of speech (600 sentences), while the adaptation methods described here use only 2 minutes (40 sentences).</Paragraph> <Paragraph position="2"> For applications in which an initial speaker-independent model fails to perform adequately due to a change in the environment or the task domain not represented in the training data, adaptation can utilize an economical initial model generated from the speaker-dependent training of a single prototype speaker.
Again, looking at typical systems today, speaker-independent models train on 3 1/2 hours of speech (4200 sentences), while adaptation can use a speaker-dependent model trained from 30 minutes (600 sentences).</Paragraph> <Paragraph position="3"> In this paper, we describe the speaker-adaptive capabilities of the BBN BYBLOS continuous speech recognition system. Our basic approach to the problem is described first in section 2. Two methods for estimating the speaker transformation are described in section 3. In section 4 we present our latest results on a standard testbed database.</Paragraph> </Section> </Paper>