<?xml version="1.0" standalone="yes"?>
<Paper uid="H91-1019">
<Title>A Textual processor to handle ATIS queries</Title>
<Section position="2" start_page="0" end_page="0" type="intro">
<SectionTitle>INTRODUCTION</SectionTitle>
<Paragraph position="0">Speech recognition systems have made significant progress in recent years toward the goal of correctly interpreting continuously spoken utterances. However, substantial restrictions are usually imposed upon the speaker to guarantee success. Typically, one must pause after each word, restrict one's choice of words to a small vocabulary, and/or train the system to adapt to one's voice. In many systems it is not feasible to insist on vocabulary restrictions, nor can the system always be trained ahead of time to a user's voice. Many telephone applications that serve the general public will be of this latter type. Furthermore, most users do not like altering their speaking style, especially speaking in an isolated-word format.</Paragraph>
<Paragraph position="1">Thus most practical applications of the future will have to be speaker-independent (i.e., trained ahead of time by other speakers), operate without major restrictions in vocabulary, and accept normal, spontaneous speech.</Paragraph>
<Paragraph position="2">In particular, one major application is allowing the general public to carry out transactions directly with computer databases (including over the telephone). As an example of this type of interaction, we are currently examining a system that permits a user direct access to air travel information. A user can pose natural questions to the database and receive answers, just as a travel agent does. The database is that of the Official Airline Guide (OAG).</Paragraph>
<Paragraph position="3">To simplify the task slightly, we use a subset of the flights in the OAG: only those for airports at nine major US cities (Atlanta, Boston, Baltimore, Denver, Dallas, Oakland, Philadelphia, Pittsburgh, and San Francisco). Otherwise, the entire OAG database is used.</Paragraph>
<Paragraph position="4">In the future, we will investigate actual voice dialogues between a user and the database, but for now the subject of this study is limited to the analysis of individual user queries. We wish to design an automatic system that responds correctly to the user with the desired OAG information. As a first step toward this goal, the current study is further limited to the analysis of textual versions of the user's utterances, rather than the speech itself. Thus we assume perfect operation of an initial speech recognizer, which would accept the spontaneous queries of a user and output the word sequence corresponding to the speech. Such word sequences can contain grammatical mistakes and repeated words, as often occur in natural speech. Our textual processor must handle the deviations from normal written text that occur in spontaneous speech. In particular, this means that one cannot rely directly on standard English text processors, which presume grammatical input text.</Paragraph>
</Section>
</Paper>
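As a rough illustration of the kind of cleanup such a textual processor would need before query interpretation (this sketch is not from the paper; the function name, filler-word list, and repetition rule are assumptions made for the example), a minimal pass over the recognizer's word sequence might drop filled pauses and collapse immediately repeated words:

```python
# Illustrative sketch only: clean up a recognizer's word sequence
# before further processing. The filler list and repetition rule are
# assumptions for this example, not the processor described above.

FILLERS = {"uh", "um", "er"}  # assumed filled-pause tokens


def clean_word_sequence(words):
    """Drop filled pauses and collapse immediate word repetitions
    (e.g., 'show me me flights' -> 'show me flights')."""
    cleaned = []
    for word in words:
        token = word.lower()
        if token in FILLERS:
            continue
        if cleaned and cleaned[-1] == token:
            continue  # skip an immediately repeated word
        cleaned.append(token)
    return cleaned


if __name__ == "__main__":
    query = "show me me the uh flights from Boston to to Dallas".split()
    print(" ".join(clean_word_sequence(query)))
    # -> show me the flights from boston to dallas
```

A real processor would of course go further (restarts, corrections, ungrammatical constructions), but even this simple filter shows why standard text processors that presume grammatical input are not directly usable on spontaneous speech transcripts.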