<?xml version="1.0" standalone="yes"?>
<Paper uid="A94-1022">
<Title>The Delphi Natural Language Understanding System</Title>
<Section position="11" start_page="136" end_page="136" type="evalu">
<SectionTitle> 8 Results Of Formal Evaluation On ATIS </SectionTitle>
<Paragraph position="0"> Our complete system, including the Semantic Linker, was evaluated in the December 1993 ARPA ATIS evaluation. Prior to the evaluation, ATIS versions of the system's domain model, lexicon and realization rules had been developed using several thousand utterances of training data collected from users of ATIS. A set of approximately 1000 utterances was held aside as a blind test set on which all participating sites were evaluated.</Paragraph>
<Paragraph position="1"> Error rate in this evaluation was defined as F + NA, where F was the percentage of queries answered incorrectly and NA the percentage of queries not answered at all. There were two evaluations on the same corpus using this metric: one of NL text understanding alone, and one of a complete spoken language system (SLS) comprising Delphi and the Byblos recognizer.</Paragraph>
<Paragraph position="2"> Our system achieved an official result of 14.7% on the NL test, the third-lowest error rate achieved.</Paragraph>
<Paragraph position="3"> The SLS error rate was 17.5%.</Paragraph>
<Paragraph position="4"> Our own experiments show that using the Semantic Linker reduced our system's error rate on the NL test by 43%. This was achieved largely by dramatically lowering the no-answer rate NA, from 18.7% to 2.3%. Just over 80% of the newly answered queries were answered correctly, so the Linker showed considerable accuracy.</Paragraph>
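To make the metric concrete, here is a minimal illustrative calculation, not part of the original paper. The no-answer rates (18.7% and 2.3%) and the figure of just over 80% of newly answered queries being correct are taken from the text above; the incorrect-answer rate F is not reported for this experiment, so the value below is a hypothetical placeholder chosen only so that the sketch reproduces roughly the reported 43% relative reduction.

# Illustrative sketch of the ATIS error metric (error = F + NA) described above.
# The NA figures and the "just over 80% correct" figure come from the paper;
# F_BASELINE is a hypothetical placeholder, since F itself is not reported.

def error_rate(f_incorrect, na_unanswered):
    """ATIS error metric: percentage answered incorrectly (F) plus percentage unanswered (NA)."""
    return f_incorrect + na_unanswered

F_BASELINE = 11.8                       # placeholder, in percent (assumption)
before = error_rate(F_BASELINE, 18.7)   # without the Semantic Linker

# With the Linker, NA falls to 2.3%; roughly 20% of the newly answered
# queries are wrong, so they move from NA into F.
newly_answered = 18.7 - 2.3
after = error_rate(F_BASELINE + 0.2 * newly_answered, 2.3)

reduction = (before - after) / before
print(f"before={before:.1f}%  after={after:.1f}%  relative reduction={reduction:.0%}")
# -> before=30.5%  after=17.4%  relative reduction=43%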
<Paragraph position="5"> 9 Porting Delphi to the SPLINT Domain
The SPLINT (Speech and Language Integration) domain is concerned with Air Force units and their component aircraft, weaponry and other physical attributes of aircraft, ordnance, and facilities (such as air bases, runways, bunkers, etc.). The SPLINT database has 106 fields in 23 tables.</Paragraph>
<Paragraph position="6"> Some example utterances in the SPLINT domain are: What aircraft types are assigned to the 32nd? Which base has a unit carrying mavericks? Can a Stealth use Langley's runway 1? In order to port Delphi to the SPLINT domain, SPLINT-specific versions of the domain model, lexicon, realization rules and db-mapping rules were needed. For the speech-understanding part of the application, word pronunciations were also necessary, as well as word-class membership for a statistical n-gram class grammar. Delphi includes &quot;core&quot; versions of some of these knowledge bases: a core domain model with common classes like NUMBER and TIME-OF-DAY and relations like GREATER, and a core lexicon with closed-class items such as prepositions, as well as words appropriate to question answering in general, such as &quot;show&quot;; domain-specific items have to be added to these core knowledge bases.</Paragraph>
<Paragraph position="7"> In porting to SPLINT, 60 classes and 65 relations were added to the domain model, and 400 words were added to the lexicon, of which approximately half were derived from database field values. 118 realization rules were added.</Paragraph>
<Paragraph position="8"> The grammar needed only one new rule (for constructions such as &quot;Mach 1&quot;); no other modifications were required.</Paragraph>
<Paragraph position="9"> The entire process took about one person-month to reach 90% coverage on a 1400-sentence corpus developed independently by a non-NL person. An additional person-week was required to develop the speech-related knowledge bases. A complete spoken language system, with Delphi as the understanding component plus a Motif-based user interface, was successfully demonstrated at the 1994 ARPA Human Language Technology meeting and at Rome Labs in New York. The porting process is described in more detail in (Bates, 1994).</Paragraph>
<Paragraph position="10"> This effort demonstrates that, given an appropriate system design, it is possible to build a complete spoken language system that is robust to speech and production errors, and to do so rapidly and straightforwardly.</Paragraph>
</Section>
</Paper>