<?xml version="1.0" standalone="yes"?> <Paper uid="H91-1017"> <Title>Interface Bugs Ignored Test Set System % Correct % Incorrect % No Answer Total Error</Title> <Section position="3" start_page="0" end_page="106" type="metho"> <SectionTitle> OVERVIEW </SectionTitle> <Paragraph position="0"> The DARPA speech and natural language community has recently adopted the domain of air travel information for its spoken language systems. The domain has been named ATIS, for air travel information system. Training and test data are gathered for this common task by employing speakers to verbally interact with a database in order to solve one or more randomly assigned, predefined problems with predefined goals and preferences. To perform their task, speakers must verbally query an air travel database. Speakers are not required to use complete, well formed or syntactically correct utterances. They can ask for any information, regardless of whether the request is reasonable, or whether the information is contained in the database. Hence, true spontaneous speech is generated. Data is collected by recording both subject utterances and database responses as speakers perform these verbal tasks. The recorded data is then divided into training and test sets for use by the DARPA community. Thus far, the utterances selected for evaluation test sets are highly constrained, being restricted to those utterances which can be answered using the information in the database and can either be interpreted and answered in isolation, or by using only the context of a preceding utterance. All utterances that are ambiguous, refer to objects and actions not included in the database, request information that is not available, or contain spontaneous speech phenomena such as mid-utterance oral edits and corrections are removed from the official test sets.</Paragraph> <Paragraph position="1"> For these evaluations, we designed the SOUL system to enhance the performance of the CMU ATIS system when operating in isolated or limited context modes. The system operates in an opportunistic manner. It is only called upon to perform post processing when there is reasonable uncertainty in the case-frame output. This uncertainty can result from large regions of unaccounted-for speech, multiple competing interpretations, and seemingly incomplete or un-meaningful interpretations. Input to SOUL is the utterance, all the words and phrases matched by the case-frame parser, PHOENIX, and a set of hypothesized interpretation frames (or instantiated case-frames). SOUL outputs either an error message (e.g. No information in database on BOULDER) or a single interpretation composed of corrections, deletions, and all forms of modifications to the instantiated case-frame. It does this by using a large semantic and pragmatic knowledge base in conjunction with abductive reasoning and constraint satisfaction techniques. A by-product of the SOUL design is that it also provides much of the semantic and pragmatic knowledge required for our complete dialog and prediction facilities, previously called the MINDS system (1988, 1989a,b, 1990) \[1, 2, 3, 4\].</Paragraph> <Paragraph position="2"> As SOUL was designed to deal with all spontaneously generated utterances, we expect there will be some advantages to using the system while processing the required, highly restricted data, but that the system will be far more valuable when processing unrestricted input.
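To make this interface concrete, the following is a minimal sketch in Python; the paper does not specify SOUL's actual data structures, so every type and field name here is a hypothetical illustration of the input SOUL receives and the output it returns.

```python
# Hypothetical sketch of SOUL's interface as described above; the paper
# does not give the real data structures, so all names are assumptions.
from dataclasses import dataclass
from typing import Optional

@dataclass
class SoulInput:
    utterance: str                 # the transcribed user utterance
    matched_phrases: list[str]     # words/phrases matched by PHOENIX
    candidate_frames: list[dict]   # hypothesized instantiated case-frames

@dataclass
class SoulResult:
    error: Optional[str] = None            # e.g. "No information in database on BOULDER"
    interpretation: Optional[dict] = None  # single corrected case-frame, if no error
```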
To evaluate the effectiveness and relative payoff of SOUL, we investigated the following two issues.</Paragraph> <Paragraph position="3"> First, we wanted to see how much impact, if any, use of a large semantic and pragmatic knowledge base would have in reducing error relative to a semantically based (by definition) case-frame parser \[5, 6, 7\] when processing only isolated utterances or utterances that can be interpreted using only very limited context. Case-frame parsers employ both semantic and syntactic knowledge. They do not use an extensive knowledge base or inferencing procedures. However, they have proven to be very robust and effective in producing interpretations of both well formed and ill-formed user input.</Paragraph> <Paragraph position="4"> Secondly, we wanted to determine if the use of a knowledge base alone would allow us to process all types of utterances, including those which are unanswerable, request information not in the database or outside the definition of system capabilities, or are ambiguous, as well as to detect those utterances which can only be answered by using unavailable contextual information.</Paragraph> <Paragraph position="5"> To evaluate these questions, we assessed performance of our case-frame speech parser, PHOENIX, both with and without SOUL on four independent test sets. Two of the test sets contained unrestricted data -- every utterance generated was processed. The other two test sets (DARPA ATIS0 and ATIS1) contained restricted utterances, as described above.</Paragraph> <Paragraph position="6"> The remainder of this paper describes how SOUL uses semantic and pragmatic knowledge to correct, reject and/or clarify the outputs of the PHOENIX case-frame parser in the ATIS domain.</Paragraph> <Paragraph position="7"> The next section summarizes some of the linguistic phenomena which SOUL addresses. The following section briefly summarizes how it works to correct inaccurate parses. The last section presents the results of four performance evaluations which compare the performance of the PHOENIX case-frame parser with and without the SOUL postprocessor.</Paragraph> </Section> <Section position="4" start_page="106" end_page="107" type="metho"> <SectionTitle> LINGUISTIC PHENOMENA EXAMINED </SectionTitle> <Paragraph position="0"> SOUL was developed to cope with errors produced by the CMU PHOENIX speech and transcript parsing software in the ATIS domain. Specifically, SOUL augments the basic, rapid pattern matching and speech recognition functions of the PHOENIX system with knowledge intensive reasoning techniques for more fine-grained analysis of the preliminary alternative interpretations. Initially we analyzed the performance of the PHOENIX system on a set of training data. The data consisted of 472 utterances comprising dialogs b0 through bn of the ATIS0 training set. An evaluation of the performance of the original PHOENIX system on this data revealed that PHOENIX experienced difficulties with the following problematic linguistic phenomena, which composed a total of 44.3 percent of the utterances. (Note: underlined information is not in the database.) Unanswerable queries (no information in database, or illegal action requested) (found in 19.3% of sentences in the training corpus): What ground transportation is available from the airport in Denver to Boulder at three pm on the twenty second? How do I make reservations?
Show all the flights from Dallas to Fort Worth. Interpreting these utterances requires knowledge of the limitations of the database, detection of user mis-conceptions and constraint violations, as well as the ability to recognize and understand information not contained in the database.</Paragraph> <Paragraph position="1"> Context dependent utterances (9.2%): Show me all returning flights. To process isolated utterances or utterances that are only allowed to be interpreted using limited contextual information, it is helpful to be able to recognize those utterances where critical information cannot be reasonably inferred.</Paragraph> <Paragraph position="2"> Ungrammatical and ill-formed utterances (3.0%): What date does flight eight seventy seven from San Francisco to Dallas leave from? Ungrammaticality is a part of spontaneous speech. However, one can also obtain ill-formed or ungrammatical input from mis-recognition of an input string. These phenomena preclude using strict syntactic constraints and clues, such as definite reference or any type of case marker, of the kind typically used in textual case-frame parsers.</Paragraph> <Paragraph position="3"> Ambiguous queries (6.4%): What's the distance from San Francisco to Oakland? The example query can be interpreted as meaning the city San Francisco or San Francisco International airport. In the case of the former, no information is contained in the database. In the absence of disambiguating context, it is important to be able to recognize all interpretations.</Paragraph> <Section position="1" start_page="106" end_page="107" type="sub_section"> <SectionTitle> Yes/No and Quantified Yes/No's (3.2%) </SectionTitle> <Paragraph position="0"> Do all of the flights from Pittsburgh to Boston serve meals? These, as well as the next category of utterances, require that the critical information be detected from the input.</Paragraph> <Paragraph position="1"> However, they are not problematic when accurately recognized from the speech input.</Paragraph> <Paragraph position="2"> Superlatives and Comparatives (3.2%): What's the cheapest flight from Atlanta to DFW? SOUL was designed to provide a fine-grained analysis of input in an opportunistic manner by relying upon a large knowledge base. It was also tuned to &quot;pay attention&quot; to the above listed linguistic phenomena that posed problems for the original version of the PHOENIX speech processing case-frame parser. Furthermore, it was also designed to address some of the problems inherent in spontaneously uttered input.</Paragraph> <Paragraph position="3"> THE CMU SYSTEM: OVERVIEW The CMU ATIS System is composed of three interacting modules: a speech recognizer (SPHINX), a robust case-frame parser adapted for spontaneous speech (PHOENIX), and a semantic and pragmatic processor (SOUL) which can work either on isolated utterances or can incorporate the dialog and prediction functionality of the MINDS system. For results reported in this paper, SPHINX produces a single output string which is then processed by the speech case-frame parser, PHOENIX (Ward, 1990). The PHOENIX case-frame parser builds plausible parses of the input string by using robust slot filling heuristics. All possible case-frame slots associated with any portion of an input utterance with a reasonable recognition probability are filled. Then candidate case-frames are built, bottom up, which try to account for as much of the matched input as possible.
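As an illustration of this bottom-up strategy, consider the following minimal sketch in Python. The slot names, frame definitions, and coverage scoring are assumptions made for illustration, not the actual PHOENIX implementation.

```python
# Hypothetical sketch of PHOENIX-style robust parsing: every slot any word
# span could plausibly fill is filled, then candidate frames are assembled
# bottom-up and ranked by how much of the input they account for.
from dataclasses import dataclass

@dataclass
class SlotMatch:
    slot: str              # e.g. "origin", "destination", "depart_time"
    span: tuple[int, int]  # input word indices covered by this match

def build_candidate_frames(matches: list[SlotMatch],
                           frame_defs: dict[str, set[str]]):
    """frame_defs maps a frame name (e.g. "air-travel") to its legal slots."""
    candidates = []
    for frame, legal_slots in frame_defs.items():
        fills = [m for m in matches if m.slot in legal_slots]
        if not fills:
            continue
        covered = set()
        for m in fills:
            covered.update(range(m.span[0], m.span[1]))
        candidates.append((frame, fills, len(covered)))
    # prefer the interpretation accounting for the most input words
    return sorted(candidates, key=lambda c: -c[2])
```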
Once candidate interpretations are generated, PHOENIX either sends all interpretations to SOUL or else, when operating in the absence of SOUL (or when an unambiguous interpretation exists), selects the interpretation which accounts for the most input. In either case, one interpretation is selected. This interpretation is then put into a canonical form and mapped into an SQL database query. This query is then passed to the database and output is presented on a computer screen to the user.</Paragraph> </Section> </Section> <Section position="5" start_page="107" end_page="107" type="metho"> <SectionTitle> THE SOUL SYSTEM </SectionTitle> <Paragraph position="0"> SOUL relies on a semantic and pragmatic knowledge base to check for consistency in the output interpretations produced by the parser. There are three special features about this frame-based system. First, SOUL not only defines legal values, attributes, and concepts within the ATIS domain, but it also accounts for much extra-domain information as well. Second, it uses inheritance and reasoning to determine contextual constraints, so consistency and constraints can be maintained for combinations of information never before seen. Third, the system uses a single reference data structure for determining illegal input (for which action is prohibited) and unanswerable input (for which no information is in the database).</Paragraph> </Section> <Section position="6" start_page="107" end_page="107" type="metho"> <SectionTitle> REASONING </SectionTitle> <Paragraph position="0"> The mechanisms underlying SOUL's abilities are the use of constraint satisfaction techniques in conjunction with what has been called abductive reasoning \[8, 9\] or concretion \[10\]. These are general, domain independent techniques which rely upon a domain specific knowledge base. The abductive reasoning component is used to evaluate alternative or candidate phrase matches, to decide which phrases modify one another, and to determine which can be put together to form one or more meaningful utterances.</Paragraph> <Paragraph position="1"> To illustrate, consider the following utterance: &quot;Does B stand for business class&quot;. The case-frame parser instantiates the following sequence of concepts or cases: B = abbreviation; B = code; B = letter; Does = list; Does = explain; business class = class-of-service; class = class-of-service; stand-for = mean.</Paragraph> <Paragraph position="2"> The knowledge base is faced with three basic concepts: B, which is an instance of some abbreviation. Specifically, B is an abbreviation for Breakfast and for Business-Class. B can also be a letter, which can be part of a flight number identifying the airline carrier, or part of the call letters of an aircraft, or one of the letters composing the name of a person reserving a ticket.</Paragraph> <Paragraph position="3"> Stand-for indicates equivalence, a specific predicate that is either true or false. Business-class is an interpretation preferable to class alone, as it is more specific. Given an equivalence predicate and a specific concept business-class, the only allowable interpretation of &quot;B&quot; is that it is an abbreviation. Even in the absence of an equivalence predicate, there is no additional information which would support the interpretation of &quot;B&quot; as being part of a flight number, carrier identification, aircraft call number or a person's name. Now, given &quot;Business-class&quot;, an instance of a fare class, an equivalence relationship, and a choice between alternative abbreviation expansions for B, the only expansion which would make the predicate true is the instance of B abbreviating &quot;Business-class&quot;.
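This abductive filtering step can be sketched as follows. The sense inventory, predicate encoding, and function names below are hypothetical simplifications, not SOUL's actual knowledge representation.

```python
# Hypothetical sketch of the abductive step described above: among the
# candidate senses for "B", keep the one that makes the equivalence
# predicate ("stand for") true given the other instantiated concept.
CANDIDATE_SENSES = {
    "B": ["abbrev:Breakfast", "abbrev:Business-Class",
          "letter:flight-number", "letter:call-letters", "letter:name"],
}

def satisfies_equivalence(sense: str, other_concept: str) -> bool:
    # "X stands for Y" can hold only if X is an abbreviation of Y
    return sense == f"abbrev:{other_concept}"

def interpret(token: str, predicate: str, other_concept: str):
    if predicate != "equivalence":
        return None
    survivors = [s for s in CANDIDATE_SENSES[token]
                 if satisfies_equivalence(s, other_concept)]
    # a unique surviving sense is the abductively preferred interpretation
    return survivors[0] if len(survivors) == 1 else None

# interpret("B", "equivalence", "Business-Class") -> "abbrev:Business-Class"
```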
</Paragraph> </Section> <Section position="7" start_page="107" end_page="107" type="metho"> <SectionTitle> CONSTRAINT REPRESENTATION AND USE </SectionTitle> <Paragraph position="0"> The abductive component not only determines what possible phrases compose a meaningful request or statement, it also spots combinations which violate domain constraints. Examples of the types of constraint violations which are recognized include violations of both type constraints on objects and attributes as well as n-tuple constraint violations.</Paragraph> <Paragraph position="1"> To illustrate, consider the following rule for long range transportation taken from the current knowledge base: Objects: long-range vehicle, origin-location, destination-location, inanimate/animate objects to be transported. Here we have constraints not only on the type of objects that may fill these roles (and of course information about these objects is contained in other portions of the knowledge base) but we have relational constraints as well. The following are the single constraints on the objects involved in long-range transportation. Vehicles are constrained to be included in the instances of a long range vehicle. These include airplanes, trains and cars that are not taxis or limousines. The origin and destination are constrained to be either airports (or their abbreviations) or locations that must include a city (and may include additional information such as state and/or location within or relative to the city). In this example, there is a single relational, or tuple, constraint. It poses restrictions on the relationship between the origin and destination slot fillers. These include: If two cities are involved, they cannot be the same, and there must be a set difference between the listings of the airports that service the two cities. If two airports are involved, they must not be the same airport. If a city and an airport are involved, the city must not be served solely by the airport listed. Under these rules, you cannot fly from Dallas to Fort Worth in this database. However, you can fly from San Francisco to San Jose. Similar rules for short-range transportation would rule out taking a taxi from Pittsburgh to Boston.</Paragraph> <Paragraph position="2"> These types of definitions for events, actions, objects, etc. and the constraints placed upon them also allow one to determine whether or not there is sufficient information upon which to take an action. Hence, if a flight is not clearly delineated given whatever context is allowable under the test set rules, these rules can determine whether a query is context dependent or insufficiently specified.</Paragraph> </Section>
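The city-to-city tuple constraint just described can be sketched as a short check. AIRPORTS_SERVING below is an assumed toy lookup table, not the actual ATIS relation.

```python
# Hypothetical sketch of the tuple constraint on long-range air travel:
# the two cities must differ, and the sets of airports serving them must
# not be identical. AIRPORTS_SERVING is an assumed toy table.
AIRPORTS_SERVING = {
    "Dallas": {"DFW"}, "Fort Worth": {"DFW"},
    "San Francisco": {"SFO"}, "San Jose": {"SJC"},
}

def flyable(origin_city: str, destination_city: str) -> bool:
    if origin_city == destination_city:
        return False
    a = AIRPORTS_SERVING[origin_city]
    b = AIRPORTS_SERVING[destination_city]
    # there must be a set difference between the airport listings
    return bool(a - b or b - a)

# flyable("Dallas", "Fort Worth") -> False (both served only by DFW)
# flyable("San Francisco", "San Jose") -> True
```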
<Section position="8" start_page="107" end_page="108" type="metho"> <SectionTitle> EXAMPLES </SectionTitle> <Paragraph position="0"> The following examples from the ATIS corpus further illustrate how the reasoning component operates.</Paragraph> <Paragraph position="1"> What is the shortest flight from Dallas to Fort Worth? PHOENIX would look for flights from Dallas to Fort Worth, assuming a correct interpretation. However, SOUL knows that Dallas and Fort Worth are both served only by DFW airport. Since you cannot take off and land in the same place, this is an illegal and unanswerable request.</Paragraph> <Paragraph position="2"> How much would it cost to take a taxi from Pittsburgh to Boston? PHOENIX recognizes &quot;How much would it cost from Pittsburgh to Boston&quot;, and would output the corresponding list of flights and fares. SOUL recognizes that &quot;to take a taxi&quot; is important information that has not been included in the interpretation. It also knows that taxis are short-range transportation vehicles. If the request were legal, SOUL would tell PHOENIX to add taxi as the method of transportation and delete airplanes as the transportation vehicle. However, this request violates the constraint on what constitutes a short-range trip, so SOUL outputs the violation type to the speaker (or generates an error message as the CAS).</Paragraph> <Paragraph position="3"> Are there any Concord flights from Dallas to Boston? Here PHOENIX finds a request for flights between Dallas and Boston.</Paragraph> <Paragraph position="4"> SOUL tells the parser to &quot;add Aircraft-Class aircraft_code SSC&quot;. Show all the flights to San Francisco on August 18. Here SOUL recognizes that a departure location has been omitted and cannot be found in any unaccounted-for input. Hence, this is a context-dependent sentence.</Paragraph> </Section> <Section position="9" start_page="108" end_page="108" type="metho"> <SectionTitle> SOUL OUTPUT </SectionTitle> <Paragraph position="0"> SOUL takes as input the candidate parses as well as the input string. The output is a list of instructions and codes which are sent back to PHOENIX so they can be incorporated into the database query. Specifically, SOUL outputs an existing interpretation augmented by the following information (a sketch of this instruction stream appears after the list): * When there is missing, critical information, database tables and bound variables are output.</Paragraph> <Paragraph position="1"> * When there is a reasonable selection of information interpreted under an incorrect top level frame (e.g. air travel vs ground travel), it provides a correct top level interpretation or frame.</Paragraph> <Paragraph position="3"> * When a specific word string is mis-interpreted, it provides corrections in the form of additional variables and their bindings to add as well as variables and bindings to delete.</Paragraph> <Paragraph position="4"> * When a query involves information not included in the current, restricted database, when the query requires information that is out of the chosen domain, or when the user is asking the system to perform a function it is not designed to do, it outputs specific error codes to PHOENIX indicating what the problem is and outputs specific corrective information to the user screen. This is designed to correct user mis-conceptions (e.g. Show all flights from Dallas to Fort Worth).</Paragraph> <Paragraph position="5"> * Finally, when two un-related queries are merged into a single utterance, the system outputs a &quot;break point&quot; so PHOENIX can re-parse the input into two separate requests. (For example, the utterance Show all flights from Philadelphia to Boston and how far Boston is from Cape Cod would be divided up, whereas Show all flights from Philadelphia to Boston as well as their minimum fares would not.)</Paragraph>
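As promised above, here is a minimal sketch of that instruction stream. All field names are assumptions, since the paper does not give the actual message format SOUL sends to PHOENIX.

```python
# Hypothetical sketch of the instructions SOUL returns to PHOENIX,
# covering the output types listed above. Field names are assumptions.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SoulInstructions:
    top_level_frame: Optional[str] = None      # corrected frame, e.g. "ground-travel"
    add_bindings: dict = field(default_factory=dict)     # variables/bindings to add
    delete_bindings: dict = field(default_factory=dict)  # variables/bindings to delete
    error_code: Optional[str] = None     # e.g. an assumed "NO_INFO_IN_DB" code
    user_message: Optional[str] = None   # corrective text shown on the user screen
    break_point: Optional[int] = None    # word index splitting two merged queries

# e.g. for "Are there any Concord flights from Dallas to Boston?":
# SoulInstructions(add_bindings={"Aircraft-Class": {"aircraft_code": "SSC"}})
```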
<Paragraph position="6"> To summarize, SOUL looks for and corrects the following types of problems: (1) Information that is missing from the output of PHOENIX, which is added by SOUL. When too much information is missing, the system produces the &quot;do-not-understand&quot; response. (2) Constraint violations. The speaker is informed what is unanswerable, or the parser is given instructions on how to reinterpret the input. (3) Inaccurate parses, where SOUL tells the parser any combination of a basic interpretation frame, information to add, information to delete, or regions to reparse for a specific meaning or variable. (4) Unanswerable queries and commands, which produce a message to the speaker describing what cannot be done.</Paragraph> </Section> </Paper>