<?xml version="1.0" standalone="yes"?>
<Paper uid="H91-1034">
  <Title>A Template Matcher for Robust NL Interpretation</Title>
  <Section position="1" start_page="0" end_page="192" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> In this paper, we describe the Template Matcher, a system built at SRI to provide robust natural-language interpretation in the Air Travel Information System (ATIS) domain. The system appears to be robust to both speech recognition errors and unanticipated or difficult locutions used by speakers. We explain the motivation for the Template Matcher, describe in general terms how it works in comparison with similar systems, and examine its performance. We discuss some limitations of this approach, and sketch a plan for integrating the Template Matcher with an analytic parser, which we believe will combine the advantages of both.</Paragraph>
    <Paragraph position="1"> Introduction One of the conclusions SRI has drawn from working with the ATIS common task data is that, even with a very constrained user task, there will always be unanticipated expressions and difficult constructions in the spoken language elicited by the task that will cause problems for a conventional, analytical approach to natural-language processing. However, it also seems that requests for only a few types of information account for a very large proportion of the utterances produced by users performing a task like air travel planning. This point is illustrated by some of the more difficult queries in the June 1990 test set:</Paragraph>
    <Paragraph position="2"> Give me a list of all airfares for round-trip tickets from Dallas to Boston flying on American Airlines.</Paragraph>
    <Paragraph position="3"> Show me all the flights and their fares from San Francisco to Boston on June second.</Paragraph>
    <Paragraph position="4"> I need information on airlines servicing Boston flying from Dallas.</Paragraph>
    <Paragraph position="5"> In the first example the phrase &quot;flying on American Airlines&quot; apparently modifies &quot;tickets,&quot; with the flights that the tickets are for apparently being the implied subject of &quot;flying.&quot; The second example seems to contain a discontinuous constituent, &quot;flights ... from San Francisco to Boston on June second,&quot; which is the antecedent of the pronoun &quot;their&quot; that occurs in the middle of the discontinuity. The third example would be straightforward, except for the fact that the verb &quot;servicing&quot; has been substituted for the more conventional &quot;serving.&quot; Despite the difficult linguistic problems posed by these queries, the information they request is very simple: just fares, flights, and airlines for travel between a pair of specified cities. Consideration of examples such as these has led us to modify our approach to natural-language processing in spoken language systems. The key modification to our system is the addition of a Template Matcher to provide robust interpretation for the most common types of requests in the task domain. The Template Matcher achieves robustness in two ways: (1) it provides an interpretation when not all the words or constructions in an utterance have been accounted for, and (2) it provides a mechanism for trading off the risk of wrong answers against the degree of coverage. These properties arise from a mechanism that assigns scores to interpretations, penalizing interpretations that do not account for words in the utterance. The bulk of this paper is devoted to describing the Template Matcher and discussing its performance as a stand-alone system for interpretation of natural-language queries for the ATIS task. Later in the paper we consider how such a module might best fit into a complete system for spoken-language understanding.</Paragraph>
    <Paragraph position="6"> Description of the System The Template Matcher operates by trying to build &quot;templates&quot; from information it finds in the sentence. Based on an analysis of the types of sentences observed in the ATIS corpus, we devised four templates that account for most of the data: flight, fare, ground transportation, and meanings of codes and headings. We have recently added several new templates, including aircraft, city, airline, and airport. Templates consist of slots which the Template Matcher fills with information contained in the user input. Slots are filled by looking through the sentence for particular kinds of short phrases. For example, &quot;from&quot; followed by an airport or city name will cause the &quot;origin&quot; slot to be filled with the appropriate name. The sentence &quot;Show me all the United flights Boston to Dallas nonstop on the third of November leaving after four in the afternoon&quot;</Paragraph>
    <Paragraph position="7"> would generate the following flight template: [flight, [stops, nonstop], [airline, UA], [origin, BOSTON], [destination, DALLAS], [departing_after, [1600]], [date, [november, 3, current_year]]] The template score is basically the percentage of words in the sentence that contribute in some way to the building of that template. Given an input sentence, the Template Matcher constructs one template of each sort, and the one with the best score is used to construct the database query, provided its score is greater than a certain &quot;cut-off&quot; parameter. The cut-off parameter is what permits the risk trade-off mentioned above: the higher the cut-off, the more conservative the system is in attempting to produce a response. Words can contribute to a score in different ways: words that fill a slot (e.g., &quot;Boston&quot;) add to the score, and words that help get a slot filled (e.g., &quot;from&quot;) also add to the score. Some words may not contribute to the interpretation, but nonetheless confirm the choice of a particular template (e.g., &quot;downtown&quot; for the ground transportation template), and hence are added to the score for that template.</Paragraph>
    <Paragraph position="8"> Other words are ignored for the purposes of scoring (e.g., &quot;and,&quot; &quot;please,&quot; &quot;ok,&quot; and &quot;show&quot;), since they do not tend to confirm particular templates.</Paragraph>
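The slot-filling and scoring scheme described above might be sketched roughly as follows. This is only an illustration under stated assumptions, not the SRI implementation: the tiny lexicon, the word lists, and the function names are hypothetical, and "ignore" words are treated here as accounted for (one possible reading, following the later comparison with Phoenix).

```python
import re

# Hypothetical mini-lexicon; the actual lexicon is not given in the paper.
CITIES = {"boston": "BOSTON", "dallas": "DALLAS"}
AIRLINES = {"united": "UA"}
IGNORE_WORDS = {"show", "me", "all", "the", "please", "and", "ok"}
CONFIRM_WORDS = {"flight", "flights"}


def match_flight_template(sentence):
    """Fill flight-template slots and return (slots, score)."""
    words = re.findall(r"[a-z0-9]+", sentence.lower())
    slots = {}
    used = set()  # indices of words that contribute to the template
    for i, word in enumerate(words):
        nxt = words[i + 1] if i + 1 != len(words) else None
        if word in ("from", "to") and nxt in CITIES:
            slot = "origin" if word == "from" else "destination"
            slots[slot] = CITIES[nxt]
            used.update({i, i + 1})  # the preposition and the city both count
        elif word in AIRLINES:
            slots["airline"] = AIRLINES[word]
            used.add(i)
        elif word == "nonstop":
            slots["stops"] = "nonstop"
            used.add(i)
        elif word in IGNORE_WORDS or word in CONFIRM_WORDS:
            used.add(i)
    # Score: fraction of the words in the sentence that were accounted for.
    score = len(used) / len(words) if words else 0.0
    return slots, score


def interpret(sentence, cutoff=0.8):
    """Return the filled slots only if the score exceeds the cut-off."""
    slots, score = match_flight_template(sentence)
    return slots if score > cutoff else None
```

With this toy lexicon, a sentence whose words are all accounted for scores 1.0, while a sentence with several unknown words (such as "cheapest" or "Tuesday") can fall below the cut-off and receive no answer.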
    <Paragraph position="9"> In certain cases the Template Matcher may modify the basic score of a template. Each template has a set of key words (or key phrases). The presence of these words or phrases in a sentence is a strong indication that the associated template is the appropriate one for that sentence. For the flight template, the key words include words like &quot;flight,&quot; &quot;fly,&quot; and &quot;go&quot;; for the fare template, words and phrases such as &quot;how much,&quot; &quot;fare,&quot; and &quot;price&quot; are examples; for the meaning template, examples include &quot;what is,&quot; &quot;explain,&quot; and &quot;define.&quot; If none of a template's key words are present in a sentence, then that template's score is docked by a certain keyword punishment factor, which varies from template to template. In most cases the lack of a key word will prevent the associated template from scoring above the cut-off.</Paragraph>
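A minimal sketch of the key-word adjustment is given below. The key-word lists extend the examples quoted above, but the penalty values, the multiplicative form of the penalty, and the function name are assumptions made for illustration; the paper does not state how the punishment factor is applied.

```python
import re

# Key words taken from the examples in the text, plus a plural form (assumed).
KEYWORDS = {
    "flight": ("flight", "flights", "fly", "go"),
    "fare": ("how much", "fare", "price"),
    "meaning": ("what is", "explain", "define"),
}
# Hypothetical per-template punishment factors (not from the paper).
PENALTY = {"flight": 0.5, "fare": 0.6, "meaning": 0.4}


def apply_keyword_penalty(template, base_score, sentence):
    """Dock the score of a template when none of its key words occur."""
    # Pad with spaces so that key words and key phrases match whole words only.
    text = " " + " ".join(re.findall(r"[a-z0-9]+", sentence.lower())) + " "
    has_keyword = any(" " + key + " " in text for key in KEYWORDS[template])
    return base_score if has_keyword else base_score * PENALTY[template]
```

Under these assumed factors, a template with a base score of 0.9 but no key word drops well below a cut-off of 0.8, which matches the observation that a missing key word usually keeps a template from being chosen.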
    <Paragraph position="10"> There are two situations in which the Template Matcher will &quot;abort&quot; a given template, that is, give it a score of zero and cease processing it. First, if the system tries to fill a slot in a certain template with two different values, that template is aborted. Since we have no better than a fifty-fifty chance of guessing which is the correct filler, we are better off not attempting any answer. Second, if a template has no slots filled, it will receive a score of zero. This restriction is relaxed when the Template Matcher is operating in &quot;context-dependent&quot; mode, where follow-up questions are expected. A query like &quot;show me the fares,&quot; which would not fill any slots, would be much more likely as a follow-up question than as a context-independent query.</Paragraph>
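The two abort conditions can be illustrated with a short sketch. The function name, the representation of fill attempts as (slot, value) pairs, and the context_dependent flag are hypothetical choices made for this example.

```python
def score_with_aborts(fill_attempts, base_score, context_dependent=False):
    """Apply the two abort rules to a template.

    fill_attempts: list of (slot, value) pairs found in the sentence.
    Returns (slots, score); an aborted template gets a score of zero.
    """
    slots = {}
    for slot, value in fill_attempts:
        if slot in slots and slots[slot] != value:
            return {}, 0.0  # two different fillers for one slot: abort
        slots[slot] = value
    if not slots and not context_dependent:
        return {}, 0.0  # no slot filled: abort unless follow-ups are expected
    return slots, base_score
```

In context-dependent mode, a query that fills no slots (for example, a bare request for fares) keeps its base score instead of being zeroed.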
    <Paragraph position="11"> Comparison with Other Systems Systems using the basic idea behind the Template Matcher go back at least as far as the SAM system at Yale [2], and include the Phoenix system at CMU [3, 4] and the SCISOR system at General Electric [5] as recent examples. There is also a degree of similarity to &quot;case-frame&quot;-based parsing methods [6, 7]. The main distinction is that the slots in our templates are domain-specific concepts rather than general linguistic or conceptual cases.</Paragraph>
    <Paragraph position="12"> Of these precursors, the Phoenix system seems most similar to the Template Matcher. Like the Template Matcher, the Phoenix system has templates (which they call &quot;frames&quot;) with slots that get filled with information from the sentence. The scoring mechanisms of the two systems are similar, but not identical. For both, the basic score of an interpretation is the number of words in the sentence that the interpretation accounts for. In the Phoenix system, for a word in a sentence to count for an interpretation's score, it must help fill some slot in that interpretation's frame. For the Template Matcher, the word will also count if it is an &quot;ignore&quot; or &quot;confirm&quot; word as discussed above.</Paragraph>
    <Paragraph position="13"> There are several other differences between the scoring mechanisms of the two systems: The Template Matcher punishes templates that do not have a keyword present in the sentence, and the Template Matcher requires that at least one slot in a template be filled. Also, the two systems behave differently when an attempt is made to fill a single slot with two different fillers. The Template Matcher will abort a template if this happens, while the Phoenix system will fill the slot with the second of the two possible fillers. The latter approach will handle certain types of false starts, but might be expected to yield more incorrect answers in other situations. Finally, CMU is not currently using a cutoff to weed out bad interpretations, although given the existence of a scoring mechanism in their system, this is something they clearly could do.</Paragraph>
    <Section position="1" start_page="190" end_page="191" type="sub_section">
      <SectionTitle>
Results
</SectionTitle>
      <Paragraph position="0"> After two weeks of development this system was tested on the June 1990 ATIS test set. This was a fair test to the extent that the implementor of the matching routines and the templates themselves (Jackson) had not examined the data from this test set prior to the evaluation. (Moore had noted, however, that the test set queries seemed amenable to a template-matching approach). For various values of the cut-off parameter we obtained the results shown in the following table.</Paragraph>
      <Paragraph position="1"> (These results were determined by visual inspection of the templates; the database retrieval code was not implemented at this point.) The conclusion we drew from this test is that a template-matching approach could quickly yield results that were competitive with some of the better results reported in the original June 1990 ATIS test.</Paragraph>
      <Paragraph position="2"> After completing the implementation of the system and extensive development using the ATIS training data, we used the Template Matcher for the February 1991 ATIS class A evaluation, in both the NL and SLS tests.</Paragraph>
      <Paragraph position="3"> The results as measured by NIST are shown below.</Paragraph>
      <Paragraph position="4">  We used a cut-off of 0.8 for this evaluation, as we had previously determined from training data that this value should come close to optimizing the number of right answers minus the number of wrong answers.</Paragraph>
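The selection of a cut-off that maximizes right answers minus wrong answers could be sketched as below. The function name and the form of the training data (pairs of template score and correctness) are assumptions for illustration; the paper does not describe the actual tuning procedure.

```python
def best_cutoff(scored_examples, candidates):
    """Pick the cut-off maximizing (right answers - wrong answers).

    scored_examples: list of (score, is_correct) pairs from training data.
    candidates: cut-off values to try.
    """
    def net_right(cutoff):
        # Only interpretations scoring above the cut-off produce an answer.
        answered = [ok for score, ok in scored_examples if score > cutoff]
        right = sum(answered)
        wrong = len(answered) - right
        return right - wrong

    return max(candidates, key=net_right)


# Invented toy data, purely for illustration (not results from the paper).
toy = [(1.0, True), (0.95, True), (0.9, True), (0.85, False),
       (0.85, True), (0.7, False), (0.6, False), (0.5, True)]
chosen = best_cutoff(toy, [0.4, 0.6, 0.8, 0.9])
```

Raising the cut-off removes wrong answers at the cost of some right ones, so the objective typically peaks at an intermediate value.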
      <Paragraph position="5"> The system for the SLS tests was a serial connection of the version of SRI's DECIPHER system used in the ATIS SPREC evaluation and the Template Matcher described above. The answers reported in the SPREC evaluation were edited to be in lexical SNOR format and run through the Template Matcher exactly as in the NL tests. It is interesting to note the relatively small degradation from the NL to the SLS results, despite an 18.0 percent word error rate in the speech recognition; this seems to indicate the robustness of the Template Matcher to recognition errors.</Paragraph>
      <Paragraph position="6"> We had not planned to participate in the D1 evaluation, but at the request of NIST, we did those tests as well, taking context into account by using the answer to the first query in the D1 pair to restrict the database search in answering the second query, the same technique used in our ATIS demo system. In addition, the Template Matcher was run in context-dependent mode for the second query of each D1 pair. The results on the second queries of the pairs as measured by NIST are shown in the table below.</Paragraph>
      <Paragraph position="7">
Test      Right  Wrong  No Answer
NL only      22      3         13
SLS          15     11         12
We have not yet analyzed why there was a greater degradation in going from the NL to the SLS results in the D1 tests.</Paragraph>
    </Section>
    <Section position="2" start_page="191" end_page="192" type="sub_section">
      <SectionTitle>
Limitations
</SectionTitle>
      <Paragraph position="0"> In this section, we discuss some sentences that cause problems for the Template Matcher that are not easily resolvable.</Paragraph>
      <Paragraph position="1"> Show me flights returning from Dallas into San Francisco by ten P M.</Paragraph>
      <Paragraph position="2"> This sentence is a good example of the need for syntactic information. The problem is that the Template Matcher cannot tell that the phrase &quot;by ten P M&quot; modifies &quot;returning,&quot; and thus constrains the arrival time. By default, it treats the &quot;by&quot; phrase as restricting the departure time, and thus misinterprets the query.</Paragraph>
      <Paragraph position="3"> What is an A fare? The problem here is that &quot;A&quot; is ambiguous; it may be either the indefinite article or a fare class code. We have been forced to leave the fare class code &quot;A&quot; out of the Template Matcher lexicon. Adding it would do more harm than good, for we would then misinterpret every occurrence of the phrase &quot;a fare&quot; (with the indefinite article), as in &quot;Give me a fare from Boston to Dallas.&quot; Syntactic information could help resolve this ambiguity, as could speech information, since the determiner &quot;a&quot; and the letter &quot;A&quot; have different acoustic properties.</Paragraph>
      <Paragraph position="4"> List the fares for Delta flight eight oh seven and Delta flight six twenty one from Dallas to Denver. Conjunctions of complex noun phrases are beyond the scope of the Template Matcher as it currently stands.</Paragraph>
      <Paragraph position="5"> The system could be modified to handle such phenomena, but an analytical grammar might be the more natural tool for the job.</Paragraph>
      <Paragraph position="6"> Do you have to take a Y N flight only at night? This is an example of a sentence where all the words contribute to a certain template (the flight template, in this case) and yet that template is not the correct one. A New Architecture As the examples in the previous section suggest, the Template Matcher by itself is probably not the ultimate solution to the problem of robust interpretation of natural-language queries. We believe that the template-matching approach and an analytical parser-based approach have complementary strengths and that an approach that combines both of them is likely to be ultimately superior to either one alone. We have therefore begun developing a new architecture for language processing in spoken language systems that combines the two approaches. Our basic strategy will be to use the analysis produced by the parser whenever we can, but to fall back on the Template Matcher when the parser-based system fails to produce a complete analysis. It is our conjecture, supported at least in part by the best results reported in the June 1990 ATIS evaluation, that an analytical, parser-based approach can be designed so that when it succeeds in providing a complete analysis of the input, that analysis has a very high probability of being correct. With the Template Matcher it seems that there will inevitably be a larger possibility for error, because it uses strictly less of the information available in the utterance than a parser. In particular, our Template Matcher can ignore words; it ignores order; and it has almost no notion of structure. By using the Template Matcher as a backup to the parser-based system, we eliminate the possibility of the Template Matcher getting a wrong interpretation of something that could be successfully analyzed by the parser.</Paragraph>
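The fallback strategy described above can be summarized in a few lines. This is a sketch of the control flow only: the parse and template_match callables, the function name, and the return labels are hypothetical stand-ins for components whose interfaces the paper does not specify.

```python
def interpret_with_fallback(utterance, parse, template_match, cutoff=0.8):
    """Prefer a complete parse; otherwise fall back on the Template Matcher.

    parse(utterance) is assumed to return None when no complete analysis
    is found; template_match(utterance) is assumed to return (template, score).
    """
    analysis = parse(utterance)
    if analysis is not None:
        return ("parser", analysis)
    template, score = template_match(utterance)
    if score > cutoff:
        return ("template_matcher", template)
    return ("no_answer", None)
```

Because the Template Matcher runs only when parsing fails, it can never override an interpretation that the parser produced.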
      <Paragraph position="7"> A second reason for running the Template Matcher after the parser is to enable the Template Matcher to use partial results of parsing in its operation. Our current Template Matcher uses only single words and fixed phrases as key words or slot fillers. We are in the process of extending the Template Matcher so that it uses whole phrases that have been identified by the parser in attempting to analyze the entire utterance. For example, we saw that the Template Matcher is unable to analyze a phrase as complex as &quot;returning from Dallas into San Francisco by ten P M.&quot; Generalized to work from parsed phrases, the Template Matcher might be able to successfully interpret a complex utterance containing this phrase even if the entire utterance could not be parsed. Additionally, running the Template Matcher on parsed phrases should cut down on the sheer number of particular word patterns that have to be included in the template specifications.</Paragraph>
      <Paragraph position="8"> The use of robust interpretation methods changes the way in which the constraints embodied in a grammar are viewed. They must be treated as soft, rather than hard, constraints. This has significant implications for the rest of a spoken language system. If we want the parser to find grammatical fragments of the input that may be of use to the Template Matcher, then the parsing algorithm we previously used, which imposed strong left-context constraints, is no longer appropriate. We want something closer to pure bottom-up parsing to find all the phrases that the Template Matcher might use. We have developed such a parser, whose details are outlined in another paper for this workshop [1].</Paragraph>
      <Paragraph position="9"> Perhaps the most significant consequence of using robust interpretation methods in a spoken language system, however, is that the failure to find a complete parse can no longer be used as a hard constraint to reduce perplexity for the speech recognizer. An analytical grammar nevertheless contains valuable information that should be used by the recognizer. We feel that one promising approach to making use of this information is to extend the idea of a word-based statistical language model, such as a bi-gram model, to a phrase-based statistical language model, e.g., a &quot;bi-phrase&quot; model. The idea is simply to estimate the probability of occurrence of a particular type of phrase conditioned on the type of phrase that precedes it. To make this work effectively, however, it is important to include some lexical information in the categorization of phrases, usually information about the lexical head of the phrase.</Paragraph>
      <Paragraph position="10"> The ability of such a framework to capture long distance constraints not captured by N-gram models is illustrated by an utterance such as &quot;What airlines that serve Boston fly 747s?&quot; If we want to predict the likelihood of &quot;fly&quot; occurring in this context, the preceding word &quot;Boston&quot; gives us essentially no information. If, however, we have identified &quot;What airlines that serve Boston&quot; as a noun phrase whose lexical head is &quot;airlines&quot; then the likelihood of a verb whose lexical head is &quot;fly&quot; should be relatively high.</Paragraph>
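One way to picture the bi-phrase idea is as a bigram model over phrase labels that carry their lexical heads. The sketch below uses unsmoothed relative-frequency estimates; the function names, the label format (phrase type paired with head word), and the toy sequences are all assumptions for illustration and are not taken from the paper.

```python
from collections import Counter, defaultdict


def train_biphrase(sequences):
    """Count phrase-to-phrase transitions.

    sequences: one list per utterance of (phrase_type, lexical_head) labels.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1
    return counts


def biphrase_prob(counts, prev, cur):
    """Relative-frequency estimate of P(cur | prev); 0.0 if prev is unseen."""
    total = sum(counts[prev].values())
    return counts[prev][cur] / total if total else 0.0


# Invented toy phrase sequences, purely for illustration.
toy_sequences = [
    [("NP", "airlines"), ("V", "fly"), ("NP", "747s")],
    [("NP", "airlines"), ("V", "serve"), ("NP", "boston")],
    [("NP", "airlines"), ("V", "fly"), ("NP", "dc10s")],
]
model = train_biphrase(toy_sequences)
p_fly = biphrase_prob(model, ("NP", "airlines"), ("V", "fly"))
```

Conditioning on the whole noun phrase headed by "airlines", rather than on the single preceding word, is what lets such a model favor a following verb like "fly".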
      <Paragraph position="11"> The incorporation of a probabilistic element into the system raises a number of other interesting possibilities, including incorporation of probabilistic scoring based on observations of likelihoods of particular templates for sentences in the corpus, of particular slots for each template, and of particular words for each slot; and the possibility of using the Template Matcher itself as the basis of a statistical language model to guide recognition.</Paragraph>
      <Paragraph position="12"> Summary In sum, the Template Matcher represents a complementary approach to traditional natural-language processing. It has the virtues of robustness and broad coverage of many linguistic variants for requests for specific types of information. Although we have not discussed the issue of computational efficiency in this paper, the Template Matcher is noticeably faster than a typical parser. The approach also has the advantage of rapid development time, which should enhance portability to new domains.</Paragraph>
    </Section>
  </Section>
</Paper>