<?xml version="1.0" standalone="yes"?>
<Paper uid="H94-1037">
  <Title>PEGASUS: A Spoken Language Interface for On-Line Air Travel Planning</Title>
  <Section position="5" start_page="203" end_page="204" type="evalu">
    <SectionTitle>
EVALUATION
</SectionTitle>
    <Paragraph position="0"> PEGASUS first came into being in January 1993. Since then, we have been actively improving and extending its capabilities. Thus the system is in a constant state of flux - deficiencies are corrected as new capabilities are introduced. Nevertheless, it is fully functional in the sense that members of our group have been able to use it to make actual travel arrangements since last spring, using naturally spoken English. Even though it is definitely premature to accurately assess the usefulness of the system, we have recently begun to formally monitor its performance longitudinally by keeping a time-stamped log file of all transactions. In this section, we will present some very preliminary results on the system's performance since early fall, 1993. The results are obtained from ten bookings made by eight members of our group in order to satisfy their real travel needs. All of them represented round-trip bookings from one city to another. In some cases, the time for travel was important whereas in others, the cheapest airfare was desired. Seven of the ten bookings were successfully completed. Statistics on some of the objective measures for the successfully completed bookings are shown in Table 1.</Paragraph>
    <Paragraph position="1"> Averaged across all six subjects who completed the bookings successfully, it took almost 25 queries and more than 13 minutes for the subjects to complete a booking.</Paragraph>
    <Paragraph position="2"> It is interesting, however, to compare the statistics of the three experienced users with those of the other three, who were using the system for the first time. Compared to the naive users, the experienced users completed the bookings with considerably less effort, using fewer than one-third of the queries and taking one-fourth of the time. The variations in their performance are also considerably smaller. In general, one can expect the system's performance on totally naive subjects to degrade.</Paragraph>
    <Paragraph position="3"> On the other hand, the results give us hope that experienced travellers can learn to put PEGASUS to productive use, once they become familiar with its capabilities.</Paragraph>
    <Paragraph position="4"> We also examined the log files for the three unsuccessful bookings in order to identify the system's shortcomings. In one case, the user successfully completed the forward leg of a trip, but the system booked an erroneous return leg, causing him to start over. He cleared the discourse history, but did not explicitly cancel the booking on EAASY SABRE. Thus, even though the user successfully booked the flights he wanted, EAASY SABRE was unable to reconcile the double booking on the forward leg. In the second case, the user initially selected a fare that was incompatible with his travel plans. He did not successfully cancel his initial reservation or clear the discourse history. The system continued to enforce the restrictions on the previous fare, even though he attempted to rebook with an unrestricted fare. In the third case, the discount fare selected for the forward leg was not available on the return flight. Both the second and third users eventually gave up in frustration.</Paragraph>
    <Paragraph position="5"> Since mid-January, we have begun to save the speech waveform in addition to the log file. We were thus also able to measure the system's speech recognition performance. The word and sentence recognition error rates for these bookings were found to be 10.6% and 28.6%, respectively.</Paragraph>
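The paper does not say how the two error rates were scored. As a minimal sketch, assuming the conventional definitions (word error rate as word-level edit distance divided by the number of reference words, sentence error rate as the fraction of utterances with at least one word error), the computation could look like the following. The function names and the whitespace tokenization are illustrative assumptions, not part of PEGASUS.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Minimum number of substitutions, insertions and deletions turning ref into hyp."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference words yields an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rates(pairs: List[Tuple[str, str]]) -> Tuple[float, float]:
    """Return (word error rate, sentence error rate) over (reference, hypothesis) pairs."""
    word_errors = 0
    ref_words = 0
    bad_sentences = 0
    for ref_text, hyp_text in pairs:
        ref, hyp = ref_text.split(), hyp_text.split()
        errors = edit_distance(ref, hyp)
        word_errors += errors
        ref_words += len(ref)
        bad_sentences += 1 if errors else 0
    return word_errors / ref_words, bad_sentences / len(pairs)
```

For instance, scoring the pair ("book the first one", "book the third one") together with one perfectly recognized five-word query gives one word error over nine reference words and one erroneous sentence out of two.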
  </Section>
  <Section position="6" start_page="204" end_page="205" type="evalu">
    <SectionTitle>
DISCUSSION AND FUTURE PLANS
</SectionTitle>
    <Paragraph position="0"> This paper describes our recent effort in developing a spoken language interface to an on-line, dynamic airline reservation system. By building on our ATIS development effort and paying particular attention to dialogue management, we were able to produce a working interface that enables users to make real flight bookings using spoken language.</Paragraph>
    <Paragraph position="1"> PEGASUS is the outcome of a new research strategy that we have adopted, one that strives to develop language-based technologies within the context of real application back-ends, rather than relying on mock-ups, however realistic they might be. We believe that this strategy will force us to confront some of the critical technical issues that may otherwise elude our attention, such as dialogue modelling and new word detection/learning.</Paragraph>
    <Paragraph position="2"> We also believe that the time is ripe for us to begin demonstrating the usefulness of these technologies. Working on real applications thus has the potential benefit of shortening the interval between technology demonstration and its ultimate use. Besides, real applications that can help people solve problems will be used by real users, thus providing us with a rich and continuing source of useful data.</Paragraph>
    <Paragraph position="3"> While we are encouraged by our initial success with PEGASUS, much work remains to be done. One of the major deficiencies of the system is its inability to gracefully coerce the user back on track when his/her request cannot be satisfied. A common problem arises when the cheapest fare that the user specified is not available on the selected return flight. The user must then choose among modifying the flight, the date, or the fare. Rather than leaving the user to explore all these dimensions freely and run the risk of confusion, a more productive solution may be for the system to take control of the dialogue by offering explicit choices. Of course, the user should still be free to diverge from the computer's goal whenever he/she so chooses.</Paragraph>
    <Paragraph position="4"> Until very recently, the system's knowledge has been limited to fewer than sixty major cities in North America, Europe, and Japan. We have just expanded PEGASUS's knowledge base to more than 220 major cities worldwide.</Paragraph>
    <Paragraph position="5"> Nevertheless, it is still a very small set considering that EAASY SABRE contains flight information for nearly two thousand cities worldwide. Rather than making all the cities, airports, and airlines available with equal probability at all times, we will explore ways to constrain the search while maintaining full flexibility. One possibility is to allow a user to customize the system to suit their needs. Thus, for example, a user could specify the cities and airlines that they care about, in much the same way they presently specify their frequent flyer number, seating preferences, and credit card information for billing. The system will need to be supplemented with tools that will enable users to interactively and incrementally add appropriate information. In addition, the system could also automatically adjust language probabilities based on the user's dialogue history.</Paragraph>
    <Paragraph position="6"> At the moment, the system can only book a single seat under the name of the user currently logged onto EAASY SABRE. In the future, we would like to add the capability of changing the name on the ticket, or booking multiple tickets for the user and accompanying family members, for example.</Paragraph>
    <Paragraph position="7"> The present implementation of PEGASUS assumes that information is provided to the user both visually and aurally. This assumption significantly affects the nature of the responses generated by PEGASUS. For example, the system will currently say, &quot;Here are the flights from Boston to San Francisco on October 20,&quot; and proceed to display them. We believe that there will be many occasions in which a user may be communicating with the system by telephone. In such a case, the information must be presented in a different manner (e.g., &quot;There are seventeen direct flights from Boston to San Francisco on October 20.&quot;). The resulting human-computer dialogue will be quite different from that in our current implementation. We intend to pursue such a &quot;displayless&quot; implementation in the future, eventually leading to the development of telephone-based applications.</Paragraph>
    <Paragraph position="8">  Our experience in designing PEGASUS has led us to the realization that considerable care must go into providing mechanisms to easily manage and maintain dialogue coherence. While our dialogue states are a convenient representation, the current mechanism for controlling them is becoming unwieldy, and therefore needs to be reorganized prior to adding some of the enhancements mentioned here.</Paragraph>
    <Paragraph position="9"> Through our experience in developing a preliminary version of PEGASUS, we discovered that the capability to specify the dialogue flow explicitly at some high level is necessary, in order to be able to understand and manage the dialogue effectively. To that end, we recently redesigned the PEGASUS control strategy, so that dialogue moves conditioned on prior states can be conveniently specified in tabular form.</Paragraph>
    <Paragraph position="10"> An example entry from our newest implementation is shown in Table 2. This entry states that when the user has just completed a successful booking, the system should examine the conditions in the order presented and take the appropriate action when one is met, setting the dialogue state to the new value, if appropriate. Thus, in our example, once a flight has been booked, the first thing the system does is check to see if there is a first-leg flight associated with the current one (i.e., &quot;Has-firstleg?&quot;). If so, the system performs the actions associated with concluding a booking (e.g., summarizing the flight information) and resets the dialogue state to anticipate a completely new exchange. If the first condition is not met, the system proceeds in the same manner through the others in the order given.</Paragraph>
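The ordered, table-driven control described above can be sketched as follows. This is a hypothetical illustration rather than the PEGASUS implementation: only the first-leg check is taken from the text, while the state names, action names, context keys, and the remaining rules are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Context = Dict[str, object]


@dataclass
class Rule:
    condition: Callable[[Context], bool]  # test evaluated against the dialogue context
    action: str                           # name of the action to perform
    next_state: Optional[str]             # new dialogue state, or None to keep the current one


# Hypothetical table: only the first-leg check ("Has-firstleg?") comes from the text.
DIALOGUE_TABLE: Dict[str, List[Rule]] = {
    "booking_completed": [
        # A first leg already exists, so this booking was the return: wrap up and reset.
        Rule(lambda ctx: bool(ctx.get("first_leg")), "conclude_booking", "initial"),
        # Assumed rule: a round trip was requested, so ask about the return flight.
        Rule(lambda ctx: bool(ctx.get("round_trip")), "prompt_for_return", "awaiting_return"),
        # Assumed default: a one-way trip is complete.
        Rule(lambda ctx: True, "conclude_booking", "initial"),
    ],
}


def next_move(state: str, ctx: Context) -> Tuple[Optional[str], str]:
    """Return (action, new_state) for the first rule whose condition holds."""
    for rule in DIALOGUE_TABLE.get(state, []):
        if rule.condition(ctx):
            return rule.action, rule.next_state or state
    return None, state  # no rule fired: take no action and keep the state
```

Because the rules are tried strictly in order, changing the dialogue flow means editing or reordering table rows rather than rewriting control code, which is the convenience the tabular form is meant to provide.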
    <Paragraph position="11"> Ultimately, we would like a dialogue framework that is domain independent. We have begun to define a dialogue-description language in which different types of user interactions can be represented. The terminal nodes of the grammar would be associated with user query classes.</Paragraph>
    <Paragraph position="12"> User interactions expected within a particular domain would be described in this meta-language, and that description would be used by the system to direct the human-machine interaction.</Paragraph>
    <Paragraph position="13"> There has been some theoretical work on the structure of human-human dialogue [12], but this has not yet led to effective insights for building human-machine interactive systems. We believe it should be possible to define a hierarchy of dialogue types: for example, the air travel dialogue is an instance of a more general transaction dialogue in which the user acquires information about the choices available, commits to a purchase, perhaps authorizes payment, and verifies the entire transaction. It should be possible to compile a domain-specific dialogue model from a general transaction dialogue framework and a description of the particular sub-domain.</Paragraph>
  </Section>
</Paper>