File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/96/c96-1068_metho.xml
Size: 13,266 bytes
Last Modified: 2025-10-06 14:14:14
<?xml version="1.0" standalone="yes"?> <Paper uid="C96-1068"> <Title>A CD-ROM Retrieval System with Multiple Dialogue Agents</Title> <Section position="3" start_page="400" end_page="400" type="metho"> <SectionTitle> 2 The baseline system: TARSAN </SectionTitle> <Paragraph position="0"> We have been constructing a spoken dialogue system, named TARSAN, which retrieves information from a large amount of text contained in CD-ROMs (Sakai et al., 94; Sakai et al., 95). Figure 1 shows the configuration of the baseline system TARSAN for multiple domains.</Paragraph> <Paragraph position="2"> TARSAN retrieves the information using the following processes: 1. The input analyzer analyzes the result of the speech recognition or the sentence received from the keyboard.</Paragraph> <Paragraph position="3"> 2. The intention extractor extracts the user's intention (i.e. question, answer, condition change, and so on) based on the analysis of the modality.</Paragraph> <Paragraph position="4"> 3. The utterance pair controller deals not only with a simple QA pair but also with follow-up questions, based on utterance pair control.</Paragraph> <Paragraph position="5"> 4. The retrieval condition maker makes retrieval conditions, which are sent to the full text retrieval process by the dialogue controller described below. The retrieval conditions are created by referring to the 'text-models', which define the relation between the input words and the retrieval conditions.</Paragraph> <Paragraph position="6"> 5. The paraphraser translates various expressions of the inputs into a single domain-oriented concept.</Paragraph> <Paragraph position="7"> 6. The dialogue controller determines the system's behavior (to retrieve and answer the result, or to request more retrieval conditions from the user) by referring to the retrieval conditions and the dialogue strategy.</Paragraph> <Paragraph position="8"> 7. 
The output generator generates the output sentence to be announced by the text-to-speech process and the information to be displayed on the monitor.</Paragraph> <Paragraph position="9"> Our current system TARSAN is able to access the following four CD-ROMs: CD-ROM1: sight-seeing information in Japan (i.e. name, location, explanation, and so on of temples, hot springs, golf courses, and so on) (Kosaido, 90).</Paragraph> <Paragraph position="10"> CD-ROM2: hotel information in Japan (i.e.</Paragraph> <Paragraph position="11"> name, telephone number, room charges, equipment, and so on) (JTB, 92).</Paragraph> <Paragraph position="12"> CD-ROM3: Japanese and foreign cinema information (i.e. title, cast, director, story, and so on) (PIA, 90).</Paragraph> <Paragraph position="13"> CD-ROM4: Japanese professional baseball player information (i.e. name, team, records, and so on) (Nichigai, 90).</Paragraph> <Paragraph position="14"> TARSAN treats CD-ROM1 and CD-ROM2 as a single travel domain, CD-ROM3 as a cinema domain, and CD-ROM4 as a baseball domain.</Paragraph> </Section> <Section position="4" start_page="400" end_page="401" type="metho"> <SectionTitle> 3 Problems </SectionTitle> <Paragraph position="0"> As we described in the introduction, we have addressed three main problems in our dialogue system. Two problems derive from the extension of the system to multiple domains. The last one derives from the single-path contextual management. 1. The first problem is that the user mistakenly believes that information spread across several data sources can be obtained by a single input sentence. The following are examples of requests across domains. The first example spans the cinema domain and the travel domain, and the second example spans the baseball domain and the cinema domain.</Paragraph> <Paragraph position="1"> Example 1: &quot;Yamaguchi Momoe 
ga shuen sita eiga no butai ni natta onsen wo shiritai.&quot; (I want to know the hot spring which is the scene of the cinema whose star is Yamaguchi Momoe.) Example 2: &quot;[...] shutsuen sita eiga wo oshiete.&quot; (Tell me the cinema where an actor who was a professional baseball player performs.) 2. The second problem is that the user misunderstands that the system has an all-powerful strategy if it has a robust strategy for a certain purpose. Suppose that several discourse strategies exist in a single dialogue agent: one is a very sophisticated but very goal-specific strategy which allows the user to reach the goal immediately, and another is a very simple but redundant strategy which has the ability to achieve any kind of goal. In this case, the user may confuse the potential of these strategies and feel uncomfortable about the gap.</Paragraph> <Paragraph position="2"> 3. The last problem is that the user has to manage multiple contexts concerning multiple goals, because the system is not robust enough for anaphora and manages only a single context. This makes it hard for the user to use the system. Table 1 is an example in which the user compares information between Hakone and Nikko. The example shows that the user has managed the context himself, which seems very complicated.</Paragraph> <Paragraph position="3"> We have also assumed that these three problems arise because the system has only a single dialogue agent. A single dialogue agent usually deals with everything, and this makes it invisible to the user what the system can or cannot do. Thus, we propose a new dialogue system with multiple agents, which makes the system's abilities more visible to the user.</Paragraph> </Section> <Section position="5" start_page="401" end_page="403" type="metho"> <SectionTitle> 4 Dialogue system with multiple dialogue agents </SectionTitle> <Paragraph position="0"> In this section, we introduce a new dialogue system with multiple dialogue agents. 
The purpose is to make the user aware of what the system can or cannot do. In our system, three types of dialogue agents are realized: 1) for each domain, 2) for each strategy, and 3) for each context. Here, we call these agents 1) domain agents, 2) strategy agents, and 3) context agents, respectively.</Paragraph> <Paragraph position="1"> Figure 2 shows a brief sketch of these three types of agents. These agents take turns and play their roles according to the discourse situations. The details of these agents are described in the following subsections.</Paragraph> <Paragraph position="2"> Table 1. usr1: Hakone ni aru onsen wo oshiete. (Tell me the hot springs in Hakone Town.) sys1: 16 ken arimasu. (There are 16 hot springs.)</Paragraph> <Paragraph position="3"> usr2: Nikko deha. (How about in Nikko?) sys2: Chuuzenji onsen, Nikko yumoto onsen ga arimasu. (There are Chuuzenji onsen and Nikko yumoto onsen.)</Paragraph> <Paragraph position="4"> usr3: Hakone niha jiin ga arimasuka. (Are there any temples in Hakone?)</Paragraph> <Paragraph position="5"> sys3: Amida dera, Kuduryu Myojin, Saunji nado 7 ken arimasu. (There are 7 temples: Amida dera, Kuduryu Myojin, Saunji, and so on.)</Paragraph> <Paragraph position="6"> usr4: Nikko niha. (How about in Nikko?)</Paragraph> <Section position="1" start_page="401" end_page="402" type="sub_section"> <SectionTitle> 4.1 The domain agents </SectionTitle> <Paragraph position="0"> To solve the first problem, we realized domain agents, which perform information retrieval in each different domain. Figure 3 shows a brief sketch of the domain agents. The domain agents perform the basic interaction between the user and the system to retrieve the information in the basic manner specific to each domain. In every domain agent, indispensable and basic conditions for information retrieval are defined. 
Using these conditions, the domain agent communicates with the user and performs the information retrieval.</Paragraph> <Paragraph position="1"> When the user's input moves from one domain to another, the domain agent also changes. Thus, with the domain agents, the user is made aware of the boundary between the domains. We expect this mechanism to prevent the user from asking questions across multiple unintegrated domains. For example, in the case of Example 1 in Section 3, the two agents dealing with the cinema domain and the travel domain each try to take their own action, as Table 2 shows (see footnote 2). Thus, the user will be aware of the boundary between the two domains.</Paragraph> </Section> <Section position="2" start_page="402" end_page="402" type="sub_section"> <SectionTitle> 4.2 The strategy agents </SectionTitle> <Paragraph position="0"> To solve the second problem, we realized the strategy agents, which perform information retrieval according to each specific retrieval strategy. Figure 4 shows a brief sketch of the strategy agents. The strategy agents handle the interaction between the user and the system to retrieve the information in the manner specific to each task. In every strategy agent, task-specific conditions for the information retrieval are defined. Using the task-specific conditions, the strategy agent is able to use the default conditions specific to the task and to give advice or choices to the user. 
Thus, with the strategy agents, the user is made aware of the strategy which is specific to the task, and this mechanism prevents the user from using the task-specific strategy for other tasks.</Paragraph> <Paragraph position="1"> (Footnote 2: The travel agent is able to retrieve and find &quot;the hot spring which is the scene of Izu no odoriko&quot;.) In the current system, there are two strategy agents for the travel domain:</Paragraph> <Paragraph position="2"> business trip strategy agent: the indispensable condition for the input is the destination, and the optional conditions are the room charge and the circumstances. When the optional conditions are not defined by the user, the strategy agent will recommend some choices to the user. The default responses in this task are the name of the hotel and its telephone number.</Paragraph> <Paragraph position="3"> recreation strategy agent: the indispensable conditions for the input are the recreation equipment and the number of participants, and the other conditions are optional. When the optional conditions are not defined by the user, the strategy agent will recommend some choices to the user. The default responses in this task are also the name of the hotel and its telephone number.</Paragraph> <Paragraph position="4"> These strategy agents not only allow the user to use the system easily but also help the user to be aware of the characteristics of the dialogue strategy specific to the task.</Paragraph> <Paragraph position="5"> Table 3 compares the dialogue using the domain agent for travel with the dialogue using the business trip strategy agent. 
As you can see from the table, a more friendly discourse is achieved when using the strategy agent.</Paragraph> </Section> <Section position="3" start_page="402" end_page="403" type="sub_section"> <SectionTitle> 4.3 The context agents </SectionTitle> <Paragraph position="0"> To solve the last problem, we realized the context agents, which perform information retrieval dependent on different contexts. A context agent is defined when the user moves from one context to another. Figure 5 shows a brief sketch of the context agents. Using the context agents, the user can easily compare the results relating to multiple goals.</Paragraph> <Paragraph position="1"> Table 4 is the dialogue which aims at the same goal as Table 1 by using multiple context agents. As you can see from Tables 1 and 4, a more simplified discourse is achieved when using context agents.</Paragraph> <Paragraph position="2"> Table 3 (dialogues using the domain agent and the strategy agent). Domain agent for travel: agt0: Hai, Ryokou eejento desu.</Paragraph> <Paragraph position="3"> (Hello, I'm travel agent.) usr1: Tottori shi no hoteru wo shiritai.</Paragraph> <Paragraph position="4"> (I want to know the hotels in Tottori city.) agt1: 34 ken arimasu. (34 rows exist.) usr2: Ekimae ni aru hoteru.</Paragraph> <Paragraph position="5"> (Land circumstance is near the station.) agt2: 8 ken arimasu. (8 rows exist.) usr3: shukuhaku ryou ga 8000yen ika deha.</Paragraph> <Paragraph position="6"> (And room charge is under 8000yen.) agt3: Hoka no jouken &quot;Ricchi jouken ga ekimae&quot; wo nokoshi masuka.</Paragraph> <Paragraph position="7"> (Do you keep another condition, &quot;Land circumstance is near the station&quot;?) usr4: Hai. (Yes.) agt4: AAA hoteru, hoteru BBB desu.</Paragraph> <Paragraph position="8"> (AAA hotel and hotel BBB exist.) usr5: Sono denwabangou wo oshiete.</Paragraph> <Paragraph position="9"> (Tell me the telephone numbers.) 
agt5: AAA hoteru no denwabangou ha XXX-XXXX, hoteru BBB no denwabangou ha YYY-YYYY desu.</Paragraph> <Paragraph position="10"> (The telephone number of AAA hotel is XXX-XXXX, and the one of hotel BBB is YYY-YYYY.) Business trip strategy agent: Hai, Shucchou eejento desu. Dochira ni shucchou nasai masu ka.</Paragraph> <Paragraph position="11"> (Hello, I'm business trip agent. Where are you going to go?) Tottori shi desu. (Tottori city.) Ricchi jouken ha ekimae, shukuhaku ryou ha</Paragraph> </Section> </Section> </Paper>
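<!-- Editorial sketch, not part of the original paper. The two strategy agents of Section 4.2 can be pictured as small records of indispensable conditions, optional conditions, and default responses. The Python below is a minimal illustration under stated assumptions: the class name StrategyAgent, the method next_action, and the action labels "ask", "recommend", and "retrieve" are all hypothetical and do not come from TARSAN. The paper does not list the optional conditions of the recreation agent, so that list is left empty here.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class StrategyAgent:
    """Hypothetical model of a strategy agent as described in Section 4.2."""
    name: str
    required: List[str]          # indispensable conditions
    optional: List[str]          # optional conditions
    default_response: List[str]  # fields answered by default

    def next_action(self, conditions: Dict[str, str]) -> Tuple[str, List[str]]:
        """Decide what the agent does given the conditions stated so far."""
        # Indispensable conditions must be supplied by the user first.
        missing = [c for c in self.required if c not in conditions]
        if missing:
            return ("ask", missing)
        # For undefined optional conditions the agent recommends choices.
        unset = [c for c in self.optional if c not in conditions]
        if unset:
            return ("recommend", unset)
        # Everything is defined: retrieve and give the default responses.
        return ("retrieve", self.default_response)


BUSINESS_TRIP = StrategyAgent(
    name="business trip",
    required=["destination"],
    optional=["room charge", "circumstances"],
    default_response=["hotel name", "telephone number"],
)

RECREATION = StrategyAgent(
    name="recreation",
    required=["recreation equipment", "number of participants"],
    optional=[],  # assumption: the paper does not enumerate these
    default_response=["hotel name", "telephone number"],
)

if __name__ == "__main__":
    # The user has given only the destination, as in Table 3.
    print(BUSINESS_TRIP.next_action({"destination": "Tottori city"}))
```

Keeping the conditions as plain data rather than as code paths is one possible way to make each agent's task-specific behaviour easy to inspect, which is in the spirit of making the system's abilities visible to the user. -->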