File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/94/c94-2197_metho.xml
Size: 16,961 bytes
Last Modified: 2025-10-06 14:13:48
<?xml version="1.0" standalone="yes"?> <Paper uid="C94-2197"> <Title>A Bayesian Approach for User Modeling in Dialogue Systems</Title> <Section position="4" start_page="7272" end_page="7272" type="metho"> <SectionTitle> 2 Bayesian Networks </SectionTitle> <Paragraph position="0"> Reasoning based on probability theory requires probabilistic models to be specified. In general, a complete probabilistic model is specified by the joint probabilities of all random variables in the domain. The problem is that the complete specification of the joint probabilities requires an absurdly large number of values. For example, consider the case where all random variables are binary, having a value 0 or 1; the complete probabilistic model is then specified by $2^n - 1$ joint probabilities. (Assuming $n$ binary random variables, $x_1, x_2, \ldots, x_n$, the distribution is specified by the probabilities</Paragraph> <Paragraph position="2"> $P(x_1, x_2, \ldots, x_n)$ of all $2^n$ value assignments, which sum up to unity, so one of them can be automatically obtained.) Moreover, in practice it is difficult to explicitly specify the joint probability. Concerning our purpose of modeling the user's knowledge, where a random variable corresponds to a concept and its value corresponds to the user's knowledge of the concept, it is almost impossible to specify all joint probabilities</Paragraph> <Paragraph position="3"> because this involves enumerating all of the user's knowledge patterns.</Paragraph> <Paragraph position="4"> Bayesian networks need far fewer probabilities and can still provide complete probabilistic models. The information that compensates for the gap is qualitative and is obtained by investigating the nature of the domain. The Bayesian network has both qualitative and quantitative characteristics; therefore, we can represent the knowledge qualitatively and reason about probability quantitatively. 
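To make the saving concrete, the following minimal sketch compares the two parameter counts. It assumes binary variables and that each node stores one probability per combination of values of the nodes it directly depends on; the ten-node chain and the function names `joint_parameter_count` and `network_parameter_count` are illustrative assumptions, not taken from the paper.

```python
def joint_parameter_count(n):
    """Free parameters of a full joint distribution over n binary variables."""
    return 2 ** n - 1


def network_parameter_count(parents):
    """Free parameters of a Bayesian network over binary variables.

    `parents` maps each node name to the list of nodes it directly depends on.
    A binary node needs one number, P(node = 1 | parent values), for every
    combination of parent values, i.e. 2 ** (number of parents) numbers.
    """
    return sum(2 ** len(p) for p in parents.values())


# Hypothetical example: a chain x1 -> x2 -> ... -> x10 (not from the paper).
chain = {"x1": []}
for i in range(2, 11):
    chain[f"x{i}"] = [f"x{i - 1}"]

print(joint_parameter_count(10))       # size of the full joint table
print(network_parameter_count(chain))  # size of the chain-structured network
```

For this hypothetical chain the full joint needs 1023 numbers while the network needs 19; densely connected networks save less, since the count grows exponentially in the number of parents of each node.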
Formally, Bayesian networks are directed acyclic graphs (DAGs) with the nodes representing random variables and the directed arcs representing the direct dependency relations between the linked variables. If an arc goes from one node to another, we say that the former is a parent node of the latter, and the latter is a child of the former. The distribution on the network is specified by giving every node $x$ its probability $P(x \mid p(x))$ conditioned on the set of its parent nodes $p(x)$. The nodes without parents are assigned the prior probabilities $P(x)$. That is all that is necessary for specifying a complete probabilistic model [10]. The reasoning on Bayesian networks corresponds to evaluating the posterior probability $P(x \mid E)$ on all nodes $x$ given the evidence $E$, which is specified by providing certain values to a certain subset of nodes in the network (for instance, $E = \{y = 1, z = 0\}$ for some nodes $y$ and $z$). The evaluation of the network is done in general by stochastic simulation [10]. The updating of the user model is directly performed by evaluating the network once the knowledge of the domain has been correctly represented by the Bayesian network. In the next section, we discuss knowledge representation with Bayesian networks.</Paragraph> </Section> <Section position="5" start_page="7272" end_page="7272" type="metho"> <SectionTitle> 3 Knowledge Representation </SectionTitle> <Paragraph position="0"> with Bayesian Networks</Paragraph> <Section position="1" start_page="7272" end_page="7272" type="sub_section"> <SectionTitle> 3.1 Designing the Language </SectionTitle> <Paragraph position="0"> We have said the nodes in the Bayesian network are random variables that range over some values. 
In order to represent knowledge in terms of the Bayesian network, we must design the language for the sentences assigned to the nodes of the network. We first assume that the variables have two possible values, so that the sentences have truth values, that is, 1 (true) or 0 (false). Note that this assumption is not crucial; we may assign values such as KNOWN, NOT-KNOW, NO-INFORMATION as in UMFE [11].</Paragraph> <Paragraph position="1"> The type of sentences may depend on the application we pursue. For general explanation, it is important to make a clear distinction between two states of the user: knowing the name of a concept and knowing the other attributes of the concept. For example, suppose the user asked the following: &quot;Where is FRISCO?&quot; where FRISCO is the name of a record store. From this question, the system infers that the user knows the name of the store, but does not know its location.</Paragraph> <Paragraph position="2"> Now we will give a precise definition of our language.</Paragraph> <Paragraph position="3"> All the sentences in the language have the form (label): (content), where (label) is one of PRE, POST, JUDGE, TOLD, and TELL, and (content) is represented by a term of first-order predicate logic. An object and an expertise field are each represented by an atomic symbol, and an attribute of an object is represented by a function symbol. For example, store001 (object), records_collector (expertise field), location(store001) (attribute), and so forth.</Paragraph> <Paragraph position="4"> The user's knowledge about an attribute is represented by five sentences, all having the same (content) representing the attribute, and each having one of the five labels. The sentences labeled PRE express that the user knows the attribute prior to the dialogue session, while those labeled POST express that the user has come to know it during the session. 
For instance, PRE: location(store001) means that the user already knows the location of store001 before the interaction starts, while POST: location(store001) means the user has come to know the location through the system's explanation. The sentences labeled JUDGE express the user's current knowledge and are used by other components in the dialogue system to exploit the user model. For instance, JUDGE: location(store001) means the user now knows the location of store001. The sentences labeled TOLD and TELL express the evidence gained from the user's utterances and from the system's explanations, respectively. For instance, TOLD: name(store001) means the user has indicated through some clue that she knows the name of store001, while TELL: name(store001) means the system has explained the name. As an exception, in the case of location, the form TELL: location(X) (where X is some object ID) is not used because a location is explained in terms of its position relative to another object. Instead, the form TELL: relation(X, Y) (where X and Y are some object IDs) is used.</Paragraph> <Paragraph position="5"> The sentences representing objects and expertise fields have only the label PRE. The sentence representing an object (e.g. PRE: store001) means that the user knows the object, that is, she knows most of the attributes of the object. The sentence representing an expertise field (e.g. PRE: records_collector) means that the user is an expert in the field, that is, she knows the objects related to the expertise field.</Paragraph> </Section> <Section position="2" start_page="7272" end_page="7272" type="sub_section"> <SectionTitle> 3.2 Constructing the Networks </SectionTitle> <Paragraph position="0"> As mentioned, arcs of the Bayesian network represent direct probabilistic influence between linked variables.</Paragraph> <Paragraph position="1"> The directionality of the arcs is essential for representing nontransitive dependencies. 
In order to represent the knowledge in terms of a Bayesian network, we must interpret the qualitative relations between the sentences of our language as directed arcs or combinations of such arcs.</Paragraph> <Paragraph position="2"> In our case, the network has two subnetworks. One represents the user's knowledge before the dialogue session and is used to infer the user model from her utterances. The sentences assigned to the nodes in this part have either the label PRE or TOLD. We call this subnetwork the prior part. The other subnetwork, in which the nodes have either the label POST or TELL, is used to deal with the influence of the system's utterances. We call this subnetwork the posterior part. It is important to make a clear distinction between the two. When the system explains a concept, it is not proper to assume that the user knows other related concepts. For example, if the user indicates that she knows some location x, then it can be inferred that she also knows locations that are close to x. But that is not true if the location x is explained by the system.</Paragraph> <Paragraph position="3"> The relations in the prior part of the network are categorized into four types as follows: (1) the relations between objects in an expertise field; (2) the relations between attributes of objects; (3) the relations between an object and its attributes; (4) the relations between an attribute of an object and the evidence that the user knows it. The relations (1) are concerned with the expertise field. The objects in the same expertise field are related through the expertise field node. We introduce arcs that go from the expertise field node to the object nodes belonging to that field. For example, arcs go from the node of &quot;records collector&quot; to those of &quot;Compact Disk&quot;, &quot;Tower Records&quot; (name of a record store) and so on. 
The level of expertise can be controlled by the conditional probabilities assigned to the object nodes conditioned on the expertise field node. In this framework, we can introduce an arbitrary number of expertise fields, each of which can be assigned its own level of expertise.</Paragraph> <Paragraph position="4"> The relations (2) are concerned with the domain knowledge. In our domain, those are the relations between the locations, which are based on the assumption that the user probably knows the locations close to a location she knows. The relations are assumed to be symmetric. A single directed arc of a Bayesian network does not represent a symmetric relation. In order to represent a symmetric relation, we introduce a dummy evidence node, to which two arcs go from the two location nodes as shown in figure 1. The prior conditional probabilities of the dummy node have a high value if the two parent nodes have the same value.</Paragraph> <Paragraph position="5"> The relations (3) are concerned with general knowledge, such as that knowing an object well implies knowing its attributes. In order to represent this kind of relation, we introduce arcs that go from the node of an object to the nodes of its attributes.</Paragraph> <Paragraph position="6"> The arc corresponding to the relation (4) is introduced to go from the node of an attribute of an object to an evidence node. The attribute node and the evidence node have the same content, while they have different labels, PRE and TOLD.</Paragraph> <Paragraph position="7"> In the posterior part of the network, there are only arcs representing the relations (4). The attribute nodes and the evidence nodes are labeled POST and TELL. In addition, the TELL node may have more than one parent node because the explanations of the attribute are made by referring to the other attributes. 
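Returning to the dummy evidence node used for the relations (2), a small numerical check may help show why it yields a symmetric dependency. The sketch below evaluates a three-node network (two location nodes `a` and `b`, and their dummy child `d`) by brute-force enumeration rather than by the stochastic simulation the paper relies on. All probabilities are hypothetical, and treating the dummy node as always observed with value 1 is an assumption of this sketch.

```python
from itertools import product

# All numbers below are hypothetical, chosen only for illustration.
PRIOR_A = 0.5   # P(PRE: location(a) = 1)
PRIOR_B = 0.5   # P(PRE: location(b) = 1)
AGREE = 0.9     # P(dummy = 1 | both parents have the same value)
DISAGREE = 0.1  # P(dummy = 1 | the parents differ)


def joint(a, b, d):
    """Joint probability of one assignment (a, b, d) of the three nodes."""
    p_a = PRIOR_A if a else 1 - PRIOR_A
    p_b = PRIOR_B if b else 1 - PRIOR_B
    p_d1 = AGREE if a == b else DISAGREE
    p_d = p_d1 if d else 1 - p_d1
    return p_a * p_b * p_d


def posterior(target, evidence):
    """P(target = 1 | evidence), summing the joint over all assignments."""
    num = den = 0.0
    for a, b, d in product((0, 1), repeat=3):
        world = {"a": a, "b": b, "d": d}
        if any(world[k] != v for k, v in evidence.items()):
            continue  # assignment contradicts the evidence
        p = joint(a, b, d)
        den += p
        if world[target] == 1:
            num += p
    return num / den


# The dummy node is clamped to 1 in every query.
print(posterior("b", {"d": 1}))          # belief in b before anything is known about a
print(posterior("b", {"d": 1, "a": 1}))  # belief in b once a is known
print(posterior("a", {"d": 1, "b": 1}))  # the same influence in the other direction
```

With these hypothetical numbers, the belief in `b` rises from 0.5 to 0.9 once `a` is known, and the belief in `a` rises by the same amount once `b` is known, which is the symmetric behaviour a single directed arc between the two location nodes would not express as directly.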
Actually, in our town guidance domain, the system explains a new location using the locations that the user already knows. For instance, the nodes POST: location(store001) and POST: location(store002) are parents of the node TELL: relation(store001, store002) when the system explains the location of store001 by using the location of store002. The more relations the system shows, the deeper the user's understanding becomes.</Paragraph> <Paragraph position="8"> Ambiguous evidence can be dealt with straightforwardly in the Bayesian approach. An evidence node can have more than one parent node to represent the ambiguity. For example, when dealing with spoken inputs, it might be ambiguous whether the user said &quot;tower records&quot; or &quot;power records.&quot; If both record stores exist, an evidence node labeled TOLD is introduced as a child node of both nodes, PRE: name(tower) and PRE: name(power) (figure 2).</Paragraph> <Paragraph position="9"> Finally, we introduce the arcs that connect the two subnetworks. For each attribute, there are three kinds of nodes labeled PRE, POST, and JUDGE. Two arcs are drawn: one from the PRE node to the JUDGE node and one from the POST node to the JUDGE node. That means the user knows the attribute either because she already knew it before the current dialogue session or because it has been explained by the system during the session.</Paragraph> <Paragraph position="10"> The example of the resulting network is shown in</Paragraph> </Section> </Section> <Section position="6" start_page="7272" end_page="7272" type="metho"> <SectionTitle> 4 Examples </SectionTitle> <Paragraph position="0"> Suppose the user asks the system to show the way to a record store named FRISCO in a town (figure 4).</Paragraph> <Paragraph position="1"> The system uses the network in figure 3. 
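As a rough illustration of how the arcs of such a network could be assembled from the rules of Section 3.2, the sketch below generates (parent, child) pairs for a toy domain. The function names, the `DUMMY: close(...)` node label, and the toy data are hypothetical; the paper does not describe an implementation, and conditional probabilities are omitted here.

```python
def build_arcs(fields, attributes, close_pairs):
    """Return (parent, child) arcs for a toy domain.

    fields:      expertise field -> objects belonging to it
    attributes:  object -> attribute names (e.g. "name", "location")
    close_pairs: pairs of objects whose locations are close to each other
    """
    arcs = []
    # (1) expertise field node -> object nodes of that field
    for field, objects in fields.items():
        for obj in objects:
            arcs.append((f"PRE: {field}", f"PRE: {obj}"))
    for obj, attrs in attributes.items():
        for attr in attrs:
            content = f"{attr}({obj})"
            # (3) object node -> attribute node
            arcs.append((f"PRE: {obj}", f"PRE: {content}"))
            # (4) attribute node -> evidence node, prior part
            arcs.append((f"PRE: {content}", f"TOLD: {content}"))
            # (4) posterior part; locations use TELL: relation(X, Y) instead
            if attr != "location":
                arcs.append((f"POST: {content}", f"TELL: {content}"))
            # arcs connecting the two subnetworks
            arcs.append((f"PRE: {content}", f"JUDGE: {content}"))
            arcs.append((f"POST: {content}", f"JUDGE: {content}"))
    # (2) symmetric closeness through a dummy evidence node with two parents
    for a, b in close_pairs:
        dummy = f"DUMMY: close({a}, {b})"  # hypothetical node label
        arcs.append((f"PRE: location({a})", dummy))
        arcs.append((f"PRE: location({b})", dummy))
    return arcs


def tell_relation_arcs(x, y):
    """Arcs added when the system explains location(x) relative to location(y)."""
    child = f"TELL: relation({x}, {y})"
    return [(f"POST: location({x})", child), (f"POST: location({y})", child)]


# Toy domain (hypothetical): two record stores known to records collectors.
arcs = build_arcs(
    fields={"records_collector": ["store001", "store002"]},
    attributes={"store001": ["name", "location"], "store002": ["name", "location"]},
    close_pairs=[("store001", "store002")],
)
explained = tell_relation_arcs("store001", "store002")
for parent, child in arcs + explained:
    print(parent, "->", child)
```

Keeping the TELL: relation arcs in a separate function reflects the assumption that they are added only when the system actually gives such an explanation during the dialogue.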
The dialogue starts with the user's request.</Paragraph> <Paragraph position="2"> (1) user: Where is FRISCO? In practice, an input analysis component is needed to obtain evidence for the network from the user's utterances, but this process is beyond the scope of this paper. By analyzing the input, the system obtains the information that the user knows the name of a certain store but does not know its location. The input, i.e. the evidence, to the network is $E = \{\text{TOLD: name(frisco)} = 1, \text{TOLD: location(frisco)} = 0\}$. Evaluating the degree of belief of each concept $x$ by using the posterior probability $P(x \mid \text{TOLD: name(frisco)} = 1, \text{TOLD: location(frisco)} = 0)$ gives the resulting user model. Though this result can be directly obtained by evaluating the network, we will briefly trace the reasoning for explanatory purposes. (Note that the actual process is not easy to explain, as all nodes of the network influence each other; that is the reason why simulation is needed for evaluation.) The user knows the store FRISCO, which indicates that she has a high level of expertise as a records collector; this raises the probability of the node PRE: records_collector and also raises that of the nodes of other record stores, Tower Records (PRE: tower) and Wave Records (PRE: wave). These nodes then affect the nodes of their attributes, PRE: location(tower), PRE: name(tower), PRE: location(wave), and so on. That raises the probability of the location node of HANDS Department (PRE: location(hands)), which is close to a location the user (probably) knows, i.e. PRE: location(wave).</Paragraph> <Paragraph position="3"> Next, the system generates the answer by using the resulting user model. This task is done by a planner for utterance generation. The system may decide to use the 
location of HANDS.</Paragraph> <Paragraph position="4"> (2) system: It is 300m to the south from (3) user: I don't know where HANDS is.</Paragraph> <Paragraph position="5"> This input gives the system the evidence TOLD: location(hands) = 0. After obtaining this evidence, the belief is revised. The probability of the node PRE: location(hands) falls, which in turn causes the probability of the node PRE: location(wave) to fall. Next, the planner may try to explain the location of HANDS by using the location of Tower Records, which gives the evidence TELL: relation(hands, tower) = 1.</Paragraph> <Paragraph position="6"> (4) system: HANDS is two blocks away to the west from Tower Records.</Paragraph> <Paragraph position="7"> This explanation can influence not only the user's understanding of the location of HANDS but also that of the location of FRISCO, because the evidence raises the posterior probability of the node POST: location(frisco) through the node POST: location(hands). Evaluation results of the above dialogue are shown in Table 1.</Paragraph> <Paragraph position="8"> map of a town</Paragraph> </Section> </Paper>