<?xml version="1.0" standalone="yes"?>
<Paper uid="C96-1052">
  <Title>A Computational Model of Incremental Utterance Production in Task-Oriented Dialogues</Title>
  <Section position="5" start_page="304" end_page="305" type="metho">
    <SectionTitle>
3 Discourse Structure Analysis
</SectionTitle>
    <Paragraph position="0"> We analyzed the discourse structure in a corpus of task-oriented dialogues, which were collected by the following method. The subjects were ninety native Japanese speakers. In each dialogue, two subjects, N and E, were asked to converse by telephone to find a solution to the problem of how N could get from one place to another. Subjects were chosen such that E had enough knowledge to solve the problem but N did not. Eighty dialogues were recorded and transcribed. Fifteen dialogues were randomly chosen for analysis. The discourse structure was analyzed in terms of information units and discourse relations.</Paragraph>
    <Paragraph position="1"> 3.1 Analyzing information units Speakers organize the information to be conveyed into information units (Halliday 1994), which are the units for transmission of information. The information units (IUs for short) are regarded as minimal components of discourse structure. We assume that IUs are realized by grammatical devices: a clause realizes an IU, an interjectory word realizes an IU, and a filler term shows the end of an IU. Figure 1 shows part of the transcription of a dialogue where a dialogue participant proposes a domain plan. The symbol &quot;/&quot; separates the IUs.  Table 1 shows the frequency distribution for the grammatical categories of IU, where NP and PP mean noun phrase and postpositional phrase. The average number of NPs in an IU realized as a clause, NP, PP, or sequence of NPs and PPs is 1.01 in the fifteen dialogues. The variance is 0.28.</Paragraph>
    <Paragraph position="2"> This result indicates that small IUs are frequently used. For example, although IU (1) in Figure 1 describes only a part of a domain action, it is regarded as an IU since it has a copula (&quot;desu&quot;) and a sentence-final particle (&quot;ne&quot;).</Paragraph>
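    <Paragraph> As a minimal sketch of the segmentation convention described above, the following Python splits a transcribed turn into IUs at the slash separator. The function name split_ius is hypothetical, and the sample turn is utterance (u1) from Section 7.3; this only illustrates the notation and is not the procedure used to annotate the corpus.

```python
def split_ius(turn: str) -> list[str]:
    """Split a transcribed turn into information units (IUs) at each slash."""
    # Drop empty pieces such as the one after the final separator.
    return [piece.strip() for piece in turn.split("/") if piece.strip()]


# Utterance (u1) from the paper, with "/" marking IU boundaries.
turn = "musashino sentaa kara-wa desune/ basu de/ moyori-no-eki made/ ikimasu/"
ius = split_ius(turn)
print(ius)
```

    Counting the NPs in each resulting unit would be the next step toward statistics like those in Table 1, but that requires a parser and is outside this sketch.</Paragraph>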
    <Section position="1" start_page="304" end_page="305" type="sub_section">
      <SectionTitle>
3.2 Analyzing discourse relations
</SectionTitle>
      <Paragraph position="0"> Discourse relations between adjacent discourse segments were examined. A discourse segment is an IU or a sequence of IUs. For discourse relations, we adopted those used in Rhetorical Structure Theory (Mann and Thompson 1988) and followed Hovy (1993) in classifying them into semantic and interpersonal ones. Figure 2 shows discourse relations that appear in the discourse displayed in Figure 1. The small IUs are hierarchically related. This results in the fine structure of discourse.</Paragraph>
      <Paragraph position="1"> Table 2 shows the frequency distributions for discourse relations in the fifteen dialogues. Let us consider the role that the predominant relations, Elaboration, Circumstance, and Motivation, play in the incremental strategy of utterance production. First, Elaboration is exploited to describe domain actions, states or objects in a piecemeal fashion.</Paragraph>
      <Paragraph position="2"> Elaboration enables speakers to distribute the content to be conveyed among different IUs. This relation is useful for the incremental strategy since it allows speakers to begin uttering even when the content has not been fully determined.</Paragraph>
      <Paragraph position="3"> Second, Circumstance is the relation between two segments, a nucleus and a satellite. The nucleus describes a domain action or state. The satellite describes the circumstances where the nucleus is interpreted, such as the preconditions of a domain action. There are 41 cases where the satellite describes a precondition of a domain action, which amounts to 68% of all cases. The constituents of a domain action are often referred to in its preconditions. We see a typical case in the relation between (4) and (5) in Figure 1. (5) describes the action of getting on a bus and (4) describes the existential status of the bus as the precondition of the action. By utilizing this relation, speakers can distribute the content of a domain action between two IUs. They can pick up a constituent of an action and describe it before describing the whole content of the action. Thus Circumstance is useful for the incremental strategy.</Paragraph>
      <Paragraph position="4"> Finally, Motivation is mainly used for describing a domain action as a nucleus and motivating addressees to adopt the action by presenting a fact as a satellite. In typical cases, speakers motivate addressees to adopt an action by asserting that its precondition is satisfied. In such cases, Motivation occurs together with Circumstance and contributes to the incremental strategy in the same way as Circumstance.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="305" end_page="305" type="metho">
    <SectionTitle>
4 The Model
</SectionTitle>
    <Paragraph position="0"> As shown in Figure 3, this model is composed of five modules: a problem solver, an utterance planner, an utterance controller, a text-to-speech converter, and a pause monitor. The problem solver makes domain plans that solve a given problem.</Paragraph>
    <Paragraph position="1"> The utterance planner makes utterance plans to propose domain plans. Pragmatic constraints and a context model are used to generate relevant discourses. According to utterance plans, the utterance controller sends linguistic expressions to the text-to-speech converter. The pause monitor watches the length of pauses and signals the utterance planner and controller when the pause length exceeds a given length.</Paragraph>
    <Paragraph position="2"> These modules work in parallel. Both domain plans and utterance plans are made in a stepwise manner using the hierarchical planning mechanism (Russell and Norvig 1995: Chap. 12). This model starts to make an utterance plan before a fully determined domain plan has been obtained.</Paragraph>
    <Paragraph position="3"> When a pause exceeds the time limit, the utterance planner sends the utterance controller an utterance plan [...] [Figure 3. Input: a domain problem. Parallel modules.] The domain plan is refined during the planning and articulation of utterances. Based on a refined domain plan, the utterance plan is replanned. When the utterance controller is not given utterance plans within the time limit, it produces a filler term.</Paragraph>
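    <Paragraph> The time-limit behaviour of the utterance controller can be illustrated with a small sketch. It assumes that the planner hands expressions to the controller through a queue; the queue, the function name next_expression, the filler string, and the 0.05 second limit are all hypothetical choices made for illustration, not details given in the paper.

```python
import queue

FILLER = "eeto"  # hypothetical filler term; the paper does not name one here


def next_expression(plans: queue.Queue, time_limit: float) -> str:
    """Return the next linguistic expression, or a filler if none arrives in time."""
    try:
        # Block until the planner delivers an expression or the limit expires.
        return plans.get(timeout=time_limit)
    except queue.Empty:
        return FILLER


plans = queue.Queue()
plans.put("basu de")
first = next_expression(plans, time_limit=0.05)   # an expression is waiting
second = next_expression(plans, time_limit=0.05)  # nothing arrives, so a filler
print(first, second)
```

    A blocking read with a timeout keeps the controller simple: it never polls, and the filler is produced exactly when the planner has failed to deliver within the limit.</Paragraph>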
  </Section>
  <Section position="7" start_page="305" end_page="305" type="metho">
    <SectionTitle>
5 Pragmatic Constraints
</SectionTitle>
    <Paragraph position="0"> Pragmatic constraints are required to guarantee the relevance of discourses. This model exploits the following pragmatic constraints.</Paragraph>
    <Paragraph position="1">  (c1) Avoid conveying redundant information.
      (c2) Pronominalize objects in the focus of attention (Grosz and Sidner 1986).</Paragraph>
    <Paragraph position="2"> (c3) Be relevant according to the attentional state.
      The context model records the information that has been conveyed and tracks the attentional state. For example, consider the domain action of moving from one location l1 to another l2. To describe such a domain action with verbs such as &quot;iku (go)&quot;, l1 must be in focus. Otherwise, the description is irrelevant. After such an action has been described, l2 is in focus. Moreover, any object marked as a topic becomes a focused one.</Paragraph>
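    <Paragraph> The bookkeeping that the context model performs can be sketched as follows. This is only an illustration under stated assumptions: the class name ContextModel, its method names, and the tuple encoding of conveyed information are hypothetical, and the focus rule for movement verbs is read from the example above (the source location must be in focus before the move is described, and the destination is in focus afterwards).

```python
class ContextModel:
    """Tracks conveyed information and the focus of attention (illustrative)."""

    def __init__(self):
        self.conveyed = set()  # literals already conveyed, used for (c1)
        self.focus = set()     # objects currently in focus, used for (c2), (c3)

    def is_redundant(self, literal):
        # (c1): information that was already conveyed should not be restated.
        return literal in self.conveyed

    def can_describe_move(self, source):
        # (c3): a verb such as "iku" is relevant only if the source is in focus.
        return source in self.focus

    def mark_topic(self, obj):
        # Any object marked as a topic becomes a focused one.
        self.focus.add(obj)

    def record_move(self, action, source, dest):
        # After the move has been described, its destination is in focus.
        self.conveyed.add(("move", action, source, dest))
        self.focus.add(dest)


ctx = ContextModel()
before = ctx.can_describe_move("l1")  # l1 is not yet in focus
ctx.mark_topic("l1")                  # topicalizing l1 brings it into focus
after = ctx.can_describe_move("l1")
ctx.record_move("a1", "l1", "l2")
print(before, after, sorted(ctx.focus))
```
    </Paragraph>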
  </Section>
  <Section position="8" start_page="305" end_page="306" type="metho">
    <SectionTitle>
6 Problem Solving
</SectionTitle>
    <Paragraph position="0"> We outline the problem solver using a sample problem of how to move from the Musashino Center to the Atsugi Center on the map in Figure 4.</Paragraph>
    <Paragraph position="1"> The problem solver first makes an abstract domain plan, which is a sequence of three actions a1, a2, and a3: moving from the Musashino Center to the nearest station by bus, moving to the station nearest the Atsugi Center, and then moving to the Atsugi Center by bus. This plan is written as (r1). The contents of these actions are written as (r2). Expression cont(X, Y) means that the content of X is represented as a set Y of literals.</Paragraph>
    <Paragraph position="3"> The problem solver tries to make a more concrete plan. When more than one domain plan is possible, it chooses the domain plan that requires the shortest execution time. In this domain, the domain plan is a sequence of actions a4, a5, a6 and a7: moving from the Musashino Center to Kichijoji station by bus, moving to Shimokitazawa station by the Inokashira Line, moving to Aiko-ishida station by the Odakyu Line, and then moving to the Atsugi Center by bus. Part of the content of this plan is represented as follows.</Paragraph>
    <Paragraph position="5"/>
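    <Paragraph> Separately from the paper's own plan representations, the selection rule stated above (choose the domain plan with the shortest execution time) can be sketched in a few lines. The function name choose_plan, the durations in minutes, and the alternative plan with actions b1 to b3 are invented for illustration; only the action names a4 to a7 come from the text.

```python
def choose_plan(candidates):
    """Pick the candidate domain plan with the shortest total execution time."""
    # Each candidate is a list of (action, minutes) pairs; minutes are made up.
    return min(candidates, key=lambda plan: sum(minutes for _, minutes in plan))


# Hypothetical candidates: the durations and plan_b are invented.
plan_a = [("a4", 15), ("a5", 12), ("a6", 45), ("a7", 20)]
plan_b = [("b1", 30), ("b2", 50), ("b3", 25)]
best = choose_plan([plan_a, plan_b])
print([action for action, _ in best])
```
    </Paragraph>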
  </Section>
  <Section position="9" start_page="306" end_page="307" type="metho">
    <SectionTitle>
7 Utterance Planning
</SectionTitle>
    <Paragraph position="0"> An utterance plan is a sequence of communicative actions that achieves a communicative goal. It is refined in a stepwise manner. A sequence of surface communicative actions corresponding to the uttering of linguistic expressions is finally planned.</Paragraph>
    <Section position="1" start_page="306" end_page="306" type="sub_section">
      <SectionTitle>
7.1 Communicative goals
</SectionTitle>
      <Paragraph position="0"> Generation systems engaging in dialogues must record communicative goals related to communicative actions (Moore and Paris 1994). Communicative goals used here are: * persuaded-plan(P): dialogue partner is persuaded to adopt domain plan P.</Paragraph>
      <Paragraph position="1"> * persuaded-act(A): dialogue partner is persuaded to adopt domain action A.</Paragraph>
      <Paragraph position="2"> * described-event(E, C, At): domain event E is described as an event having content C and attitude At toward E is also described.</Paragraph>
      <Paragraph position="3"> * described-obj(O, C): domain object O is described as an object having content C.</Paragraph>
      <Paragraph position="4"> * described-thema-rel(R, O, E): thematic relation R is described, which domain object O bears to domain event E.</Paragraph>
      <Paragraph position="5"> When the domain plan (r1) is obtained, (r5) is given as the initial communicative goal.</Paragraph>
      <Paragraph position="6"> (r5) persuaded-plan([a1, a2, a3])</Paragraph>
    </Section>
    <Section position="2" start_page="306" end_page="306" type="sub_section">
      <SectionTitle>
7.2 Surface communicative actions
</SectionTitle>
      <Paragraph position="0"> Surface communicative actions used here are: * surface-desc-event(E, C, At): utter expressions to describe domain event E as an event having content C and describe attitude At toward E.</Paragraph>
      <Paragraph position="1"> * surface-desc-obj(O, C, R): utter expressions to describe domain object O as an object having content C and bearing thematic relation R to a certain event.</Paragraph>
    </Section>
    <Section position="3" start_page="306" end_page="307" type="sub_section">
      <SectionTitle>
7.3 Planning utterances based on the fine structure of discourse
</SectionTitle>
      <Paragraph position="0"> An utterance plan is elaborated using action schemata and decomposition methods. An action [...] consists of an action description, applicability constraints and a plan. It specifies how an action is decomposed to a detailed plan.</Paragraph>
      <Paragraph position="1"> The following schema (r6) defines the communicative action of proposing a domain plan by using Sequence. The decomposition method (r7) specifies how the action is decomposed to a sequence of finer actions.  In these representations, achieve(P) designates an action that achieves goal P. Notation [H | L] specifies a list, where H is the head of the list and L is the rest. Symbols starting with &quot;*&quot; represent variables. By applying (r6) and (r7) to the initial communicative goal (r5), the following utterance plan is obtained:</Paragraph>
      <Paragraph position="2"> We have omitted the other method to avoid infinite recursive application of the method (r7).</Paragraph>
      <Paragraph position="3">
        (r9) Act(propose-act(*A), Effect: persuaded-act(*A))
        (r10) Decomp(propose-act(*A), Constr: cont(*A, *C), Plan: achieve(described-event(*A, *C, proposal)))
        (r11) Act(describe-event-by-elaboration(*E, *C, *At), Effect: described-event(*E, *C, *At))
        (r12) Decomp(describe-event-by-elaboration(*E, *C, *At), Constr: *Thema ∈ *C ∧ *Thema =.. [*R, *E, *O] ∧ *R ≠ type ∧ cont(*O, *ObjC) ∧ *Rest = *C - {*Thema}, Plan: [achieve(described-obj(*O, *ObjC) ∧ described-thema-rel(*R, *O, *E)), achieve(described-event(*E, *Rest, *At))])
        (r13) Act(describe-obj-with-thema(*O, *C, *R, *E), Effect: described-obj(*O, *C) ∧ described-thema-rel(*R, *O, *E))
        (r14) Decomp(describe-obj-with-thema(*O, *C, *R, *E), Plan: surface-desc-obj(*O, *C, *R))
        (r15) Act(describe-event-type(*E, *C, *At), Constr: *C = {type(*E, *T)}, Effect: described-event(*E, *C, *At))
        (r16) Decomp(describe-event-type(*E, *C, *At), Plan: surface-desc-event(*E, *C, *At))
        Schemata and methods (r9) to (r16) define actions for proposing a domain action while elaborating the content of the action in a stepwise manner. They reflect the results of the discourse structure analysis, which show that speakers tend to distribute the constituents of a domain action into different IUs by using ELABORATION. In (r12), notation F(X, Y, ...) =.. [F, X, Y, ...] is used for decomposing term F(X, Y, ...) into relation F and arguments X, Y, ....</Paragraph>
      <Paragraph position="4"> When domain objects are linguistically realized by the surface-desc-obj in (r14), pragmatic constraint (c2) is exploited to pronominalize focused objects. In addition, according to constraint (c3), the objects that are not in focus need to be topicalized if they must be in focus.</Paragraph>
      <Paragraph position="5"> By applying these schemata to the first action in (r8), the following utterance plan is obtained.</Paragraph>
      <Paragraph position="6"> Thematic relations are chosen in default order when (r12) is applied.</Paragraph>
      <Paragraph position="7"> (r17) surface-desc-obj(x1, {type(x1, building), named(x1, &quot;musashino sentaa&quot;)}, source),
        surface-desc-obj(x2, {type(x2, bus)}, manner),
        surface-desc-obj(x3, {type(x3, station), nearest(x3, x1)}, dest),
        surface-desc-event(a1, {type(a1, move)}, proposal).</Paragraph>
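      <Paragraph> The stepwise elaboration performed by (r12) can be mimicked by a short loop that peels one thematic literal at a time off the event content and emits a surface action for its object, leaving the type literal for the final event description. The sketch below is an approximation under stated assumptions: the tuple encoding of literals, the function name plan_by_elaboration, the content assumed for a1, and the default order source, manner, dest (read off the order of the object descriptions in the resulting plan) are not taken from the paper's implementation.

```python
DEFAULT_ORDER = ["source", "manner", "dest"]  # assumed default order of relations


def plan_by_elaboration(event, content, obj_content, attitude):
    """Decompose an event description into surface actions, one relation at a time."""
    plan = []
    rest = set(content)
    for relation in DEFAULT_ORDER:
        # Pick a thematic literal (relation, event, object) still in the content.
        thema = next(
            (lit for lit in rest if lit[0] == relation and lit[1] == event),
            None,
        )
        if thema is None:
            continue
        obj = thema[2]
        plan.append(("surface-desc-obj", obj, obj_content[obj], relation))
        rest = rest - {thema}  # corresponds to *Rest = *C - {*Thema}
    # Whatever remains (here, the type literal) is described by the event action.
    plan.append(("surface-desc-event", event, frozenset(rest), attitude))
    return plan


# Assumed content of a1, written as (relation, event, value) tuples.
content_a1 = {
    ("type", "a1", "move"),
    ("source", "a1", "x1"),
    ("manner", "a1", "x2"),
    ("dest", "a1", "x3"),
}
obj_content = {
    "x1": (("type", "x1", "building"), ("named", "x1", "musashino sentaa")),
    "x2": (("type", "x2", "bus"),),
    "x3": (("type", "x3", "station"), ("nearest", "x3", "x1")),
}
plan = plan_by_elaboration("a1", content_a1, obj_content, "proposal")
for step in plan:
    print(step[0], step[1])
```

      Because each object description is emitted as soon as its thematic literal is chosen, a controller could start uttering the first step while later steps are still being planned, which is the point of the incremental strategy.</Paragraph>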
      <Paragraph position="8"> According to utterance plan (r17), this model can start the following utterances to satisfy the time constraints before obtaining a concrete domain plan such as (r3).</Paragraph>
      <Paragraph position="9"> (u1) musashino sentaa kara-wa desune /</Paragraph>
    </Section>
  </Section>
  <Section position="10" start_page="307" end_page="307" type="metho">
    <SectionTitle>
PN from-Topic COPULA
</SectionTitle>
    <Paragraph position="0"> (from the Musashino Center)
      basu de / moyori-no-eki made / ikimasu /
      bus by / nearest station to / go
      (by bus) (to the nearest station) (go)
      For brevity, we have omitted action schemata and decomposition methods for utterance planning using MOTIVATION and CIRCUMSTANCE.</Paragraph>
    <Section position="1" start_page="307" end_page="307" type="sub_section">
      <SectionTitle>
7.4 Replanning utterance plans
</SectionTitle>
      <Paragraph position="0"> While planning and articulating utterances using an abstract domain plan, a more concrete domain plan is being made. When a more concrete domain plan is obtained, an utterance plan is replanned. For example, consider the case where a concrete domain plan, (r3), is obtained during the production of utterance (u1). The following utterance plan is replanned:
        (r18) surface-desc-obj(x1, {type(x1, building), named(x1, &quot;musashino sentaa&quot;)}, source),
        surface-desc-obj(x2, {type(x2, bus)}, manner),
        surface-desc-obj(x7, {type(x7, station), named(x7, &quot;kichijoji&quot;)}, dest),
        surface-desc-event(a4, {type(a4, move)}, proposal).</Paragraph>
      <Paragraph position="1"> We assume that plan (r18) is obtained when this model finishes uttering &quot;moyori-no-eki made&quot; in utterance (u1). Then (u1) is interrupted and utterances follow based on (r18). Consequently, the following utterances are produced: (u2) musashino sentaa kara-wa desune /</Paragraph>
    </Section>
  </Section>
  <Section position="11" start_page="307" end_page="307" type="metho">
    <SectionTitle>
PN from-Topic COPULA
</SectionTitle>
    <Paragraph position="0"> (from the Musashino Center)
      basu de / moyori-no-eki made /
      bus by / nearest station to
      (by bus) (to the nearest station)
      kichijoji made desune / ikimasu /
      PN to COPULA / go
      (to Kichijoji station) (go)
      In the above, the redundant information is not restated, in accordance with pragmatic constraint (c1). Self-repair occurs: &quot;moyori-no-eki made&quot; is replaced by &quot;kichijoji made&quot;.</Paragraph>
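    <Paragraph> The way replanning yields the self-repair in (u2) can be sketched as a filter over the new plan: content that the interrupted utterance has already conveyed is skipped under (c1), and everything else is uttered. The function name continue_after_replan and the pairing of each surface action with a simplified content key and a ready-made expression are assumptions made for illustration; the paper's model derives the expressions from (r17) and (r18) rather than from such a table.

```python
def continue_after_replan(uttered, new_plan):
    """Return the expressions still to be uttered after replanning."""
    conveyed = {content for content, _ in uttered}
    remaining = []
    for content, expression in new_plan:
        # (c1): skip content that the interrupted utterance already conveyed.
        if content in conveyed:
            continue
        remaining.append(expression)
    return remaining


# Surface actions paired with their realizations; the content keys are a
# simplification of the plans (r17) and (r18).
uttered = [
    (("source", "musashino sentaa"), "musashino sentaa kara-wa desune"),
    (("manner", "bus"), "basu de"),
    (("dest", "nearest station"), "moyori-no-eki made"),
]
new_plan = [
    (("source", "musashino sentaa"), "musashino sentaa kara-wa desune"),
    (("manner", "bus"), "basu de"),
    (("dest", "kichijoji"), "kichijoji made desune"),
    (("event", "move"), "ikimasu"),
]
repair = continue_after_replan(uttered, new_plan)
utterance = [expression for _, expression in uttered] + repair
print(" / ".join(utterance))
```

    The destination is the only object whose content changed between the two plans, so it is the only one restated, which matches the replacement of the nearest-station phrase by the Kichijoji phrase.</Paragraph>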
  </Section>
</Paper>