<?xml version="1.0" standalone="yes"?>
<Paper uid="C88-2114">
  <Title>FRAMEWORK FOR A MODEL OF DIALOGUE</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
FRAMEWORK FOR A MODEL OF DIALOGUE
Ronan REILLY
</SectionTitle>
    <Paragraph position="0"/>
  </Section>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
1 INTRODUCTION
</SectionTitle>
    <Paragraph position="0"> In this paper we present a general model of communication applied to the special case of dialogue. Our broad perspective aims to account for the many facets of human dialogue within a single theoretical framework. In particular, our project's aim of incorporating relevant non-verbal communicative acts from the person-machine interface makes it essential that the description of communication be sufficiently broad.</Paragraph>
    <Paragraph position="1"> The model described here takes as its starting point the communicative utterance or act. It considers the higher-order structures into which communicative acts may be incorporated, but does not detail their internal composition. It is in this sense that the model provides a framework for the formal treatment of dialogue.</Paragraph>
  </Section>
  <Section position="3" start_page="0" end_page="540" type="metho">
    <SectionTitle>
2 COMPONENTS OF THE MODEL
</SectionTitle>
    <Paragraph position="0"> A full description of the adopted dialogue model has been given in Egan, Ferrari, Harper, et al. (1987).</Paragraph>
    <Paragraph position="1"> It relies on a double description of dialogue: a syntactic analysis of dialogue structure and a semantic-pragmatic description of the communication context. The basic units are:
 - meaningful expression (ME): any physical act carrying a non-contextual meaning;
 - communicative act (CAct): an instance of an ME issued by a specific &quot;issuer&quot; and received by a specific &quot;receiver&quot;;
 - communicative situation (CS): the CAct together with all the relevant facets;
 - communicative situation structure (CSS): a larger aggregation of CSs that provides a bridge into the intentional component of the dialogue model.</Paragraph>
    <Paragraph position="2"> Each of these components is discussed in more detail below.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 Communicative Acts and Dialogue Structure
</SectionTitle>
      <Paragraph position="0"> The syntactic component of the dialogue model relies on the fact that, if we examine a dialogue or any other communicative exchange, it is possible to observe in the sequence of communicative acts, sub-sequences which follow regular patterns. These patterns can be catalogued in a form which expresses their significant regularities. This approach leads to a descriptive method very similar to the formal description of language in terms of a vocabulary of terminal symbols (the communicative acts), a vocabulary of auxiliary symbols (a collection of labels), and a set of productions (discourse patterns). Within the definition of a communicative act, provision is made for gestural information accompanying an utterance, such as a deictic gesture involving a mouse or some other pointing device (in the context of person-machine interaction).</Paragraph>
      <Paragraph position="1"> The idea of treating discourse segments like phrases in a sentence is not new (cf. Burton, 1981).</Paragraph>
      <Paragraph position="2"> However, the nature of the entities involved is rarely fully clarified. In Christie, Egan, Ferrari, et al. (1985), a dialogue classification system was presented, based on the system of classification of Burton (1981). It consisted of a set of functional labels divided into the following five hierarchical levels, from lower to higher:
 - acts: {marker, summons, elicitation, reply, ...};
 - moves: {delineating, sketching, ...};
 - exchanges: {explicit, boundary, conversational, ...};
 - transactions: {exchange, ...};
 - interactions: {transaction, ...}.
The labels at the act level are defined in terms of functional labels assigned to expressions, such as &quot;starter&quot;, structurally realized by a statement, a question, or a command; &quot;informative&quot;, structurally realized by a statement; and &quot;elicitation&quot;, structurally realized by a question. These, together with their functional definitions, represent a closed set of elements. The labels at a higher level are all defined in terms of patterns of labels of the immediately lower level. This set of rules may be regarded as the set of productions that generates communications. In this way, a dialogue/communication is adequately described in terms of a formal generative grammar. An ATN-like grammar of dialogue in these terms has been described in Egan, Forrest, Gardiner et al. (1986) and Reilly (in press).</Paragraph>
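      <Paragraph position="3"> As a minimal sketch of how such productions could be checked against a sequence of communicative acts, the following Python fragment treats act labels as terminal symbols and tests whether a higher-level label derives a given sequence. The production set is hypothetical: only the act labels marker, elicitation and reply come from the classification above, while opening_move, answering_move and the patterns themselves are assumptions made for illustration. It is a plain context-free recognizer, not the ATN-like grammar cited above.

```python
from functools import lru_cache

# Hypothetical productions (an assumption, not the paper's rule set): each
# label maps to alternative patterns of labels from the level below.
PRODUCTIONS = {
    "interaction": [("transaction",), ("transaction", "interaction")],
    "transaction": [("exchange",), ("exchange", "transaction")],
    "exchange": [("opening_move", "answering_move")],
    "opening_move": [("elicitation",), ("marker", "elicitation")],
    "answering_move": [("reply",)],
}


def derives(symbol, acts):
    """Return True if `symbol` derives exactly the given sequence of act labels."""
    acts = tuple(acts)

    @lru_cache(maxsize=None)
    def match(sym, start, end):
        # A terminal symbol must cover exactly one act carrying the same label.
        if sym not in PRODUCTIONS:
            return end - start == 1 and acts[start] == sym
        return any(match_seq(rhs, start, end) for rhs in PRODUCTIONS[sym])

    @lru_cache(maxsize=None)
    def match_seq(rhs, start, end):
        # An empty pattern matches only an empty span.
        if not rhs:
            return start == end
        head, rest = rhs[0], rhs[1:]
        # Every symbol consumes at least one act, so leave room for the rest.
        return any(
            match(head, start, mid) and match_seq(rest, mid, end)
            for mid in range(start + 1, end - len(rest) + 1)
        )

    return match(symbol, 0, len(acts))
```

Under these assumed productions, the sequence elicitation, reply, marker, elicitation, reply is accepted as an interaction made of two exchanges, whereas a sequence that opens with reply is rejected.</Paragraph>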
    </Section>
    <Section position="2" start_page="0" end_page="540" type="sub_section">
      <SectionTitle>
2.2 Communicative Situations
</SectionTitle>
      <Paragraph position="0"> The semantic-pragmatic description relies on the notion of &quot;communicative situation&quot; (CS). A CS is a way of representing the communicative exchange together with its context. It consists of facets, which are aspects of the CS that occur with a certain regularity in all CSs of a given sort. Facets may be formally conceived of as &quot;sorted regularities&quot; in the scene where communication takes place. A CS may therefore be described as $CS = \{f_s, f_t, \ldots\}$, where the subscripts identify the sort of the facet. It is relatively easy to identify the sort of the more frequent regularities, such as who the issuer is ($f_i$), who the receiver is ($f_r$), etc., and to consider these as constituent elements of a CS, around which other facets become, from time to time, relevant.</Paragraph>
      <Paragraph position="1">  Situation Semantics (SS) has been shown (cf. Egan, Ferrari, Harper, et al., 1987) to have some advantages for the representation and the treatment of a CS, provided that certain modifications and extensions to the original description of a discourse situation are carried out. In communication, since more than one, and often more than two, participants are involved, each with different attunements to the CS and different perceptions of what in Situation Semantics is called the speaker's connections, more than one classification of the same CS is possible. In the best case, where participants understand a CS in the same way, communication is successful; otherwise some failure occurs. In general, we can assume that participants in a communicative event are able first to classify, and then to understand, situations on the basis of the situation types they share. In the spirit of SS, we assume that these CS-types are the description of regularities observed in actual communications. An important consequence is that a new notion, relevance or relevant ~-~M, is established in terms of the more frequently observed regularities.</Paragraph>
      <Paragraph position="2"> We can, then, describe the facets of the communicative situation in terms of properties of that situation, where the notion of relevance intervenes at two levels. At the first level, the set of properties is not defined a priori. Different properties are relevant to the interpretation of different utterances in different situations. Some of these are involved more frequently than others in the process of understanding, and may be considered more fundamental than others to a CS. These seem to be the roles of issuer, receiver, location, communication mode, and illocution. The communication mode, i.e., whether communication happens face-to-face, by telephone, or in any other way, may affect both the form of the message and the referring expressions. By illocution, the traditional illocutionary force is meant, although a more fine-grained classification of speech acts is intended (cf. Christie, Egan, Ferrari, et al., 1985). Also, other facets of a CS may occasionally become relevant to the understanding of an utterance.</Paragraph>
      <Paragraph position="3"> At the second level, each property of a CS is taken to be a role participating in an intersecting set of regularities which qualify its sort. Thus, the property $[x \mid \langle\langle l, \text{saying}, x, \alpha \rangle, 1 \rangle]$ describes some indeterminate $x$ saying $\alpha$, and participating in those situations where it is &quot;regular&quot; (nomic) that some $x$ says $\alpha$.</Paragraph>
      <Paragraph position="4"> By further specification we can assume that $[\text{a-tourist} \mid \langle\langle l, \text{saying}, \text{a-tourist}, \alpha \rangle, 1 \rangle]$ participates in those situations in which $x$ is of type a-tourist. In Barwise and Perry's (1985) notation this would be given as $[x \mid \text{in } S\colon \text{a-tourist}, x, \text{yes}]$, where $S$ is the set of situation-types in which a tourist is involved.</Paragraph>
      <Paragraph position="5"> Both properties and types classify real objects that become relevant to a discourse situation in accordance with the relations participants are attuned to. On the basis of this notion of relevance, it is possible to define a large set of types of properties which may or may not appear in one or the other CS. A receiver makes use of these classificatory devices to classify and understand any specific CS with which he or she is presented.</Paragraph>
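      <Paragraph position="6"> The idea that each participant holds a classification of the same CS, and that communication succeeds when those classifications agree, can be illustrated with a small Python sketch. It assumes that a classification can be reduced to a mapping from facet sort to facet value and that agreement is tested only on the sorts currently relevant; the class name, function names and facet values are all hypothetical, and the Situation Semantics machinery discussed above is not modelled.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CommunicativeSituation:
    """One participant's classification of a CS: facet values keyed by sort."""

    facets: Dict[str, str] = field(default_factory=dict)

    def restrict(self, relevant_sorts):
        """Keep only the facets whose sort is currently relevant."""
        return {sort: self.facets.get(sort) for sort in relevant_sorts}


def classified_alike(view_a, view_b, relevant_sorts):
    """True when two participants classify the relevant facets identically."""
    return view_a.restrict(relevant_sorts) == view_b.restrict(relevant_sorts)


# Hypothetical facet values for the issuer's and the receiver's views.
issuer_view = CommunicativeSituation({
    "issuer": "a-tourist",
    "receiver": "clerk",
    "mode": "face-to-face",
    "illocution": "question",
})
receiver_view = CommunicativeSituation({
    "issuer": "a-tourist",
    "receiver": "clerk",
    "mode": "face-to-face",
    "illocution": "statement",
})
```

Here the two views agree on the issuer, receiver and mode facets but diverge on illocution, so the comparison fails as soon as illocution is counted among the relevant sorts.</Paragraph>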
    </Section>
    <Section position="3" start_page="540" end_page="540" type="sub_section">
      <SectionTitle>
2.3 Communicative Situation Structures
</SectionTitle>
      <Paragraph position="0"> The Communicative Situation Structure (CSS) is equivalent in level of analysis to the discourse segment of the Grosz and Sidner (1986) model. The three components of the CSS (see Figure 1) are the communicative act component (CAct), the communicative situation component (CS), and certain properties specific to the CSS itself. A CSS can consist of a number of CSs, and these in turn can consist of a number of CActs. The nature of CActs and CSs has already been discussed above.</Paragraph>
      <Paragraph position="1"> A number of factors serve to distinguish one communicative situation from another. These can involve any change in the context of the dialogue, for example a change in location or a change of speaker. In the case of person-machine communication, it is most likely to involve a change in the speaker or a change in some aspect of the computer's visual display.</Paragraph>
      <Paragraph position="2"> A number of communicative situations go to make up a CSS. What distinguishes one CSS from another is a change in the purpose of the CSS. The CSS is also the repository of information about what entities in the dialogue are currently in focus. This information is used in the resolution of anaphora.</Paragraph>
    </Section>
    <Section position="4" start_page="540" end_page="540" type="sub_section">
      <SectionTitle>
2.4 Structural Relationships
</SectionTitle>
      <Paragraph position="0"> A CSS can be related to another CSS in a limited way.</Paragraph>
      <Paragraph position="1"> The relationship can only be hierarchical, and it represents a route through which information relating to the focus of attention can be transmitted. If the focus of attention is on one CSS, definite noun phrases and anaphora in general can be resolved either from entities in focus within the current CSS or from the focus space of a CSS that is connected to the current one.</Paragraph>
      <Paragraph position="2"> Figure 2 represents a structured collection of CSSs.</Paragraph>
      <Paragraph position="3"> As can be seen, they consist of a number of tree fragments, rather than one large tree. Such a situation can occur if the purpose of a dialogue is to achieve a number of distinct goals, which cannot be integrated under a dominating CSS.</Paragraph>
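      <Paragraph position="4"> The hierarchical links and the resolution route they provide can be sketched in Python as follows. This is an illustrative data structure only: the class name CSS, its fields, and the search order (current CSS first, then its parent, then its children) are assumptions, since the text states only that connected focus spaces may be consulted.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class CSS:
    """A communicative situation structure with a purpose and a focus space."""

    purpose: str
    focus_space: List[str] = field(default_factory=list)
    parent: Optional["CSS"] = None
    children: List["CSS"] = field(default_factory=list)

    def add_child(self, child):
        """Attach `child` below this CSS and return it."""
        child.parent = self
        self.children.append(child)
        return child

    def connected(self):
        """CSSs linked to this one by a hierarchical relationship."""
        linked = list(self.children)
        if self.parent is not None:
            linked.insert(0, self.parent)
        return linked


def resolve(current, matches):
    """Return the first in-focus entity accepted by `matches`.

    The current CSS is searched first, then the CSSs connected to it.
    """
    for css in [current] + current.connected():
        for entity in css.focus_space:
            if matches(entity):
                return entity
    return None


# Two unrelated roots form a collection of tree fragments rather than one tree.
table_request = CSS("tabulate ages", ["3 year degree course", "age groups"])
total_request = table_request.add_child(CSS("report total", ["male and female students"]))
other_fragment = CSS("unrelated goal", ["graduate course"])
```

With this layout, a reference to a course made while total_request is in focus resolves to the entity held by its parent, and nothing is borrowed from the unconnected fragment.</Paragraph>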
    </Section>
  </Section>
  <Section position="4" start_page="540" end_page="542" type="metho">
    <SectionTitle>
3 PRAGMATIC DIMENSIONS
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="540" end_page="541" type="sub_section">
      <SectionTitle>
3.1 Attentional State
</SectionTitle>
      <Paragraph position="0"> The disembodied arrow in Figure 2 represents the current focus of attention. The focus of attention sets bounds on what are valid targets for anaphoric reference within a CSS. This focus shifts automatically as a new CSS is created. It can also be shifted by one or other of the dialogue participants explicitly requesting a shift of focus back to a previous topic in the dialogue. However, there is a constraint put on this shift. When moving from one tree fragment to another, the focus of attention can only shift to the top-most node of the target tree. From there, it may traverse the subordinate nodes of the tree to locate the appropriate CSS. This restriction reflects the fact that when a dialogue participant returns to a previously active topic in the dialogue, he or she tends to proceed from the general to the specific aspect of that topic. Traversal of the CSS tree from top to bottom represents such a transition.</Paragraph>
      <Paragraph position="1"> The component of the model operated upon by the attentional mechanism is the focus space. This consists of a list of items that we call discourse objects. The entities on the list can either have properties in their own right, or can inherit them from higher up in a classification hierarchy. The reason for having highly structured objects in the focus space is to allow for the resolution of anaphoric references of the following type (after Sidner, 1979):
 A: I saw John's Irish Wolfhound yesterday.
 B: Yes. They're really big dogs.</Paragraph>
      <Paragraph position="2"> In (B) the phrase They're does not refer back to any specific entity mentioned in (A), but rather to the class of dogs of which John's is a member. In order successfully to resolve this reference, knowledge needs to be available to the resolution process concerning the class of entities to which the specific Irish Wolfhound mentioned belongs. The way this is achieved in the model described here is to allow the entities in the focus space to inherit properties via a classification hierarchy.</Paragraph>
    </Section>
    <Section position="2" start_page="541" end_page="542" type="sub_section">
      <SectionTitle>
3.2 Intentional Structure
</SectionTitle>
      <Paragraph position="0"> As has been pointed out in the description of the dialogue structures, the topmost element of the structural hierarchy (the CSS) contains a pointer into a structure representing the purpose of the CSS.</Paragraph>
      <Paragraph position="1"> Grosz and Sidner refer to the set of such CSS purposes as the intentional structure of the dialogue. In essence, the CSS purposes are elements in the plan underlying the dialogue. In the case of a person-machine dialogue system, they are the actions that the user wishes the system to perform. There are two relationships that can hold between elements of the intentional structure: dominance and satisfaction-precedence. These represent goal/subgoal and precondition relationships, respectively. The hierarchy of intentional elements is more or less isomorphic to the dialogue structure, as can be seen in Figure 3. Here, the dialogue structure is represented by white boxes and the underlying intentional structure by shaded boxes. Also note that the intentional structure may be expanded by an inferential process, without there being a corresponding node in the dialogue structure.</Paragraph>
      <Paragraph position="2"> The specific details of the intentional structure, unlike those of the dialogue structure, depend on the dialogue domain. In the following example of an application of the model, the domain is that of database interaction, with the user performing the specific task of tabulating data about students' ages and courses. Each intentional component represents an action of tabulation, and the place that the action has in the intentional hierarchy is determined by the complexity of the table requested (or inferred).</Paragraph>
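      <Paragraph position="3"> One way to picture the two relationships is as ordering constraints over purposes, as in the Python sketch below. It assumes that a subgoal is satisfied before the goal that dominates it and that a purpose which satisfaction-precedes another comes first; the purpose names and the function satisfaction_order are hypothetical. The ordering is computed with the standard-library graphlib module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical purposes for the tabulation domain: goal -> dominated subgoals.
DOMINATES = {
    "sex_by_age_by_course": ["sex_by_age_3_year", "sex_by_age_graduate"],
}

# Hypothetical satisfaction-precedence pairs, written (earlier, later).
SATISFACTION_PRECEDES = [("sex_by_age_3_year", "sex_by_age_graduate")]


def satisfaction_order(dominates, precedes):
    """Return one order in which the purposes could be satisfied.

    Assumption: subgoals come before the goal that dominates them, and the
    earlier member of each satisfaction-precedence pair comes first.
    """
    prerequisites = {}
    for goal, subgoals in dominates.items():
        prerequisites.setdefault(goal, set()).update(subgoals)
        for subgoal in subgoals:
            prerequisites.setdefault(subgoal, set())
    for earlier, later in precedes:
        prerequisites.setdefault(earlier, set())
        prerequisites.setdefault(later, set()).add(earlier)
    # TopologicalSorter expects a mapping from node to its predecessors.
    return list(TopologicalSorter(prerequisites).static_order())


order = satisfaction_order(DOMINATES, SATISFACTION_PRECEDES)
```

Any order returned places the two course-level tabulations before the dominating three-way tabulation, with the 3 year course handled before the graduate course.</Paragraph>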
    </Section>
  </Section>
  <Section position="5" start_page="542" end_page="542" type="metho">
    <SectionTitle>
4 A SAMPLE APPLICATION
</SectionTitle>
    <Paragraph position="0"> The following dialogue (except S8) is taken from a corpus of simulated person-machine dialogues collected for the studies described in Egan, Harris, Harper, and Reilly (1986). S8 is inserted to illustrate how an inferred intention can be used by the system to direct the dialogue.</Paragraph>
    <Paragraph position="1"> U1: How many students, both male and female, under 16 or younger in the year degree course?
 S2: There are no students of that age group in the College.</Paragraph>
    <Paragraph position="2"> U3: Again in the 3 year degree course, how many male and female students in the following age groups: 19 20 21 22 23 25 or older
 S4: Here is the table.</Paragraph>
    <Paragraph position="3"> U5: Total number of both male and female students in this course of study
 S6: 153 males and 559 females.</Paragraph>
    <Paragraph position="4"> U7: Please supply a breakdown of both male and female students in the graduate course.</Paragraph>
    <Paragraph position="5"> S8: Do you wish to see a complete sex by age by course breakdown?
 Figure 4 illustrates the unfolding of both the dialogue and intentional structures (the numbers in the boxes correspond to utterances). The intentional structure underlying S8 is inferred on the basis that the user has asked for the same breakdown for two courses and may therefore wish to have a three-way breakdown for all courses. This inference then gives rise to utterance S8, which is incorporated into the dialogue structure. The left of Figure 4 represents the state of the dialogue and intentional structures up to and including utterance U7.</Paragraph>
  </Section>
  <Section position="6" start_page="542" end_page="542" type="metho">
    <SectionTitle>
, Dialogue
</SectionTitle>
    <Paragraph position="0"> The right of the figure represents the structures after S8.</Paragraph>
    <Paragraph position="1"> In U5, the reference to an unspecified course (underlined) requires that a referent be found. The bidirectional links in the discourse structure allow information from the focus spaces of the connected nodes to be accessed in the resolution process.</Paragraph>
    <Paragraph position="2"> Thus, the anaphoric reference in U5 can be resolved by accessing the focus space of utterances 3 and 4. Note that the small disembodied arrows in Figure 4 indicate the current attentional state of the dialogue.</Paragraph>
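    <Paragraph position="3"> The inference that leads to S8 can be sketched in Python under strong simplifying assumptions: each user request is reduced to a pair of tabulation dimensions and a course, and a dominating request is proposed once the same dimensions have been asked for in at least two different courses. The function name, the request encoding and the threshold parameter are hypothetical and are not taken from the system described here.

```python
def infer_dominating_request(requests, threshold=2):
    """Propose a breakdown across courses when one breakdown keeps recurring.

    `requests` is a list of (dimensions, course) pairs. If the same set of
    dimensions has been requested for at least `threshold` different courses,
    return those dimensions extended with "course"; otherwise return None.
    """
    courses_by_dimensions = {}
    for dimensions, course in requests:
        key = frozenset(dimensions)
        courses_by_dimensions.setdefault(key, set()).add(course)
    for key, courses in courses_by_dimensions.items():
        if len(courses) >= threshold:
            return tuple(sorted(key)) + ("course",)
    return None


# Hypothetical encoding of the user's two tabulation requests.
history = [
    (("sex", "age"), "3 year degree course"),
    (("sex", "age"), "graduate course"),
]
proposal = infer_dominating_request(history)
```

With the two requests encoded as above, the proposal covers age, sex and course, which corresponds to the three-way breakdown that the system offers in S8.</Paragraph>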
  </Section>
</Paper>