<?xml version="1.0" standalone="yes"?>
<Paper uid="J81-1001">
  <Title>Determining Verb Phrase Referents in Dialogs</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
SRI International
</SectionTitle>
    <Paragraph position="0"> This paper discusses two problems central to the interpretation of utterances: determining the relationship between actions described in an utterance and events in the world, and inferring the &amp;quot;state of the world&amp;quot; from utterances. Knowledge of the language, knowledge about the general subject being discussed, and knowledge about the current situation are all necessary for this. The problem of determining an action referred to by a verb phrase is analogous to the problem of determining the object referred to by a noun phrase.</Paragraph>
    <Paragraph position="1"> This paper presents an approach to the problem of determining verb phrase referents in which knowledge about language, the subject area, and the dialog itself is combined to interpret such references. Presented and discussed are the kinds of knowledge necessary for interpreting references to actions, as well as algorithms for using that knowledge in interpreting dialog utterances about ongoing tasks and for drawing inferences about the task situation that are based on a given interpretation.</Paragraph>
  </Section>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
1. Introduction
</SectionTitle>
    <Paragraph position="0"> This paper discusses two problems central to the interpretation of utterances: determining the relationship between actions described in an utterance and events in the world, and inferring the current world state from utterances. Knowledge of the language, knowledge about the general subject area, and knowledge about the current situation are all necessary for this. The problem of determining an action referred to by a verb phrase is analogous to the problem of determining the object referred to by a noun phrase. Although considerable attention has been given to the latter (Donellan, 1977; Grosz, 1977a, 1977b; Sidner, 1979; Webber, 1978), little has been done with the former (see footnote 2). The need to identify an action is obvious in utterances containing verbs like &quot;do&quot;, &quot;have&quot;, and &quot;use&quot;, as in &quot;I've done it&quot;, &quot;what tool should I use?&quot;, or &quot;I have it&quot;. In these utterances the verb does not name the action, but rather refers to it more generally, much as pronouns or &quot;nonspecific&quot; nouns (e.g., &quot;thing&quot;) refer to objects. Even when more specific verbs are used, complex reasoning may be required to ascertain the particular action being referred to. For example, the utterance &quot;I've glued the pieces together&quot; can refer to different steps in a task, depending on what objects &quot;the pieces&quot; refers to, because each gluing action is a different step. Similarly, the verb &quot;cut&quot; refers to different types of cutting actions when used with different objects, as in &quot;cut grass&quot;, &quot;cut wood&quot;, or &quot;cut cake&quot; (Searle, 1978).</Paragraph>
    <Paragraph position="1"> 1 This research has been funded under three-year NSF Continuing Research Grant No. MCS76-22004. This paper and the research reported in it have benefited from interactions with all the members of the natural language research group at SRI. Barbara Grosz, Jerry Hobbs, Gary Hendrix, and Jane Robinson have been particularly helpful in the preparation of the paper.</Paragraph>
    <Paragraph position="2"> 2 A problem related to determining verb phrase referents -- interpreting verb phrase ellipsis -- has been investigated by Webber (1978).</Paragraph>
    <Paragraph position="3"> A variant of this problem is deciding whether a verb is intended to refer to a general or a specific action. For example, &quot;cutting wood&quot; can refer to the general activity of cutting many pieces of wood or it can refer to the action of cutting a particular piece (Werner, 1966).</Paragraph>
    <Paragraph position="4"> This paper presents an approach to these problems in which knowledge about language, the subject area, and the dialog itself is combined to interpret references by verbs. Presented and discussed are the kinds of knowledge necessary for interpreting references to actions, as well as algorithms for using that knowledge in interpreting dialog utterances about ongoing tasks and for drawing inferences about the task situation that are based on a given interpretation. The algorithms have been implemented and tested in a computer system (TDUS) that participates in a dialog about the assembly of an air compressor (Robinson et al., 1980). The system acts as an expert, guiding an apprentice through the steps of the task. The knowledge available will be described first, followed by a detailed description of the algorithms for verb interpretation, then by a discussion of a sample dialog in which the system participated.</Paragraph>
    <Paragraph position="5"> Copyright 1981 by the Association for Computational Linguistics. Permission to copy without fee all or part of this material is granted provided that the copies are not made for direct commercial advantage and the Journal reference and this copyright notice are included on the first page. To copy otherwise, or to republish, requires a fee and/or specific permission. 0362-613X/81/010001-16$01.00. American Journal of Computational Linguistics, Volume 7, Number 1, January-March 1981. Ann E. Robinson, Determining Verb Phrase Referents in Dialogs.</Paragraph>
  </Section>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2. Knowledge Needed
</SectionTitle>
    <Paragraph position="0"> Interpreting any utterance and relating it to a task requires knowledge about the language and the task, as well as the relationships between them. This paper will concentrate on knowledge needed to identify actions. It builds directly on the concepts of global and immediate focusing, through which certain entities are highlighted (Grosz, 1977a, 1977b, 1978; Sidner, 1979). General familiarity with that research will be assumed. More detailed descriptions of other aspects of the knowledge needed for interpreting utterances can be found elsewhere (Grosz, 1977a; Hendrix, 1977, 1979; Robinson et al., 1980; J. Robinson, 1980).</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 Actions and Events
</SectionTitle>
      <Paragraph position="0"> Interpreting verb phrases requires knowing about events that have occurred, are occurring, or can occur.</Paragraph>
      <Paragraph position="1"> Such knowledge typically includes the steps necessary to perform the actions associated with the events, the possible participants, the conditions that must be true before the actions can be performed, and their effects.</Paragraph>
      <Paragraph position="2"> Knowledge about actions and events includes both general knowledge about possible actions and events and more specific knowledge about those that occur during a particular task.</Paragraph>
      <Paragraph position="3"> We have developed a formalism, process models, for encoding information about actions (Grosz et al., 1977). This formalism enables the specification of a hierarchical decomposition of actions into subactions, as well as the description of individual types of actions. It is an extension of the network formalism used for representing other knowledge about objects and relationships, as described by Hendrix (1979).</Paragraph>
      <Paragraph position="4"> The description of each action type includes information about its participating actors and objects, the preconditions for its enactment, its effects, and the alternative sequences of substeps that may be followed to accomplish it. A sequence of substeps may be partially ordered. This decomposition of actions builds upon earlier research on the hierarchical decomposition of the planning process (Sacerdoti, 1977) and upon the work by Hendrix (1973, 1975) on modeling actions and processes. Many of the actions for a pump-assembly task have been encoded in this formalism for use in the TDUS system.</Paragraph>
      <Paragraph position="5"> Figure 1 illustrates a process model for a pump-attaching process. The network node ATTACH PUMP represents the set of pump-attaching actions.</Paragraph>
      <Paragraph position="6"> The large box depicts a separate space in the network in which the schema of the ATTACH PUMP action is represented. The DELIN arc links the schema to the ATTACH PUMP node. The schema specifies the participants in the attach operation, marked by the MAJORPART, MINORPART, and AGENT arcs.</Paragraph>
      <Paragraph position="7"> The description of the action, an element of the set of EVENT DESCRIPTIONS, includes the PRECONDITIONS that must be true for the action to be performed, the EFFECTS of performing the action, and the PLOT or steps by which the action is performed.</Paragraph>
      <Paragraph position="8"> Each step in the plot (encoded on a separate space) is in turn further described by a process model. In this example, the substeps of attaching are positioning and bolting the pump. Their ordering is indicated by the SUC (successor) link. The plot steps have many of the same participants as the main action. In addition the second plot step, &amp;quot;secure with bolts&amp;quot;, introduces another set of participants, BOLTS, indicated by the FASTENER arc.</Paragraph>
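      <Paragraph> As an illustration only, the process model just described can be approximated with a small record type. This is a sketch under stated assumptions, not the network encoding used in TDUS: the class name ProcessModel, the helper all_arcs, and every participant filler, precondition, and effect string are hypothetical placeholders, and list order stands in for the SUC link.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessModel:
    """One action schema: participants, preconditions, effects, and plot."""
    name: str
    participants: dict                                   # arc label -> participant
    preconditions: list = field(default_factory=list)
    effects: list = field(default_factory=list)
    plot: list = field(default_factory=list)             # substeps, in SUC order


def all_arcs(model):
    """Collect the participant arcs of an action and, recursively, of its plot steps."""
    arcs = set(model.participants)
    for step in model.plot:
        arcs |= all_arcs(step)
    return arcs


# Fillers, preconditions, and effects are placeholders, not read off Figure 1.
shared = {"AGENT": "apprentice", "MAJORPART": "platform", "MINORPART": "pump"}
position = ProcessModel("POSITION PUMP", dict(shared), effects=["pump positioned"])
secure = ProcessModel(
    "SECURE WITH BOLTS",
    dict(shared, FASTENER="bolts"),        # extra participant set introduced by this step
    preconditions=["pump positioned"],
    effects=["pump secured"],
)
attach = ProcessModel(
    "ATTACH PUMP",
    dict(shared),
    effects=["pump attached"],
    plot=[position, secure],               # list order stands in for the SUC link
)

print([step.name for step in attach.plot])
print(sorted(all_arcs(attach)))
```

In this sketch the FASTENER arc belongs only to the second plot step, mirroring the observation that plot steps share most participants with the main action but may introduce their own.</Paragraph>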
      <Paragraph position="9"> During a task, a record of progress is kept by filling in, or instantiating, the schema for an action as that action is performed and then incorporating the newly created piece of network into the model of the current situation. Records of actions are linked both temporally by a time lattice and through their taxonomic relationships with other events and objects in the task. Each instantiated action has associated with it a time interval. The interval can be past, present, or future, and it can be bounded by two times: a start time and an end time. For events treated as points, the start and end times are identical. For events whose start and/or end time is not precisely known, the values may be left unspecified or represented by parameters that are bounded above and/or below by known points in the time lattice.</Paragraph>
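      <Paragraph> The time interval attached to an instantiated action can be sketched as follows. The sketch makes simplifying assumptions that are not part of the formalism above: integers stand in for points in the time lattice (which makes the ordering total rather than partial), None stands for an unspecified start or end time, and the names ActionRecord, is_point, and status are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActionRecord:
    """An instantiated action with a possibly incomplete time interval."""
    action: str
    start: Optional[int] = None    # None: start time not precisely known
    end: Optional[int] = None      # None: end time not known, e.g. still in progress

    def is_point(self):
        """Events treated as points have identical start and end times."""
        return self.start is not None and self.start == self.end

    def status(self, now):
        """Classify the interval as past, present, or future relative to now."""
        if self.end is not None and now >= self.end:
            return "past"
        if self.start is not None and self.start > now:
            return "future"
        if self.start is not None:
            return "present"
        return "unknown"


history = [
    ActionRecord("POSITION PUMP", start=1, end=3),
    ActionRecord("SECURE WITH BOLTS", start=4),
    ActionRecord("INSTALL PULLEY", start=7, end=9),
]
print([(record.action, record.status(now=5)) for record in history])
```

A fuller treatment would replace the integer comparison with reachability in the lattice, so that parameters bounded above or below by known points could also be classified.</Paragraph>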
      <Paragraph position="10"> Once an instance of an event is recorded, it can be used in subsequent deductions and is available for answering questions about past events. This provides a means of maintaining an up-to-date record of assembly progress. Such a record comprises an essential part of the domain context within which utterances are interpreted and questions answered. At any given moment the domain context indicates what assembly actions have already occurred (and in what order), what actions are in progress, and what actions can be initiated next.</Paragraph>
      <Paragraph position="11"> We have developed procedures for reasoning about process models. These procedures build upon those that embody general knowledge about logical deduction (Fikes and Hendrix, 1977). These new proce-</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
</SectionTitle>
    <Paragraph position="0"> a goal is current or achieved, and how goals are represented (see footnote 4). In the following sections, we will see how these goals are used for interpreting verbs.</Paragraph>
    <Paragraph position="1"> The TDUS system handles two kinds of goals: domain goals and certain knowledge-state goals. Domain goals concern states to be achieved by task-related actions, while knowledge-state goals concern states to be achieved by acquiring a specific piece of information. Figure 2 illustrates the relationship between actions and goals. The hierarchy shown is a simplification of a portion of the assembly task hierarchy currently encoded in TDUS (see footnote 5). Each node represents an action and its associated goal. The hierarchy encodes the substep relationships: child nodes represent substeps of their parent nodes. The top-level node in the tree, node 1, represents the action of attaching a pump whose associated goal is that the pump be attached.</Paragraph>
    <Paragraph position="2"> Nodes 2 and 3 represent substeps of this attaching process -- the actions of positioning the pump and tightening the bolts, with the associated goals that the pump be positioned and that the bolts be tight. The action of locating bolts represented by node 4 is not an explicit step in the task, but is necessary for its performance. Node 4 has an associated knowledge-state goal: &amp;quot;know the location of the bolts&amp;quot;. All these goals have associated actions that, in the process model formalism, are specific instantiations of actions, not action schemata.</Paragraph>
    <Paragraph position="3"> We distinguish two classes of goals: direct goals, which are achieved by actions the apprentice has explicitly or implicitly said are being performed now or have been performed, and potential goals, which have been mentioned by either participant and have not been acted upon but might be. Both domain and knowledge-state goals can be either direct or potential, although the current implementation of TDUS does not support potential knowledge-state goals.</Paragraph>
    <Paragraph position="4"> In the context of the task steps shown in Figure 2, &quot;I am attaching the pump&quot; states that an attaching action (node 1) is being performed. This establishes the direct domain goal that the pump be attached.</Paragraph>
    <Paragraph position="5"> &amp;quot;Should I tighten the bolts?&amp;quot; indicates that the tightening action (node 3) might be performed, establishing the potential domain goal that the bolts be tight.</Paragraph>
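    <Paragraph> The two examples above can be restated as data. The following is only a sketch: the Goal record, the NODE_GOALS table, and the establish function are hypothetical names, and the single committed flag is a deliberate simplification of how an utterance signals whether an action is being performed or merely considered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Goal:
    node: int       # node number in Figure 2
    state: str      # the state to be achieved
    kind: str       # "domain" or "knowledge-state"
    status: str     # "direct" or "potential"


# Goal states per node, paraphrased from the description of Figure 2.
NODE_GOALS = {
    1: ("the pump is attached", "domain"),
    2: ("the pump is positioned", "domain"),
    3: ("the bolts are tight", "domain"),
    4: ("know the location of the bolts", "knowledge-state"),
}


def establish(node, committed):
    """Return the goal an utterance establishes for the action at a node.

    committed=True models an utterance saying the action is being or has been
    performed; committed=False models one that only raises the possibility.
    """
    state, kind = NODE_GOALS[node]
    return Goal(node, state, kind, "direct" if committed else "potential")


attaching = establish(1, committed=True)      # "I am attaching the pump"
tightening = establish(3, committed=False)    # "Should I tighten the bolts?"
print(attaching)
print(tightening)
```

Keeping kind and status as separate fields reflects the point that the domain versus knowledge-state distinction is independent of the direct versus potential one.</Paragraph>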
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
</SectionTitle>
    <Paragraph position="0"> 4 The current implementation of goals in TDUS is an extension and partial revision of one by Sidner described in her dissertation (1979).</Paragraph>
    <Paragraph position="1"> 5 Although the assembly task currently encoded in TDUS involves strong structuring of actions and goals, the representations and procedures we have developed are applicable to less structured subject areas.</Paragraph>
    <Paragraph position="2">  A direct knowledge-state goal can be established, for example, by the utterance &amp;quot;where are the bolts?&amp;quot;, which establishes the knowledge-state goal &amp;quot;know the location of the bolts&amp;quot; (node 4). A potential knowledge-state goal would be established by an utterance such as &amp;quot;I'd like to read more Plato&amp;quot; which implies the potential knowledge-state goal of knowing more about the philosophy of Plato.</Paragraph>
    <Paragraph position="3"> Direct and potential goals are distinguished from one another because of the different roles they play in the interpretation of verbs. Basically, direct goals are those that are known to be current or former goals associated with actions that are being or have been performed. Potential goals are possible near-term goals associated with possible future actions. Depending on the type of utterance, one or the other class of goal might be considered first. The different roles of the two goal classes will be illustrated when the interpretation of verbs is discussed in detail in Section 3. In the TDUS system, a potential goal can be introduced either by the apprentice who is performing the task or by the system which is acting as an expert advisor. These goals can be introduced in at least three different ways.</Paragraph>
    <Paragraph position="4"> (1) The apprentice can introduce a potential goal by mentioning a possible future action, while not explicitly stating that it will be performed. This distinguishes between &amp;quot;I am going to take the lid off now&amp;quot; and &amp;quot;should I take the lid off now?&amp;quot; The former expresses a direct goal because the speaker explicitly says s/he is planning to perform the action. The latter expresses a potential goal because the speaker has not made a commitment to performing the action, but implies that s/he might. When a potential action is mentioned in this way, if it is an appropriate next step</Paragraph>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
</SectionTitle>
    <Paragraph position="0"> in the task, the system will establish the associated goal as a potential goal. For example, &quot;Should I tighten the bolts now?&quot; will cause the system to establish the potential goal &quot;that the bolts be tight&quot; if the appropriate reply is &quot;yes&quot;.</Paragraph>
    <Paragraph position="1"> (2) The expert can introduce a potential goal by telling the apprentice what actions to perform. The goal is potential and not direct, because the expert cannot, on the basis of the utterance alone, assume that the apprentice will perform the action. For example, the expert's reply to &amp;quot;What should I do now?&amp;quot; will cause establishment of the potential goal -- or goals if there are multiple possibilities -- associated with the action in the reply.</Paragraph>
    <Paragraph position="2"> (3) The apprentice can also introduce a potential goal by indirectly mentioning an action in the task. For example, if the apprentice says &amp;quot;I found the pulley.&amp;quot; in a situation in which one of the next steps is to install the pulley, but neither the installation nor the pulley has been mentioned before, the potential goal &amp;quot;that the pulley be installed&amp;quot; will be inferred from the reference to the pulley and the knowledge that it is a possible next step. This forward reference to an object implicitly focuses the object and the step it is associated with. Previously, algorithms for shifting focus caused a shift to the step associated with the object (Grosz, 1977b). However, this is problematic because the speaker may not intend to perform the step or even discuss it, but rather intends to talk about the object. Establishing the step in which the object participates as a potential goal highlights the step but does not force a shift of focus to it. This change has proved to be important, as will be seen during discussion of the algorithm.</Paragraph>
    <Paragraph position="3"> Utterances can introduce direct and potential goals simultaneously. In the examples in items 1 and 2 above, direct knowledge-state goals are also being introduced. In particular, the knowledge-state goals are &quot;knowing whether tightening the bolts is the next step&quot; and &quot;knowing the action to perform&quot;. As important as recognizing a goal is recognizing whether the goal is the current one, one that has already been achieved, or one that has been abandoned. Recognizing when goals are no longer potential is also important.</Paragraph>
    <Paragraph position="4"> A direct goal is assumed to be current when an utterance states that an action that will achieve the goal is in progress.</Paragraph>
    <Paragraph position="5"> A goal is assumed to have been achieved either (1) when an explicit statement such as &quot;I have attached it&quot; or &quot;I'm done&quot; or &quot;OK&quot; (see footnote 6) indicates the completion of the action achieving the goal; (2) when an explicit statement indicates an action intended to achieve the goal is finished; or (3) when the start of a new action implies completion of the current one and thus achievement of the associated goal (see footnote 7). A goal is assumed to have been abandoned following an utterance such as &quot;never mind&quot;.</Paragraph>
    <Paragraph position="6"> Potential goals cannot be achieved as such. Rather, they can either become direct goals through the mechanisms for establishing direct goals or they disappear when a new potential goal is recognized.  The structure of goals in a dialog about a task is related both to the structure of the task and to the structure of the dialog. The structure of tasks and the structure of dialogs have been discussed elsewhere (Grimes, 1980; Grosz, 1977a, 1977b, 1978; Hobbs, 1978; Reichman, 1978; Sacerdoti, 1977; Sidner, 1979; Wilensky, 1978). Open questions remain about the structure of the goals that arise and how they should be represented.</Paragraph>
    <Paragraph position="7"> In TDUS direct goals are represented in a single list, acting like a last-in-first-out stack. Both knowledge-state and domain goals are entered on the same list. This simplification has proved adequate for current purposes.</Paragraph>
    <Paragraph position="8"> In general, there can be only one potential goal at a time. The exception is when two possible actions are introduced at once, as in &amp;quot;install the aftercooler or install the brace&amp;quot;. Because it is simplest to view a potential goal as a single item, hereafter references to the potential goal can be read as referring to the possible conjunction or disjunction of potential goals when appropriate.</Paragraph>
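    <Paragraph> The bookkeeping just described, a single last-in-first-out list of direct goals plus at most one potential goal, can be sketched as below. This is not the TDUS implementation: the class GoalState and its method names are hypothetical, goals are plain strings, a disjunction is modeled as a tuple, and popping the stack on achievement is an assumption about how the list is maintained.

```python
class GoalState:
    """Direct goals on one LIFO list, plus a single potential-goal slot."""

    def __init__(self):
        self.direct = []         # top of the stack is the end of the list
        self.potential = None    # one goal, or a tuple for a disjunction

    def propose(self, goal):
        """A newly recognized potential goal replaces any earlier one."""
        self.potential = goal

    def commit(self, goal):
        """Establish a direct goal; a matching potential goal stops being potential."""
        if self.potential == goal:
            self.potential = None
        self.direct.append(goal)

    def current(self):
        """Return the most recently established direct goal, if any."""
        return self.direct[-1] if self.direct else None

    def achieve(self):
        """Remove and return the current direct goal (assumed behavior)."""
        return self.direct.pop() if self.direct else None


goals = GoalState()
goals.commit("pump attached")                                  # domain goal
goals.propose("bolts tight")
goals.propose(("aftercooler installed", "brace installed"))    # replaces the earlier one
goals.commit("know the location of the bolts")                 # knowledge-state goal, same list
print(goals.current(), goals.potential)
```

Because both kinds of goal share one list, answering the question about the bolts exposes the domain goal underneath it again.</Paragraph>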
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.3 Knowledge about Language
</SectionTitle>
      <Paragraph position="0"> To interpret verbs and infer the current task and dialog situation, the knowledge outlined above must be combined with knowledge about the language including what is generally characterized as syntactic, semantic, and discourse knowledge.</Paragraph>
    </Section>
  </Section>
  <Section position="7" start_page="0" end_page="0" type="metho">
    <SectionTitle>
</SectionTitle>
    <Paragraph position="0"> 6 See the discussion in Grosz (1977a) of the roles of OK. 7 As Sidner (1979) points out, in the first two cases the information comes from the utterance, while in the third case it is from the task model.</Paragraph>
    <Paragraph position="1"> One of the most important elements of syntactic knowledge necessary for interpreting verbs -- and the one discussed here -- is knowledge about tense and aspect. Tense and aspect are used to indicate the relative time of an event and whether it is or was occurring or completed.</Paragraph>
    <Paragraph position="2"> Tense and aspect are indicated syntactically by auxiliaries and/or certain verb forms. In TDUS, utterances are analyzed and marked for tense (past, present, or future) and for progressive (event in progress) and perfective (event completed) aspect.</Paragraph>
    <Paragraph position="3"> The following are some examples of the verb forms TDUS can interpret along with their tense and aspect markings: &quot;I am going.&quot; -- present, progressive.</Paragraph>
    <Paragraph position="4"> &quot;I had gone.&quot; -- past, perfective.</Paragraph>
    <Paragraph position="5"> &quot;I had been going.&quot; -- past, perfective, progressive.</Paragraph>
    <Paragraph position="6"> &quot;I will be going.&quot; -- future, progressive.</Paragraph>
    <Paragraph position="7"> In determining referents, the tense and aspect of the utterance restrict the alternatives within the task model and limit the goals that might be considered. Generally, present tense and progressive aspect are used when referring to a new action, indicating that it has been started. Only if the utterance is somehow marked, as in &quot;I'm still tightening the bolts&quot;, will the verb phrase refer to an action that already has been mentioned as in progress. Similarly, past tense and/or perfect aspect indicate that an action has been finished. However, the hearer may or may not have known that the action was in progress.</Paragraph>
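    <Paragraph> The tense and aspect markings for the four example verb forms can be reproduced by a naive rule over the auxiliaries. This is a toy sketch, not the grammar used in TDUS: the function mark and the two word lists are hypothetical, it assumes a one-word subject followed by at least one auxiliary and a main verb, and it would mislabel sentences such as those in which have is the main verb.

```python
BE_FORMS = ("am", "is", "are", "was", "were", "be", "been", "being")
HAVE_FORMS = ("have", "has", "had")


def mark(utterance):
    """Return tense and aspect markings for a subject plus verb-group utterance."""
    words = utterance.lower().rstrip(".").split()
    group = words[1:]                          # assumes a one-word subject
    auxiliaries, main = group[:-1], group[-1]
    if auxiliaries[:1] in (["will"], ["shall"]):
        tense = "future"
    elif auxiliaries[:1] in (["had"], ["was"], ["were"]):
        tense = "past"
    else:
        tense = "present"
    marks = [tense]
    if any(word in HAVE_FORMS for word in auxiliaries):
        marks.append("perfective")             # a form of have among the auxiliaries
    if main.endswith("ing") and any(word in BE_FORMS for word in auxiliaries):
        marks.append("progressive")            # a form of be plus an -ing main verb
    return ", ".join(marks)


for example in ("I am going.", "I had gone.", "I had been going.", "I will be going."):
    print(example, "--", mark(example))
```

The point of the sketch is only that the markings are recoverable from auxiliaries and verb form, as stated above; a real analysis would come from the parser.</Paragraph>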
    <Paragraph position="8"> So far, we have considered primarily verbs that refer to events rather than states, and to the usage that is most common in dialogs about tasks, such as references to single occurrences of actions. However, the analysis and representation are compatible with analyses that consider other kinds of usage (Leech, 1976).</Paragraph>
    <Paragraph position="9"> The interpretation of references to actions and events requires knowledge of the relationship between words for actions or events and the internal representations of the corresponding classes of actions or events (see footnote 8); it also requires knowledge of the relationship between nouns and entities in the domain. For example, the &quot;SELLING&quot; action is an action whose participants include a buyer, seller, some object being sold, and some money. Semantic knowledge about selling would include the information that for an utterance whose main verb is &quot;sell&quot; in the active voice, the syntactic subject is the &quot;seller&quot; in a selling event, the syntactic object is the item sold, the indirect object is the one to whom the item is sold, and the object of the &quot;for&quot; preposition is the selling price. The information necessary to make this mapping and to build the appropriate representation is encoded with the verb (Hendrix in Walker, 1978; Konolige, 1979).</Paragraph>
    <Paragraph position="10"> 8 Note that at the beginning of a dialog only the relationships between words and classes of concepts are known. The problem addressed here is how to identify the particular action or event referenced in a particular utterance.</Paragraph>
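    <Paragraph> The mapping from syntactic positions to participants in a selling event can be sketched as a table stored with the verb. Everything named here is hypothetical: the table SELL_ROLES, the function build_event, the dictionary form of the parse, and the example sentence are illustrations, not the representation built by TDUS.

```python
# Mapping stored with the verb "sell" (active voice): syntactic slot -> participant role.
SELL_ROLES = {
    "subject": "seller",
    "object": "item sold",
    "indirect object": "buyer",
    "for": "price",
}


def build_event(event_class, role_map, parse):
    """Build an event description by relabeling parsed constituents with roles."""
    event = {"class": event_class}
    for slot, filler in parse.items():
        if slot in role_map:
            event[role_map[slot]] = filler
    return event


# Invented example: "John sold Mary the pump for ten dollars."
parse = {
    "subject": "John",
    "indirect object": "Mary",
    "object": "the pump",
    "for": "ten dollars",
}
event = build_event("SELLING", SELL_ROLES, parse)
print(event)
```

A passive-voice table would pair the slots with roles differently, which is one reason to keep the mapping with the verb rather than in general code.</Paragraph>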
    <Paragraph position="11">  Discourse knowledge is knowledge about how the domain and dialog contexts in which an utterance occurs contribute to and are influenced by the interpretation of the utterance. Although we have included it here under knowledge about language, discourse knowledge may be viewed as spanning knowledge about language and about the domain.</Paragraph>
    <Paragraph position="12"> 2.3.3.1 Focusing During a dialog, the participants focus their attention on only a small portion of what each of them knows or believes. Both what is said and how it is interpreted depend on a shared understanding of this narrowing of attention to a small highlighted portion of what is known.</Paragraph>
    <Paragraph position="13"> Focusing is an active process. As a dialog progresses, the participants continually shift their focus and thus form an evolving context within which utterances are produced and interpreted. A speaker provides a hearer with clues of what to look at and how to look at it -- what to focus on, how to focus on it, and how wide or narrow the focusing should be. We have developed a representation for discourse focusing (or global focusing), procedures for using it in identifying objects referred to by noun phrases, and procedures for detecting and representing shifts in focusing (Grosz, 1977a, 1977b, 1978, 1980).</Paragraph>
    <Paragraph position="14"> Focused objects are highlighted in the network model by placing them in separate &amp;quot;focus spaces&amp;quot;. Several focused objects may appear in one space. Focus spaces are arranged in a hierarchy that reflects the degree of focusing. The most prominent space is considered primary focus. As focusing shifts, the hierarchy is changed accordingly and new spaces may be created for the newly highlighted objects, while old ones may disappear.</Paragraph>
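    <Paragraph> A rough sketch of focus spaces follows. It flattens the hierarchy into a stack in which the last space is the primary focus, which is a simplification of the representation described above; the class FocusHierarchy and its methods are hypothetical names, and objects are plain strings.

```python
class FocusHierarchy:
    """Focus spaces ordered from least to most prominent."""

    def __init__(self):
        self.spaces = []                 # the last space is the primary focus

    def shift_to(self, objects):
        """Open a new space for newly highlighted objects."""
        self.spaces.append(set(objects))

    def close_primary(self):
        """Drop the primary focus space, returning to the one below it."""
        return self.spaces.pop()

    def resolve(self, candidates):
        """Return the first candidate found, searching from primary focus outward."""
        for space in reversed(self.spaces):
            found = [candidate for candidate in candidates if candidate in space]
            if found:
                return found[0]
        return None


focus = FocusHierarchy()
focus.shift_to({"pump", "platform"})
focus.shift_to({"bolts", "wrench"})
print(focus.resolve(["pump", "bolts"]))
```

Searching the most prominent space first is what lets a noun phrase pick out a recently highlighted object in preference to one focused earlier.</Paragraph>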
    <Paragraph position="15"> In addition to global focusing, we have incorporated the concept of immediate focus (Sidner, 1979) through which one entity among those focused is singled out. This is a more localized focusing phenomenon that is closely related to the use and recognition of anaphora, as well as to changes in global focusing. The notion of focusing has been used elsewhere and is related to notions such as topic, comment, given, and new. Each of these reflects an attempt to</Paragraph>
  </Section>
  <Section position="8" start_page="0" end_page="0" type="metho">
    <SectionTitle>
</SectionTitle>
    <Paragraph position="0"> identify the roles of certain sentential elements within a discourse. See Sidner (1979) for a discussion of the relationship between focus and these other concepts.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.3.3.2 Common-background and Communicated Knowledge
</SectionTitle>
      <Paragraph position="0"> In our framework, the dialog participants are assumed to share knowledge about processes in the task model (see footnote 9) and the history of the task performed to date, along with knowledge about direct and potential goals and focused entities. We view this shared knowledge as composed of at least two parts: (1) common-background knowledge -- knowledge about the world that is assumed to be shared by the participants independently of the dialog, based on their common background and experience, such as the processes in the task model and the history of its performance; (2) communicated knowledge -- knowledge about the goals and focusing, which is assumed to be shared as a result of the dialog. The steps of the task that are explicitly mentioned are communicated knowledge, as are other focused entities that have been mentioned. We will distinguish these two types of shared knowledge and their roles in the interpretation of utterances.</Paragraph>
      <Paragraph position="1"> We distinguish as communicated knowledge essentially what Clark and Marshall (1980) distinguish as the mutual knowledge that results from &amp;quot;linguistic co-presence.&amp;quot; Our use of the term common-background knowledge covers the mutual knowledge they describe as resulting from &amp;quot;cultural co-presence&amp;quot; and a limited form of &amp;quot;physical co-presence&amp;quot;.</Paragraph>
      <Paragraph position="2"> To help clarify our distinction between common-background and communicated knowledge, consider a dialog about assembling a pump. The dialog participants share knowledge about actions used in assembly (inserting objects, tightening bolts), about parts (nuts, bolts, washers), about tools, and about terminology for talking about them. All this is common at the beginning of the dialog. During the dialog additional knowledge is communicated. Consider the following exchange between an expert (E) and an apprentice (A): E: First, put the bolts in the holes.</Paragraph>
      <Paragraph position="3"> A: How many and what size? E: 4 bolts, each 3/4&amp;quot;.</Paragraph>
      <Paragraph position="4"> A: OK.</Paragraph>
      <Paragraph position="5"> A: They're in.</Paragraph>
      <Paragraph position="6"> Common-background knowledge here includes knowing about aligning holes and inserting bolts. Following 9 Note that the apprentice knows neither all the steps in the task nor their ordering -- otherwise there would be no need for the expert. However, the apprentice does know how to perform most of the basic actions, such as bolting and tightening.</Paragraph>
      <Paragraph position="7"> the expert's first utterance it has become communicated knowledge that the first step is to put the bolts in the holes and that doing so is a potential goal of the apprentice. The expert's second utterance communicates the fact that 4 bolts should be used. The apprentice's response then adds to communicated knowledge the fact that the action has taken place.</Paragraph>
      <Paragraph position="8"> The fact that the holes were aligned and the proper bolts found can be assumed by the expert, drawing on knowledge of the task. Since these actions were not mentioned, they are part of common-background knowledge but not communicated.</Paragraph>
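The distinction drawn in the pump example can be sketched in code. This is a minimal illustration only: the class name, the method names, and the strings used as facts are assumptions made for this sketch and do not describe how TDUS stores knowledge.

```python
from dataclasses import dataclass, field


@dataclass
class DialogKnowledge:
    """Two stores of shared knowledge (all names here are illustrative)."""
    common_background: set = field(default_factory=set)
    communicated: set = field(default_factory=set)

    def communicate(self, fact):
        # A fact enters communicated knowledge only by being mentioned.
        self.communicated.add(fact)

    def can_use_anaphor(self, fact):
        # Pronouns and pro-verbs need a referent in communicated knowledge.
        return fact in self.communicated


k = DialogKnowledge(common_background={"align holes", "find bolts", "insert bolts"})
k.communicate("step: put bolts in holes")    # E: First, put the bolts in the holes.
k.communicate("use 4 bolts, each 3/4 inch")  # E: 4 bolts, each 3/4 inch.
k.communicate("done: put bolts in holes")    # A: They're in.

print(k.can_use_anaphor("step: put bolts in holes"))  # True
print(k.can_use_anaphor("align holes"))               # False
```

Aligning the holes stays in the common-background store because nobody mentioned it, which is why the sketch refuses an anaphoric reference to it.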
      <Paragraph position="9"> Assumptions about things that are communicated knowledge play a critical role in the interpretation and production of utterances (Clark and Marshall, 1980), as the use of anaphora illustrates. Pronouns and pro-verbs (when used felicitously) always refer to concepts in communicated knowledge, so that any utterance containing a pronoun or pro-verb must draw upon communicated knowledge. In the example above, if the apprentice's second utterance had been &amp;quot;I'm putting them in now&amp;quot; followed by &amp;quot;I've done it&amp;quot;, the &amp;quot;it&amp;quot; could have referred only to the insertion step, which has been communicated, not to any substep which has not been.</Paragraph>
      <Paragraph position="10"> A similar observation about the use of anaphora has been made by Hankamer and Sag (1976). They differentiate the linguistic and nonlinguistic components of communicated knowledge, using the term &amp;quot;pragmatic environment&amp;quot; to refer to the nonlinguistic environment -- which is limited in our situation since there is no shared visual information. Hankamer and Sag state that &amp;quot;the conditions on insertion (and interpretation) are that the speaker presumes the content of the anaphor to be recoverable, either from linguistic context (in which case the anaphor has an 'antecedent' in linguistic structure, a fully specified linguistic form with the same semantic content) or from the pragmatic environment.&amp;quot; (Pg. 422). The algorithms we have developed for interpreting verbs draw on these observations and distinguish between utterances containing and not containing anaphora, relying more heavily on communicated knowledge when anaphora is present.</Paragraph>
      <Paragraph position="11"> Entities that form part of communicated knowledge can be referred to anaphorically, but they are not always, as is demonstrated by the use of definite noun phrases to refer to focused objects. In the foregoing example, the bolts are focused and are thus part of communicated knowledge after the expert's first utterance -- but when the expert refers to them the second time, a noun phrase is used instead of a pronoun. The degree of focusing, which influences the choice of anaphora or a definite noun phrase to refer to some entity in communicated knowledge, has been discussed elsewhere (Sidner, 1979; Grosz, 1977b; Reichman, 1978).</Paragraph>
      <Paragraph position="12"> American Journal of Computational Linguistics, Volume 7, Number 1, January-March 1981 7 Ann E. Robinson Determining Verb Phrase Referents in Dialogs When referring to something not assumed to be communicated knowledge, a speaker not only cannot use anaphora, but must draw on other shared knowledge and supply enough information to enable the hearer to interpret the reference correctly. In our example, if the apprentice had asked where to find the bolts, the expert could have said &amp;quot;in the cabinet&amp;quot;, assuming the apprentice was generally familiar with the surroundings and knew where the cabinet was.</Paragraph>
      <Paragraph position="13"> The expert could not have said &amp;quot;in it&amp;quot; unless the cabinet had already been mentioned and comprised a highly focused part of communicated knowledge.</Paragraph>
      <Paragraph position="14"> 3. Determining Verb Phrase Referents  In this section we address issues that arise in applying domain and linguistic knowledge to interpret verb phrases and to infer the current situation on the basis of the interpretation. Many of the examples in this section are taken from the sample dialog in Section 4.</Paragraph>
      <Paragraph position="15"> The possible referents of a verb phrase are constrained by both the context and the utterance itself. Coordination of the constraints is necessary for interpreting verbs. Contextual constraints are derived from two sources: the dialog and the subject area, particularly the task being performed. Utterance constraints are derived from the syntax and semantics, particularly tense and aspect information and the type of action denoted by the verb.</Paragraph>
      <Paragraph position="16"> The search for the referent of a verb phrase can be conducted either top-down or bottom-up. The top-down search uses contextual constraints to find the place in the task that the utterance fits and it uses utterance constraints to limit alternatives. The bottom-up mode uses information from the utterance, such as verb type, to find its relationship to the task. If the top-down search is successful, the action and its place in the task are identified simultaneously.</Paragraph>
      <Paragraph position="17"> For the assembly dialogs in which all the utterances are directly related to the task and in which the system has already encoded all the relevant steps to be performed, top-down constraints are strong enough to allow a top-down search to be conducted first -- and only if that fails is a bottom-up search conducted. In dialogs where less structure is provided by the task, a bottom-up search will clearly play a more central role.</Paragraph>
      <Paragraph position="18"> This search can be improved by more extensive reasoning based on the verb in the utterance.</Paragraph>
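The control strategy just described, top-down first and bottom-up only on failure, can be sketched as follows. The task steps, the verb generalization table, and both search functions are hypothetical stand-ins; in particular, the real top-down search uses dialog context and not the simple membership test shown here.

```python
# Hypothetical task model: the steps the system has already encoded.
TASK_STEPS = {"attach pump", "install pulley", "secure pump"}
# Hypothetical verb generalizations used only by the bottom-up search.
GENERALIZES_TO = {"bolt": "secure"}


def top_down(utterance_action):
    """Succeeds when the action fits the task context directly."""
    return utterance_action if utterance_action in TASK_STEPS else None


def bottom_up(utterance_action):
    """Starts from the verb and looks for a related step in the task."""
    verb, obj = utterance_action.split(maxsplit=1)
    general = GENERALIZES_TO.get(verb)
    if general is None:
        return None
    candidate = f"{general} {obj}"
    return candidate if candidate in TASK_STEPS else None


def find_referent(utterance_action):
    """Top-down first; bottom-up only if the top-down search fails."""
    for name, search in (("top-down", top_down), ("bottom-up", bottom_up)):
        referent = search(utterance_action)
        if referent is not None:
            return name, referent
    return "unresolved", None


print(find_referent("attach pump"))  # ('top-down', 'attach pump')
print(find_referent("bolt pump"))    # ('bottom-up', 'secure pump')
print(find_referent("paint table"))  # ('unresolved', None)
```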
      <Paragraph position="19"> One of the limitations of our previous natural-language systems has been a lack of coordination of the strategies for identifying referents of noun phrases and pronouns with one another or with the interpretation of the verb. In fact, except for the pronoun resolution procedure that used a very simple goal recognition algorithm (Sidner, 1979), the verb phrase was not even taken into account. However, since the interpretation of each of these utterance elements cannot be carried out in isolation, the procedures for identifying noun phrase and pronoun referents described in Grosz (1977a) and Sidner (1979) have been modified to coordinate the search for noun phrase and anaphoric referents with the search for the verb phrase referent.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 The Top-down Algorithm
</SectionTitle>
      <Paragraph position="0"> Different types of utterances can draw upon different contextual constraints. Three major factors are considered by the interpretation algorithm in determining which contextual constraints to draw upon. The factors are: (1) whether or not a pronoun is present in the utterance; (2) whether or not all the noun phrases in the utterance refer to focused entities; and (3) whether or not the main verb is &amp;quot;do&amp;quot;. For the first factor, the presence of a pronoun indicates that communicated knowledge, particularly goals and immediate focus, is being drawn upon. If no pronoun is present, these factors may still be relevant but other factors weigh more heavily in determining constraints.</Paragraph>
      <Paragraph position="1"> For the second factor, when all the definite noun phrases refer to focused entities, focusing information is also key in interpreting the verb. If not all the referents are focused, knowledge about the task and its structure must be used. For the third factor, when &amp;quot;do&amp;quot; appears as the main verb, communicated knowledge plays a more central role than when other verbs are used. The particular usage of &amp;quot;do&amp;quot;, as signalled by the other constituents, indicates which aspects of communicated knowledge are most important.</Paragraph>
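One way to picture how the three factors select contextual constraints is the small function below. It is a sketch under stated assumptions: the function name, its parameters, and the labels it returns are invented here, and the paper weighs these sources against one another where the sketch merely lists them.

```python
def constraint_sources(has_pronoun, all_nps_focused, main_verb):
    """Return the kinds of contextual knowledge the factors point to."""
    sources = []
    if has_pronoun:
        # Factor 1: a pronoun draws on goals and immediate focus.
        sources += ["goals", "immediate focus"]
    if main_verb == "do":
        # Factor 3: "do" makes communicated knowledge more central.
        sources.append("communicated knowledge")
    if all_nps_focused:
        # Factor 2: every definite noun phrase refers to a focused entity.
        sources.append("focusing")
    else:
        sources.append("task structure")
    return sources


print(constraint_sources(True, True, "do"))
print(constraint_sources(False, False, "attach"))
```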
      <Paragraph position="2"> We will discuss the interpretation algorithm by examining the interpretation of utterances resulting from various combinations of these factors. The utterances we will discuss are those containing the verb &amp;quot;do&amp;quot;, those containing verbs other than &amp;quot;do&amp;quot; and pronouns, and those containing verbs other than &amp;quot;do&amp;quot; and definite noun phrases.</Paragraph>
      <Paragraph position="3"> Within the first type of utterances, those containing &amp;quot;do&amp;quot;, we further distinguish utterances like &amp;quot;I've done it&amp;quot; from utterances like &amp;quot;I've done the screws.&amp;quot; In the former, &amp;quot;do&amp;quot; refers to the general action of performing an action and &amp;quot;it&amp;quot; refers to the action. In the latter, &amp;quot;do&amp;quot; refers to a particular action, such as &amp;quot;remove&amp;quot;. Our discussion will first cover these two types of utterances containing &amp;quot;do&amp;quot;, then utterances with other verbs and pronouns, then utterances with other verbs and definite noun phrases.</Paragraph>
      <Paragraph position="4">  In interpreting verb phrases such as &amp;quot;do it&amp;quot;, knowledge about the context is used first to determine possible referents. If &amp;quot;it&amp;quot; has been used felicitously, it must refer to an action in communicated knowledge.</Paragraph>
      <Paragraph position="5"> As we have discussed, communicated knowledge in TDUS is represented by goals and focusing. Goals are</Paragraph>
    </Section>
  </Section>
  <Section position="9" start_page="0" end_page="0" type="metho">
    <SectionTitle>
8 American Journal of Computational Linguistics, Volume 7, Number 1, January-March 1981
</SectionTitle>
    <Paragraph position="0"> a subset of all focused entities and, by definition, those actions that could possibly be performed by the apprentice. Consequently, possible referents are contained in the subset of communicated knowledge represented by the most current direct goals and by the potential goal.</Paragraph>
    <Paragraph position="1"> The main utterance constraints are derived from the tense and aspect, which limit the goals whose associated actions could be referents. The three cases we distinguish are past tense, present tense and progressive aspect, and future tense.</Paragraph>
    <Paragraph position="2"> Past-tense utterances can refer to either direct or potential goals. For such utterances, the algorithm examines the most recent direct goal first. If it is associated with a task-related action (i.e., not a knowledge-state goal), the action is taken to be the referent of &amp;quot;it&amp;quot; because it is the action known to be in progress. Utterance 10 from the sample dialog illustrates such a reference to a task goal.</Paragraph>
    <Paragraph position="3"> A: I'm doing the brace now. (9) E: OK A: I've done it. (10) Here &amp;quot;it&amp;quot; refers to the action of installing the brace, the action associated with the current goal.</Paragraph>
    <Paragraph position="4"> Because of current implementation restrictions, the most recent direct goal is not considered as a referent if it is a knowledge-state goal. Instead, the action associated with the potential goal is taken to be the one referred to since it is always an action of the task.10 Clearly, if potential goals were extended to include knowledge-state goals, a more sophisticated test would be required.</Paragraph>
    <Paragraph position="5"> Utterances 12 through 15 from the sample dialog illustrate reference to a potential goal.</Paragraph>
    <Paragraph position="6"> A: What should I do now (12) E: Install the aftercooler elbow on the pump.</Paragraph>
    <Paragraph position="7">  The apprentice's utterance 12 establishes a direct knowledge-state goal of knowing what action to perform, while the expert's reply establishes a potential goal that the aftercooler elbow be installed. 10 This is a limitation that should be removed as linguistic and representational capabilities improve. An example of &amp;quot;it&amp;quot; referring to a knowledge-state goal would be &amp;quot;I wanted to learn Spanish and I've done it&amp;quot;, where the goal was a knowledge-state goal of 'KNOWING SPANISH'.</Paragraph>
    <Paragraph position="8"> Utterance 13 refers to the potential goal. Utterance 14 similarly establishes a direct knowledge-state goal of knowing about the action -- in this case, whether the action is installing the aftercooler; here the apprentice's utterance establishes the potential goal that the aftercooler be installed. Utterance 15 refers again to the potential goal.</Paragraph>
    <Paragraph position="9"> An utterance that is present-tense and progressive (e.g., &amp;quot;I'm doing it&amp;quot;) refers to an action that has been previously mentioned but only just started. As we have seen, a potential goal is associated with such an action, so that the latter is taken as the referent. For example, utterance 15 could have been &amp;quot;I'm doing it&amp;quot;, referring to the action of installing the aftercooler. For a question referring to a future or a hypothetical action (e.g., &amp;quot;What should I do now?&amp;quot;), no attempt is made to identify the action as part of the interpretation. Instead, the reasoning process makes use of the task model to identify the appropriate reply.</Paragraph>
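The tense and aspect cases for phrases like do it can be summarized in a short sketch. The Goal class, the tense labels, and the function name are assumptions introduced for illustration; only the decision order follows the description above.

```python
from dataclasses import dataclass


@dataclass
class Goal:
    action: str                   # action the goal is about
    knowledge_state: bool = False  # True for goals such as knowing what to do


def referent_of_do_it(tense, direct_goals, potential_goal):
    """direct_goals is a stack with the most recent goal last."""
    if tense == "future":
        # No referent is sought; the task model is used to build the reply.
        return None
    if tense == "past" and direct_goals and not direct_goals[-1].knowledge_state:
        # The most recent direct goal is a task action known to be in progress.
        return direct_goals[-1].action
    if tense in ("past", "present-progressive"):
        # Otherwise fall back on the potential goal, always a task action.
        return potential_goal.action if potential_goal else None
    raise ValueError(f"unhandled tense: {tense}")


brace = Goal("install brace")
know_next = Goal("know what to do next", knowledge_state=True)
elbow = Goal("install aftercooler elbow")

print(referent_of_do_it("past", [brace], brace))                    # install brace
print(referent_of_do_it("past", [know_next], elbow))                # install aftercooler elbow
print(referent_of_do_it("present-progressive", [know_next], elbow)) # install aftercooler elbow
print(referent_of_do_it("future", [know_next], elbow))              # None
```

The first call mirrors utterance 10 and the second mirrors the reference to a potential goal after utterance 12.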
    <Paragraph position="10">  For the use of &amp;quot;do&amp;quot; in which &amp;quot;do&amp;quot; refers to an action (e.g., &amp;quot;I'm doing the screws&amp;quot;), the hearer must be able to infer the action from the context. One case of this is when the action type is part of communicated knowledge but no specific action is being referred to. For example in the sequence: I've attached the pump.</Paragraph>
    <Paragraph position="11"> I'm doing the pulley now.</Paragraph>
    <Paragraph position="12"> the first utterance adds the attaching action for the pump to communicated knowledge. In the second utterance, &amp;quot;do&amp;quot; refers to another attaching action, but this one is attaching the pulley, a separate action.</Paragraph>
    <Paragraph position="13"> &amp;quot;Do&amp;quot; is not referring to the same specific action, but rather to the same type of action, &amp;quot;attaching&amp;quot;. There are other occurrences of &amp;quot;do&amp;quot; in which the action is implicit from the context and the action type has not been mentioned. The algorithm currently only handles the situation in which the action type has been mentioned.</Paragraph>
    <Paragraph position="14"> To interpret these utterances, the contextual knowledge used is communicated knowledge and knowledge about the task. The communicated knowledge used is focusing information, because an action of the same type as the one referred to should be focused.11 The interpretation algorithm searches among focused actions to find one that is of a type capable of having the newly mentioned participating objects. For example, the algorithm might find &amp;quot;attach pump&amp;quot; as a focused action, determine that it is an &amp;quot;attach&amp;quot; and then that a pulley can also participate in an &amp;quot;attach&amp;quot; action. If an action is found, task knowledge is used to determine if an action of that type with the participants indicated is an appropriate action in the current situation. Thus, if attach + pulley is an appropriate action, &amp;quot;attach pulley&amp;quot; is taken as the referent of &amp;quot;do&amp;quot;. Tense and aspect information from the utterance help determine which actions in the task model are appropriate. As we noted, a present-progressive utterance indicates initiation of a new step, whereas the past tense could be used either with a new step or with one in progress. 11 Goal information could be used by examining the types of the actions associated with domain goals. However, access to the action type is more direct through focusing information.</Paragraph>
    <Paragraph position="15"> Utterances 8 and 9 of the sample dialog illustrate a related situation.</Paragraph>
    <Paragraph position="16"> A: Should I install the pulley now (8) E: No. The next step is: install the aftercooler elbow on the pump, or install the brace on the pump.</Paragraph>
    <Paragraph position="17"> A: I'm doing the brace now (9) Here two steps have been mentioned and are essentially equally focused and both potential goals, so &amp;quot;do it&amp;quot; could not refer unambiguously to one of the actions.</Paragraph>
    <Paragraph position="18"> However, both actions are &amp;quot;install&amp;quot; actions, so &amp;quot;do&amp;quot; can refer to an &amp;quot;install&amp;quot; type action. The interpretation algorithm outlined above works for this case as well.</Paragraph>
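The search for the action type behind a use of do with an object can be sketched as below. The table of which objects each action type accepts, and the set of actions appropriate in the current situation, are hypothetical data invented for the example; the tense and aspect filtering mentioned above is left out.

```python
# Hypothetical domain knowledge: objects each action type can take.
ACTION_TYPE_ACCEPTS = {
    "attach": {"pump", "pulley"},
    "install": {"pump", "pulley", "aftercooler elbow", "brace"},
}


def referent_of_do_object(obj, focused_actions, appropriate_now):
    """focused_actions: (type, object) pairs, most highly focused first."""
    for action_type, _mentioned_object in focused_actions:
        # Step 1: can this focused action type take the new object?
        if obj not in ACTION_TYPE_ACCEPTS.get(action_type, set()):
            continue
        # Step 2: is that action appropriate in the current task situation?
        if (action_type, obj) in appropriate_now:
            return action_type, obj
    return None


# "I've attached the pump." followed by "I'm doing the pulley now."
print(referent_of_do_object(
    "pulley", [("attach", "pump")], {("attach", "pulley")}))

# Utterances 8 and 9: two equally focused install steps, then "the brace".
steps = [("install", "aftercooler elbow"), ("install", "brace")]
print(referent_of_do_object("brace", steps, set(steps)))
```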
    <Paragraph position="19">  For utterances containing verbs other than &amp;quot;do&amp;quot; and pronouns, contextual constraints also stem from communicated knowledge, since the object or objects referred to by the pronoun must be communicated knowledge -- in our case, mentioned in the dialog.</Paragraph>
    <Paragraph position="20"> The way the referent of the pronoun was introduced into the dialog affects the interpretation of utterances with pronouns. The distinction we make is whether the object was mentioned as a participant in an action that is part of the task (e.g., &amp;quot;I attached the pump.&amp;quot;) or was not mentioned as a participant in an action (e.g., &amp;quot;Where is the pump?&amp;quot;). In the first case, if the object has been mentioned as participating in an action, the action will be recognized as a direct or potential goal and all its participating objects will be focused. In the second case, if no action has been mentioned but the object is a participant in some task action, the action will be inferred through the potential-goal recognition mechanism and will become a potential goal. However, in this case only the object mentioned will be focused and not the other participants in the action. An example of the second case is: Where are the bolts? [Immediate focus = bolts] [Potential goal = THE BOLTS ARE BOLTED] I've tightened them with the wrench.</Paragraph>
    <Paragraph position="21"> [with the wrench not in focus] In this situation, the first reference to the bolts has established the potential goal that the bolts be bolted. In both these situations the object mentioned is focused and, when appropriate, an action it participates in is established as a goal. The difference between the two is whether the actions and the other participating objects are also focused. This difference affects the interpretation of successive utterances containing pronouns.</Paragraph>
    <Paragraph position="22"> Three cases are distinguished in the algorithm: (1) If there is a pronoun and there are no definite noun phrases, the actions associated with the most recent direct goal and the potential goals are considered as possible referents of the verb, since either of the two cases described above could obtain. (2) If there are definite noun phrases, all of which refer to focused entities, then the actions associated with the most recent direct goal and the potential goal are the most likely referents. Since all the objects are focused, the action was presumably mentioned as in the first case described above. (3) If there is a pronoun and there are also definite noun phrases, but not all the definite noun phrases refer to focused entities, then only an action associated with a potential goal is a possible referent. Since a direct goal associated with this object could not have been established, only the second case described above could obtain.</Paragraph>
    <Paragraph position="23"> In all three cases, utterance information about tense and aspect and about action type (from the verb) is used either to verify that the action associated with the goal is a possible referent or to choose a matching action type among possible referents.</Paragraph>
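The three cases can be written down as a small classifier. This is only a sketch: the function name, the string labels for goals, and the representation of focus as a set of object names are assumptions, and the later check against tense, aspect, and action type is omitted.

```python
def candidate_goals(has_pronoun, definite_nps, focused):
    """Return (case number, goals whose actions may be the verb referent)."""
    all_focused = all(np in focused for np in definite_nps)
    if has_pronoun and not definite_nps:
        return 1, ["most recent direct goal", "potential goal"]
    if definite_nps and all_focused:
        return 2, ["most recent direct goal", "potential goal"]
    if has_pronoun and definite_nps:
        # Some noun phrase is unfocused, so no direct goal was established.
        return 3, ["potential goal"]
    return None, []  # no anaphora: handled by the procedure for that case


# "Where are the bolts?" then "I've tightened them with the wrench."
print(candidate_goals(True, ["wrench"], {"bolts"}))  # (3, ['potential goal'])
print(candidate_goals(True, [], {"bolts"}))
```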
    <Paragraph position="24">  When there is no anaphora in the utterance, the contextual knowledge used for interpretation comes from focusing and the task model. Focusing is used to determine the relationship between the utterance and focused entities, including the current action. The task model, including the record of task progress, is used to determine which actions can reasonably be talked about in the current context. First, focusing information is used to determine if the referents of any definite noun phrases associated with the verb are  currently focused.</Paragraph>
    <Paragraph position="25"> 3.1.4.1 All Noun Phrases in Current Focus  The presence of all noun phrase referents in focus indicates that the action involves objects currently being discussed by discourse participants and that the action is related to the current step (because it involves the same objects). The task model provides information about actions the apprentice can perform and has performed. Tense and aspect information from the utterance and the verb type restrict alternatives within the task model.</Paragraph>
    <Paragraph position="26"> Since present and progressive utterances generally refer to newly started actions, the actions considered in the task model are those that are closely related to the most recent action performed and that involve objects referred to in the utterance. Possible actions might be: a substep of the last step started but not completed; the potential goal; or a step not involving any different objects that is closely linked in the plan to the last step started or completed (i.e., a step that is a substep of or successor to the last step, or succeeds a parent of the last step).</Paragraph>
    <Paragraph position="27"> Utterance 1 in the sample dialog (&amp;quot;I am attaching the pump&amp;quot;) illustrates a present-progressive utterance with a noun phrase referring to a focused object. In this instance, the pump-attaching step is a substep of the last step started -- installing the pump.</Paragraph>
    <Paragraph position="28"> For utterances that are past tense and/or perfective aspect, actions in the task model known to have been in progress and those that could be next steps are possible referents. The alternatives considered during interpretation are: a step in progress; the potential goal; a substep of the last step started; a substep of any step in progress; and a step closely linked to the last step started or completed. Utterance 7 (&amp;quot;I attached the pump&amp;quot;) shows a reference to a completed action that was a step in progress -- attaching the pump. The verb in utterance 11 (&amp;quot;I've installed the pulley&amp;quot;) refers to a completed action which was the next step to perform, but was not explicitly mentioned as having been started -- installing the pulley.</Paragraph>
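The alternatives listed for the two tense cases can be enumerated over a toy plan. The plan below is invented for illustration and is not the task model of Figure 1; the function names are likewise assumptions, and the restriction to steps involving the objects named in the utterance is not modeled.

```python
# Hypothetical plan: each step has a parent, ordered substeps, a successor.
PLAN = {
    "install pump":   {"parent": None, "substeps": ["attach pump", "bolt pump"],
                       "successor": "install pulley"},
    "attach pump":    {"parent": "install pump", "substeps": [], "successor": "bolt pump"},
    "bolt pump":      {"parent": "install pump", "substeps": [], "successor": None},
    "install pulley": {"parent": None, "substeps": [], "successor": None},
}


def linked_steps(step):
    """Substeps of the step, its successor, and its parent's successor."""
    node = PLAN[step]
    linked = list(node["substeps"])
    if node["successor"]:
        linked.append(node["successor"])
    if node["parent"] and PLAN[node["parent"]]["successor"]:
        linked.append(PLAN[node["parent"]]["successor"])
    return linked


def candidate_steps(tense, last_step, in_progress, potential_goal=None):
    found = []
    if tense == "past":
        found += in_progress                    # a step in progress
        if potential_goal:
            found.append(potential_goal)
        found += PLAN[last_step]["substeps"]    # substep of the last step started
        for step in in_progress:                # substep of any step in progress
            found += PLAN[step]["substeps"]
    else:  # present tense or progressive aspect
        found += PLAN[last_step]["substeps"]
        if potential_goal:
            found.append(potential_goal)
    found += linked_steps(last_step)            # closely linked steps
    return list(dict.fromkeys(found))           # drop duplicates, keep order


# Utterance 1, "I am attaching the pump": last step started is install pump.
print(candidate_steps("present-progressive", "install pump", ["install pump"]))
# ['attach pump', 'bolt pump', 'install pulley']
```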
    <Paragraph position="29"> 3.1.4.2 Not all Noun Phrases in Current Focus If the referents of the noun phrases are not currently focused, the focusing hierarchy is searched because the hierarchy indicates previously focused objects that might become focused again. If the noun phrase referents are identified somewhere in the focusing hierarchy, the action named in the utterance is matched against any action occurring at that place in the hierarchy.</Paragraph>
    <Paragraph position="30"> If the utterance contains noun phrases referring to objects participating in the action and those objects cannot be identified among focused entities, the actions associated with direct goals are eliminated as possible referents of the verb. This happens because all actions associated with direct goals have been mentioned, which has caused all their participants to be focused.</Paragraph>
    <Paragraph position="31"> Possible referents of such verb phrases include: the action associated with the potential goal; a substep of the current step in progress; a substep of all the steps in progress (if the utterance is past and/or perfective); or any action which can achieve some current goal (e.g., knowing a location -&gt; found the object).</Paragraph>
    <Paragraph position="32"> Since the objects described in the noun phrases and the action both have to be tested when examining the substeps, the algorithm first checks the objects described by the noun phrases to see if they are participants in any of the substeps and if so, it then examines the actions to ascertain whether one of them matches the input action.</Paragraph>
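The ordering of the two tests, participants first and action second, can be sketched as follows. The substep records and the function name are hypothetical; the point is only that substeps are filtered by their participating objects before any action is compared with the verb.

```python
def match_substep(substeps, noun_phrase_objects, verb):
    """Filter substeps by participants, then look for a matching action."""
    wanted = set(noun_phrase_objects)
    with_objects = [s for s in substeps if wanted.issubset(s["participants"])]
    for substep in with_objects:
        if substep["action"] == verb:
            return substep
    return None


# Hypothetical substeps of a step in progress.
SUBSTEPS = [
    {"action": "bolt", "participants": {"pump", "platform", "bolts"}},
    {"action": "tighten", "participants": {"bolts", "wrench"}},
]

print(match_substep(SUBSTEPS, ["bolts", "wrench"], "tighten"))
print(match_substep(SUBSTEPS, ["pulley"], "tighten"))  # None
```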
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Bottom-Up Search
</SectionTitle>
      <Paragraph position="0"> Currently the bottom-up algorithm consists of a search for the most specific occurrence of an event in the model whose participants are compatible with those in the utterance. This strategy is being expanded to include a search for a more general event that can then be found in the task. This can be either the most specific event type that is compatible with all the elements in the utterance, or a more general or 'similar' event type that is compatible and can be found in the task. An example of the first is an utterance containing &amp;quot;tighten the bolt&amp;quot;. The verb &amp;quot;tighten&amp;quot; refers to a general tightening action, that can have more specific uses -- such as tighten screws, tighten bolts, etc. From the knowledge that one kind of tightening is bolt tightening and from the occurrence of &amp;quot;bolts&amp;quot; in the utterance, it can be inferred that the &amp;quot;tighten bolts&amp;quot; action is intended. In the second case, a more specific verb might have been used (e.g., bolt the pump) to mean securing the bolts.</Paragraph>
      <Paragraph position="1"> The verb &amp;quot;bolt&amp;quot; might be initially interpreted as referring to a specific action of tightening bolts. However, the task model may not have &amp;quot;tighten bolts&amp;quot; encoded as an explicit step. Instead, perhaps it is implicit in some more general securing step. From the bolting action and knowledge of the more general actions of which it is a subset (e.g., securing), its relation to the task model can be found.</Paragraph>
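Both bottom-up cases, specializing a general verb by its object and generalizing a specific verb until the task model contains it, can be sketched with two small tables. The tables and the function name are assumptions made for this example and do not reproduce the event hierarchy used by the system.

```python
# Hypothetical event-type knowledge.
SPECIALIZES = {("tighten", "bolts"): "tighten bolts",
               ("tighten", "screws"): "tighten screws"}
MORE_GENERAL = {"bolt": "tighten bolts",
                "tighten bolts": "secure",
                "tighten screws": "secure"}


def bottom_up(verb, obj, task_event_types):
    """Find the most specific compatible event type present in the task."""
    # Case 1: use the object to pick the most specific event type.
    event = SPECIALIZES.get((verb, obj), verb)
    # Case 2: climb to more general types until one occurs in the task.
    while event is not None:
        if event in task_event_types:
            return event
        event = MORE_GENERAL.get(event)
    return None


print(bottom_up("tighten", "bolts", {"tighten bolts"}))  # tighten bolts
print(bottom_up("bolt", "pump", {"secure"}))             # secure
print(bottom_up("paint", "pump", {"secure"}))            # None
```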
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.3 Setting Limits to a Search
</SectionTitle>
      <Paragraph position="0"> Knowing when to stop searching for a referent of a verb phrase is another important part of interpreting it. In general, the extent to which a verb phrase reference is interpreted depends on the type of utterance.</Paragraph>
      <Paragraph position="1"> For example, a verb phrase may refer to an action that does not fit into the current task context, such as one that could not or should not be performed at that time. If the verb phrase is contained in a question (e.g., &amp;quot;Should I cut the end off now?&amp;quot;), a reasonable assumption may be as follows: if the action cannot be identified it is not the appropriate one to take, as illustrated in Utterance 8. On the other hand, if the verb phrase is contained in a statement (e.g., &amp;quot;I have cut off the end.&amp;quot;), identifying the specific action performed is more important, since a model of the current situation could not otherwise be maintained. Thus, any process for identifying a verb phrase referent should be able to determine the amount of resources it should expend in each situation.</Paragraph>
      <Paragraph position="2"> Another factor to be considered is the extent to which the speaker can be assumed to be cooperative, and, consequently, his or her utterances to be relevant.</Paragraph>
      <Paragraph position="3"> If some fairly direct connection between the utterance, the task, and/or dialog context can be postulated, devoting more effort to the search for a connection is more reasonable than in a less task-oriented dialog, in which such a connection may not even exist. In the TDUS system it is assumed that the user is cooperative and that all of his or her utterances are relevant.</Paragraph>
      <Paragraph position="4"> Thus, considerable effort is expended when necessary to relate a statement about an executed step to the task of which it is a part.</Paragraph>
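The stopping rule suggested here depends on the utterance type, and one possible reading is sketched below. The function name, the labels for utterance types, and the reply string are assumptions; the extended search is passed in as a callable because the text does not specify it further.

```python
def handle_unidentified(utterance_type, extended_search):
    """Decide what to do when the first search finds no referent."""
    if utterance_type == "question":
        # e.g. "Should I cut the end off now?": an action that cannot be
        # identified is taken not to be the appropriate one.
        return "reply: that is not the appropriate action now"
    if utterance_type == "statement":
        # e.g. "I have cut off the end.": the situation model depends on
        # the referent, so more effort is spent on finding it.
        return extended_search()
    raise ValueError(f"unknown utterance type: {utterance_type}")


print(handle_unidentified("question", lambda: None))
print(handle_unidentified("statement", lambda: "secure pump"))  # secure pump
```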
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.4 Effect of Automatic Planning
</SectionTitle>
      <Paragraph position="0"> The strategy described here has been developed in a system in which the plan for accomplishing the task has already been determined. The incorporation of an automatic planning facility should not require substantial modification. With automatic planning, the search forward to next possible steps would generally require planning &amp;quot;next steps&amp;quot; to see if the action in the utterance would fit, and bottom-up searching could include plan recognition to see how the action might be part of a plan.</Paragraph>
    </Section>
  </Section>
  <Section position="10" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4. Sample Dialog
</SectionTitle>
    <Paragraph position="0"> This section presents a sample dialog in which the TDUS system was one of the participants. This dialog illustrates some utterances that can be interpreted and responded to, the goals that are inferred, and the inferences that are drawn about the task. The apprentice's utterances are preceded by the symbol &amp;quot;#&amp;quot; and numbered for purposes of discussion. The rest of the dialog was generated by the system acting as an expert.</Paragraph>
    <Paragraph position="1"> In the initial context for this dialog, the next step to be performed is to install the pump. The first step in installing the pump is the pump-attaching step illustrated in Figure 1. At the outset, the table (T1), the pump (PU), the apprentice (you) and the compressor (COMP) are in &amp;quot;primary focus&amp;quot;.</Paragraph>
    <Paragraph position="2">  Knowing the wrench to use.</Paragraph>
    <Paragraph position="3"> In utterance 1, the apprentice indicates the start of the pump-attaching action. This is assumed to be the current goal-step, as the commentary indicates. The utterance also causes focusing to shift to the pump (PU) and the platform (PL) with the pump (PU) as the expected immediate focus. 12 The hierarchy of focused entities (Grosz, 1977a, 1977b) is illustrated in the two levels of focusing shown here. &amp;quot;Primary focus&amp;quot; indicates the most highly focused entities, &amp;quot;then&amp;quot; indicates the next level of the hierarchy containing the other objects T1, PU, You, and COMP. Because the pump is explicitly mentioned in utterance 1, it appears  Utterance 2 is a question about a substep of the attaching action. The goal is interpreted as a knowledge-state goal -- knowing what wrench to use.</Paragraph>
    <Paragraph position="4"> This goal is added to the stack of direct goals as the most recent goal.</Paragraph>
  </Section>
  <Section position="11" start_page="0" end_page="0" type="metho">
    <SectionTitle>
#WHERE ARE THE BOLTS (3)
</SectionTitle>
    <Paragraph position="0"> Bolting the pump to the platform with the bolts.</Paragraph>
    <Paragraph position="1"> Utterance 3 is another question about a substep, in this case the location of the bolts used for bolting the pump. The direct goal is a knowledge-state goal, to know the location of the bolts; it is placed on the stack atop the goal from utterance 2. The potential goal, a domain goal, is that the bolts be bolted; this is the goal associated with the bolting substep in which the bolts are used. It replaces the previous potential goal.</Paragraph>
    <Paragraph position="2"> Utterance 4 shows satisfaction of the goal of knowing the location of the bolts, which is removed from the stack of direct goals.</Paragraph>
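    <Paragraph position="3"> The goal bookkeeping for utterances 2 through 4 can be pictured with a small sketch. It assumes a plain stack of direct goals and a single slot for the potential goal; GoalState, ask, and confirm are hypothetical names introduced here for illustration, and the goal descriptions are paraphrases rather than TDUS structures.

```python
# Hypothetical sketch: a stack of direct goals plus one potential goal.
class GoalState:
    def __init__(self):
        self.direct = []        # stack; the last item is the most recent goal
        self.potential = None   # assumption: one potential goal at a time

    def ask(self, knowledge_goal, potential_goal=None):
        # A question adds a knowledge-state goal as the most recent goal.
        self.direct.append(knowledge_goal)
        if potential_goal is not None:
            self.potential = potential_goal   # replaces the previous one

    def confirm(self):
        # Satisfaction of the most recent direct goal removes it from the stack.
        return self.direct.pop()


goals = GoalState()
goals.ask("know which wrench to use")                     # utterance 2
goals.ask("know the location of the bolts",
          potential_goal="the bolts are bolted")          # utterance 3
satisfied = goals.confirm()                               # utterance 4
```

Once utterance 4 has been processed, only the goal from utterance 2 remains on the stack, and the potential goal is still the domain goal introduced by utterance 3.</Paragraph>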
  </Section>
  <Section position="12" start_page="0" end_page="0" type="metho">
    <SectionTitle>
#WHERE IS THE WRENCH (5)
</SectionTitle>
    <Paragraph position="0"> The box-end wrench is on the table.</Paragraph>
    <Paragraph position="1"> Focus has shifted to:  Bolting the pump to the platform with the bolts.</Paragraph>
    <Paragraph position="2"> In utterance 5 the apprentice asks about the location of &amp;quot;the wrench&amp;quot;. This utterance illustrates how focusing information helps disambiguate noun phrase referents. There are several wrenches in the model, so the phrase &amp;quot;the wrench&amp;quot; might be considered ambiguous. However, in utterance 2 a particular wrench was focused by the expert's reply and has remained focused, so the phrase &amp;quot;the wrench&amp;quot; can be interpreted as referring to a unique wrench -- the particular box-end wrench previously mentioned and identified. The goal inferred from utterance 5 is &amp;quot;knowing the location of the wrench.&amp;quot; In both this utterance and utterance 2, TDUS has apparently satisfied the apprentice's knowledge-state goal by supplying the relevant information, but TDUS does not assume that the knowledge-state goal has been satisfied until the apprentice confirms it. This is a design decision that could be changed by assuming the reply satisfied the goal or by distinguishing the goal as one that has been potentially-satisfied. Different choices reflect different assumptions about the other participant. In one case, it is assumed that the coparticipant understands, whereas in the other case, such understanding is not assumed, but must be explicitly confirmed.</Paragraph>
    <Paragraph position="3"> In utterance 7 the apprentice explicitly indicates the completion of the attaching step, from which the system infers that the substeps shown in Figure 1 have been performed.</Paragraph>
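    <Paragraph position="4"> The focus-based reading of &amp;quot;the wrench&amp;quot; in utterance 5 can be illustrated with a short sketch. Everything here is an assumption made for illustration: the object identifiers, the second wrench in the model, and the function resolve_definite are hypothetical, and matching on the end of a description string stands in for whatever matching TDUS actually performs.

```python
# Hypothetical model fragment: several wrenches exist, but only one is focused.
MODEL = {
    "W1": "box-end wrench",
    "W2": "open-end wrench",   # illustrative extra wrench, not from the paper
    "B1": "bolt",
}

def resolve_definite(head_noun, focused, model=MODEL):
    # Candidates are all model objects whose description ends with the noun.
    candidates = [obj for obj, kind in model.items() if kind.endswith(head_noun)]
    in_focus = [obj for obj in candidates if obj in focused]
    if len(in_focus) == 1:
        return in_focus[0]      # focus picks out a unique referent
    if len(candidates) == 1:
        return candidates[0]    # unambiguous even without focus
    return None                 # still ambiguous; a real system might ask


referent = resolve_definite("wrench", focused=["W1", "PU", "PL"])
```

With the box-end wrench in focus, the phrase resolves to a single object; with nothing in focus, the same phrase remains ambiguous and the sketch returns None.</Paragraph>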
  </Section>
  <Section position="13" start_page="0" end_page="0" type="metho">
    <SectionTitle>
#I ATTACHED THE PUMP (7)
OK
</SectionTitle>
    <Paragraph position="0"> The following has been assumed: You put the pump at the platform.</Paragraph>
    <Paragraph position="1"> You bolted the pump to the platform with the bolts.</Paragraph>
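    <Paragraph position="2"> The inference shown above, from a completed step to its completed substeps, can be sketched as a walk over a task hierarchy. The table SUBSTEPS and the function assumed_completed are hypothetical; the step names paraphrase the system output in this dialog and do not reproduce the TDUS task model.

```python
# Hypothetical task hierarchy: each step maps to its ordered substeps.
SUBSTEPS = {
    "attach pump": ["put pump at platform", "bolt pump to platform"],
    "install brace": ["attach brace"],
    "attach brace": ["put brace at pump", "secure brace"],
}

def assumed_completed(step, substeps=SUBSTEPS):
    # Depth-first walk: reporting a step complete implies that every
    # substep, at any depth, is complete as well.
    done = []
    for sub in substeps.get(step, []):
        done.append(sub)
        done.extend(assumed_completed(sub, substeps))
    return done


pump_inferences = assumed_completed("attach pump")      # utterance 7
brace_inferences = assumed_completed("install brace")
```

The walk returns substeps in the order a reader would expect them to be reported, with nested substeps listed directly after their parent.</Paragraph>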
  </Section>
  <Section position="14" start_page="0" end_page="0" type="metho">
    <SectionTitle>
#I'M DOING THE BRACE NOW (9)
OK
</SectionTitle>
    <Paragraph position="0"> The following has been assumed: DO is assumed to mean: You are installing the brace.</Paragraph>
    <Paragraph position="1"> Focus has shifted to:</Paragraph>
  </Section>
  <Section position="15" start_page="0" end_page="0" type="metho">
    <SectionTitle>
OK
</SectionTitle>
    <Paragraph position="0"> The following has been assumed: You attached the installed brace to the pump.</Paragraph>
    <Paragraph position="1"> You put the installed brace at the pump. You secured the installed brace.</Paragraph>
    <Paragraph position="2"> DO is assumed to mean: You installed the brace.</Paragraph>
    <Paragraph position="3"> Immediate focus: BR - a brace.</Paragraph>
    <Paragraph position="4"> Completed goal-step was: Installing the brace on the pump. The question in utterance 8 is about a possible next step -- in this case the wrong one. The algorithm for identifying the action indicated by the verb phrase recognizes that the step mentioned is not a possible next step; the apprentice is told this and, in addition, is told the possible next steps.</Paragraph>
    <Paragraph position="5"> Utterance 9 shows the verb &amp;quot;do&amp;quot; used in this case to mean &amp;quot;install&amp;quot; -- referring back to the expert's reply, in which &amp;quot;install&amp;quot; was used. In utterance 10 &amp;quot;do&amp;quot; is used differently -- as &amp;quot;perform&amp;quot; with &amp;quot;it&amp;quot; referring to the brace-installing action, which is inferred to have been completed. TDUS also infers completion of the substeps of installing the brace, putting it on the pump, and securing it. Utterances 11 through 15 illustrate other references to steps started and/or completed and the corresponding inferences, focusing shifts, and goal changes.</Paragraph>
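    <Paragraph position="6"> The handling of utterances 8 and 9 suggests a check of the described action against the steps that may come next. The sketch below is one hedged way to picture that check; it is not the algorithm used in TDUS. The function interpret, the pair representation of steps, and the rejected example step are all hypothetical.

```python
# Hypothetical sketch of resolving a verb phrase against possible next steps.
VAGUE_VERBS = {"do", "have", "use"}   # verbs the paper cites as needing resolution

def interpret(verb, obj, next_steps):
    # next_steps is a list of (verb, object) pairs the task model allows next.
    if verb in VAGUE_VERBS:
        matches = [step for step in next_steps if step[1] == obj]
    else:
        matches = [step for step in next_steps if step == (verb, obj)]
    if len(matches) == 1:
        return {"action": matches[0], "alternatives": []}
    # Not a possible next step (or ambiguous): report what could be done instead.
    return {"action": None, "alternatives": list(next_steps)}


allowed = [("install", "brace")]
reading = interpret("do", "brace", allowed)          # as in utterance 9
rejected = interpret("install", "pulley", allowed)   # illustrative wrong step
```

When the described step is not among the possible next steps, the sketch returns no action together with the list of alternatives, which mirrors the reply in which the apprentice is told the possible next steps.</Paragraph>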
  </Section>
  <Section position="16" start_page="0" end_page="0" type="metho">
    <SectionTitle>
#I'VE INSTALLED THE PULLEY (11)
OK
</SectionTitle>
    <Paragraph position="0"> The following has been assumed: You installed the woodruff key on the pump.</Paragraph>
    <Paragraph position="1"> You attached the woodruff key to the pump.</Paragraph>
    <Paragraph position="2"> You put the woodruff key at the pump. You secured the woodruff key. You attached the installed pulley to the pump.</Paragraph>
    <Paragraph position="3"> You put the installed pulley at the pump. You fastened the installed pulley to the pump with the screws.</Paragraph>
    <Paragraph position="4"> Focus has shifted to:</Paragraph>
  </Section>
</Paper>