<?xml version="1.0" standalone="yes"?> <Paper uid="W94-0325"> <Title>Subslot Possible Values PRECISION REFERENCE PROMINENCE</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> Situation Viewpoints for Generation </SectionTitle> <Paragraph position="0"> Henry Hamburger s and Dan Tufts 2 ABSTRACT: Representation systems are presented for the input and output of the first or deep phase of a language generation system. Actions and viewpoints are the key factors in determining what sentence is produced; viewpoints provide a wide range of ways tO discuss actions, their states and the plans they compose. The language generator plays a key role in a two-medium conversational system for a naturalistic foreign language learning environment.</Paragraph> <Paragraph position="1"> KEYWORDS: viewpoint, action-based natural language generation, two-medium, conversation Overview After an inlroduction to the nature and role of viewpoints, we motivate this work in terms of our two-medium system for conversational language learning. Since our version of generation is action-based, we then sketch actions. Finally, we ,return to a finer-grained look at viewpoints.</Paragraph> </Section> <Section position="2" start_page="0" end_page="217" type="metho"> <SectionTitle> 1 Viewpoints </SectionTitle> <Paragraph position="0"> In the natural use of natural language, a single event can be talked about in a variety of ways, taking a variety of viewpoints. Such variety is necessary across languages because of differences in how cultures prefer to express things (Delin et al., 1993) and because of differences in how languages make it possible to express things (Felshin, 1993). A sel~tion of viewpoints is also needed within languages, both for coherence (Meteer, forthcoming) and for effective rhetoric (Hovy, 1988).</Paragraph> <Paragraph position="1"> For us, varied viewpoints are a way to expose learners of a foreign language to a v~u'iety of linguistic constructions in the naturalistic, situation-based, two-medium (graphical as well as linguistic) conversations that take place in our foreign language learning environment called Fluent-2.</Paragraph> <Paragraph position="2"> To achieve this objective, we have been developing and implementing our notion of an abstract situation viewpoint, hereafter called simply a view.</Paragraph> <Paragraph position="3"> 1. You picked up the pot.</Paragraph> <Paragraph position="4"> \[description of an action\] 2. The pot is in your hand.</Paragraph> <Paragraph position="5"> \[description of a state\] 3. Now fill the pot.</Paragraph> <Paragraph position="6"> \[command to continue plan\] 4. The water is not on.</Paragraph> <Paragraph position="7"> \[unmet precondition\] 5. What is (still) on the counter? \[question on related object\] 6. I asked you to pick up the cup, not the pot. \[unheeded command\] These examples show differences not only in views but also in the type of conversational interaction: #5 is a question, while #3 and #6 show different aspects of a command-act interaction. Views differ in what actions they refer to, with #1 as the most straightforward case, describing a single action that just occurred. In contrast, #6 refers to two actions, one of which was created earlier in formulating a command that was never performed. Among state views, the most straightforward is to comment on the new value of an object's attribute, as in #2, but it is usually also quite possible to comment on the cessation of the corresponding previous value. 
Yet another state view is applicable if the new value is the same as the corresponding one for another object; one can then say, for example, that there are two cups on the table or that both cabinet doors are open.</Paragraph>
<Paragraph position="8"> A view specifies a way to operate on an action or a possible action in a situation to produce a language-independent conceptual structure that corresponds to a statement, command or question about an action or its results, purpose, participants, etc. This paper sketches an internal structure for views and indicates their range of expression. The choice of which view to use at a particular point can be made by the tutorial strategist, taking into account the student's limited knowledge of the language (Hamburger, in press). View processing is the deepest of three levels forming the NLG capability of the learning environment. The general idea of views can be seen from a few examples in three categories: action, state and plan views.</Paragraph>
<Paragraph position="9"> 1. Computer Science, George Mason Univ., Fairfax, VA
2. Institute for Informatics, Bucharest, Rumania
Underlying sentence #3, above, is a plan view, in this case the notion of transition to the next action in the current plan. Plan views can also refer to such things as the completion of a plan or subplan and the transition from one subplan to the next. Plans exist in the microworlds so that the successive actions will make sense, not only those the tutor chooses to carry out itself, but also those it tells the student to do, as in #3. The resulting situational continuity supports a language beginner by keeping it clear what is being talked about. For a more advanced student, plan views provide their own form of variety, including two-clause sentences like, &quot;Now that the pot is full, put it on the stove,&quot; in which the first clause comes from a state view, the second from an action view, and the whole sentence from a plan view: the transition from a just-completed subplan to the next action whose goal is not already satisfied.</Paragraph>
</Section>
<Section position="3" start_page="217" end_page="217" type="metho">
<SectionTitle> 2 Two-Medium Language Learning </SectionTitle>
<Paragraph position="0"> Fluent-2 is a two-medium tutorial system whose principal goal is to provide an essential form of foreign language learning experience: realistic conversation in the target language. Figure 1 shows key parts of the system.</Paragraph>
<Paragraph position="1"> Language interaction in Fluent-2 is tightly integrated with a visual second medium consisting of partially animated graphics under shared control of the student and the electronic tutor. Both the graphics and the language are the outward manifestations of an underlying microworld of objects, in a hierarchy of classes, taking part in actions that are structured into plans. The graphics and animation provide a realistic auxiliary source of information about what is being said. This independent channel helps the student pick up new vocabulary and language constructions in a clear situational context. This two-medium interaction capability, including the deep generation component sketched in this paper, should also be applicable to tutoring systems in other subject matter.</Paragraph>
<Paragraph position="2"> Surface generation is done by a large natural language processing system, developed by Susan Felshin of the MIT Athena Language Learning Project (ALLP) and adapted for us.
It is this system that takes semantic structures to syntactic structures and ultimately to sentences of English, Spanish and, to a lesser extent, other languages. The natural language processing, graphics, microworlds and tutorial reasoning are all in MCL2 Common Lisp with CLOS on a Mac-IIfx with 20MB of main memory.</Paragraph>
<Paragraph position="3"> The availability of the two media, along with situational continuity, can provide to adults the kind of redundancy that seems essential to children in their race to fluent use of their native tongue. This is not to say that adults learn in the same way as children. Nevertheless, Fluent-2 has been designed with careful attention to successful second language pedagogy and appropriate second language research. Language generation is especially important at the outset, since the learner must comprehend language before meaningfully producing it.</Paragraph>
<Paragraph position="4"> Second language research provides support for using simplified language in meaningful contexts. Three sources of such experience are foreigner talk (by native speakers, to foreigners), motherese (by parents and caregivers, to children) and teacher talk (by teachers, to students). We seek to replicate the benefits of these styles in a computational system, by identifying and adapting specific aspects that underlie their success. Such properties include: restricted vocabulary size; exaggerated intonation and stress; short grammatical sentences; use of concrete references; repetitions, expansions and rephrases; few pro-forms; few contractions; yes-no, choice and tag questions rather than Wh-questions; and so on. (See Hamburger, 1993, for a fuller account.)</Paragraph>
</Section>
<Section position="4" start_page="217" end_page="218" type="metho">
<SectionTitle> 3 Representing Situations and Actions </SectionTitle>
<Paragraph position="0"> The actions in a microworld are of special interest because they constitute an input to view processing, our central concern here. Both actions and plans are implemented as parametrized rules with constrained variables, as in the action rule example in Figure 2. Binding the parameters of an action (or plan) to microworld objects yields an instantiated action (or plan), which can then be carried out, with graphical and internal consequences, and/or forwarded to the generation process. Objects are of various types or classes, with individual properties, some inherited, and relationships to each other. Actions and objects, the bridge between the two media, are chosen partly on the basis of having consequences that are clearly realizable in graphics. Actions are organized into flexible, hierarchical plans that support coherent everyday activity.</Paragraph>
<Paragraph position="0"> Action rules are bi-directional: either the student or the tutor can activate them, depending on the type of interaction. The student does this in a graphically realistic manner, for example by dragging a hand to the faucet and clicking the mouse. The parameters have scope over the whole rule; binding originates in any slot.
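For concreteness, here is a minimal Common Lisp sketch of such a parametrized action rule and of binding its constrained variables to microworld objects. The representation, the example rule and the instantiate-action helper are our own illustrative assumptions, not the Fluent-2 code; the slot names anticipate the Header, Preconds, Goal and K-Results slots discussed in the following paragraphs.

;; Illustrative sketch of a parametrized action rule (assumed representation,
;; not the Fluent-2 implementation).  Symbols such as ?agent and ?object stand
;; for constrained variables; binding them to microworld objects yields an
;; instantiated action that can be executed and/or passed to view processing.

(defstruct action-rule
  name        ; rule name or predicate, e.g. PICK-UP
  header      ; predicate plus argument-constraint pairs, used by action views
  preconds    ; conditions that must hold before the action can run
  goal        ; state that lets plan execution skip the action if already true
  k-results)  ; object-attribute-value triples for updating the situation

(defparameter *pick-up*
  (make-action-rule
   :name 'pick-up
   :header '(pick-up (?agent . person) (?object . graspable-thing))
   :preconds '((hand-empty ?agent))
   :goal '((holds ?agent ?object))
   :k-results '((?object location (hand ?agent)))))

(defun instantiate-action (rule bindings)
  "Substitute microworld objects for the rule's constrained variables,
returning an instantiated header as a stand-in for the instantiated action."
  (sublis bindings (action-rule-header rule)))

;; Example: the student drags the hand to the pot and clicks.
;; (instantiate-action *pick-up* '((?agent . student) (?object . pot)))
;; => (PICK-UP (STUDENT . PERSON) (POT . GRASPABLE-THING))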
Information thus can flow among student, tutor and microworld, supporting the two-way, two-medium conversation.</Paragraph>
<Paragraph position="1"> An action rule's Header slot is a key to view processing.</Paragraph>
<Paragraph position="2"> It contains the rule name or predicate as well as argument-constraint pairs, and is used in the straightforward view in Figure 3 for a simple description of the action. Also useful in view processing is the K-Results slot, containing object-attribute-value triples for updating the internal situation as a result of the action. State views can select among these results to report various changes.</Paragraph>
<Paragraph position="3"> The Goal slot makes it possible, when executing a plan, to skip over any actions and subplans whose goals are already achieved. Besides permitting variety in student action sequences, the satisfied goal can form, via a view, the basis of a useful remark. Views for failed Preconds can also yield comments worth making. Two other action rule slots are for information passed to and from the graphics module; they are not used for views and are omitted from Figure 3.</Paragraph>
<Paragraph position="4"> To see the key role of views, suppose that the student has just made something happen and the system's role is now to make a relevant comment. A simple choice is to say what the student just did, using a representation of the student's preceding microworld action, consisting of an operator with its operands. Just such a representation is in the Header slot of the action rule just triggered by the student (via the graphics input slot). It can be transformed to a semantic structure that is an appropriate input for the surface generation module, which can output the resulting sentence. We do exactly that, but not deterministically.</Paragraph>
<Paragraph position="5"> Into this action-to-semantics connection, we use views to insert the possibility of a wide-ranging choice among different approaches to constructing something to say.</Paragraph>
</Section>
<Section position="5" start_page="218" end_page="219" type="metho">
<SectionTitle> 4 Views, Levels and Instantiations </SectionTitle>
<Paragraph position="0"> A view is an abstraction of what to say and how to say it, expressed as a structure. It guides the view processor in selecting parts of an instantiated action, to instantiate the view. The instantiated view is a language-independent intermediate representation which ultimately yields an output sentence.</Paragraph>
<Paragraph position="1"> The partial example of a view in Figure 3 shows the context level, event level and object level. (Object-level information is not shown.) The event level is central in that it corresponds roughly to the proposition expressed in the main (or only) clause of a sentence. The view type here is 'action', yielding a view that expresses the action itself, without reference to the plan or the resulting state. The 'what-action' can be an action that has actually occurred, has been talked about or has been constructed for generation.</Paragraph>
<Paragraph position="2"> In Figure 3, this choice depends on the interaction type, which also controls the distinction between commands and declarative sentences and the choice of tense.</Paragraph>
<Paragraph position="3"> These observations point to the key role of interaction types within views.
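As a rough illustration of such a structure, the following Common Lisp sketch pairs an event level (here just the view type) with a context level keyed by interaction type. The Movecaster/Student-Did and Commander/Tutor-Thought pairings follow the discussion below; the Tourguide line and the particular mood and tense choices, like the representation itself, are our own assumptions rather than the paper's actual data structures.

;; Illustrative sketch of an action view (assumed representation).  The
;; context level maps the current interaction type to a What-Action choice
;; plus mood and tense; the event level records the view type.

(defstruct view
  context-level   ; alist: interaction type -> (what-action mood tense)
  event-level)    ; plist: at least the view type, e.g. :type action

(defparameter *simple-action-view*
  (make-view
   :event-level '(:type action)
   :context-level
   '((movecaster . (student-did declarative past))     ; comment on what the student just did
     (commander  . (tutor-thought imperative present)) ; tell the student what to do next
     (tourguide  . (tutor-did declarative past)))))    ; describe the tutor's own action

(defun event-choices (view interaction-type)
  "Look up the What-Action value, mood and tense for an interaction type."
  (cdr (assoc interaction-type (view-context-level view))))

;; (event-choices *simple-action-view* 'commander)
;; => (TUTOR-THOUGHT IMPERATIVE PRESENT)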
Interaction types complement views by organizing the basic conversational move structure.</Paragraph>
<Paragraph position="4"> An interaction is a short sequence of specified kinds of linguistic and spatial turns by the tutor and student.</Paragraph>
<Paragraph position="5"> Choosing an interaction type determines whether it is the tutor or the student that momentarily takes the initiative. A pedagogically useful interaction type for language learning has at least one linguistic move (is not purely graphical). Either the tutor or the student can start with one of four move types: action, command, question or statement. Following each with its anticipated response yields the eight simplest interaction types.</Paragraph>
<Paragraph position="6"> In the Movecaster type, the student can make any possible move, and the tutor then comments; the tutor asks a question in Quizmaster; it gives a command that the student may act on in Commander; and these roles are reversed in Servant. Tourguide is an interaction type with three moves: an action by the tutor, a description of that action, and acknowledgement by the student. Tourguide can provide initial exposure to a new microworld.</Paragraph>
<Paragraph position="7"> Variations of it allow the description to precede or follow the action, or both, giving a basis for variations in tense. The second move in an interaction should be responsive to the first. Thus some kinds of questions call for a sentence in answer, others a phrase or &quot;yes&quot; or &quot;no&quot;. Similarly, actions are expected to be responsive to commands. The tutor may comment about responsiveness to a command or lack of it, using a view-constrained interaction type.</Paragraph>
<Paragraph position="8"> It is now easier to see why, in Figure 3, Movecaster is associated with Student-Did, the student's action, whereas Commander calls for an action - Tutor-Thought - not yet carried out by anyone. What-Action takes four possible values: Student-Did or Tutor-Did for the most recent action executed by the student or tutor; and Tutor-Did or Tutor-Thought for an action constructed by the tutor as the basis of something already said or about to be said.</Paragraph>
<Paragraph position="1"> State views need two slots at the event level that are not in action views; see Figure 4. Since an action may result in more than one change in the values of object attributes, state views have an Aspect to specify how to select one of the changes. The selection method in Figure 4 simply takes the first one in the list of updates - reasonable if results are in order of importance. The Pre-Post slot tells whether to use the updated value or the prior one.</Paragraph>
<Paragraph position="2"> Whereas a view tells where to get information, the instantiated view (IV) holds the information itself, which the view processor has for the most part extracted from the instantiated action. For an action view, this is principally the arguments, taken from the action header and placed in IV slots called Agent, Object1, Object2 and Modifier.</Paragraph>
<Paragraph position="3"> Under the guidance of the object level of the view, the view processor associates each argument with the correct slot and puts in the contents. Designed for this purpose is the IV-O, or object level of an IV.
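The following Common Lisp sketch illustrates this instantiation step under an assumed representation of our own (it is not the system's): the IV carries the slots just named, and a minimal IV-O records the microworld object together with a couple of the descriptive decisions discussed below.

;; Illustrative sketch of an instantiated view (IV) for an action view and of
;; the IV-O used at its object level (assumed representation).  The view
;; processor copies the arguments of the instantiated action header into the
;; Agent, Object1, Object2 and Modifier slots.

(defstruct iv
  mood tense                       ; carried over from the event-level choices
  predicate                        ; the action predicate, e.g. PICK-UP
  agent object1 object2 modifier)  ; fillers: IV-Os, objects, classes, or IVs

(defstruct iv-o
  object        ; the microworld object being described
  class         ; class used for the head noun (direct class, parent class, ...)
  determiner)   ; e.g. definite or indefinite, decided from recent mentions

(defun make-action-iv (predicate args mood tense)
  "Build an IV for a simple transitive action, assuming the first argument
is the agent and the second the object acted on."
  (make-iv :predicate predicate :mood mood :tense tense
           :agent   (make-iv-o :object (first args)
                               :class 'person :determiner 'definite)
           :object1 (make-iv-o :object (second args)
                               :class 'thing :determiner 'definite)))

;; (make-action-iv 'pick-up '(student pot) 'declarative 'past)
;; could underlie a sentence like "You picked up the pot."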
Each IV slot can be filled by (i) an IV-O, (ii) a microworld object, (iii) a class, which is a language-independent meaning corresponding to a common noun, (iv) a list of items of the three foregoing kinds, or (v) another IV. The latter yields a subordinate clause, whereas each of the others underlies a noun phrase.</Paragraph>
<Paragraph position="4"> Object-level views determine how to express a particular microworld object to convey its relationship to other aspects of a conversation. With a black and a grey cup, for example, after moving the black one, the grey one can be referred to as &quot;the other one,&quot; &quot;the grey cup,&quot; &quot;the second cup&quot; or even &quot;the cup that is still on the table.&quot; In each noun phrase the head noun corresponds by default to the class of the object, unless &quot;one&quot; is included in the specification for that object (giving, in English, the likes of &quot;the red one&quot;). The decision whether to include modifiers (adjectives, relative clauses, and prepositional phrases) may in some cases be handled by code, such as a method that selects whatever properties are needed to distinguish an entity from others of its class.</Paragraph>
<Paragraph position="5"> The object level may also have information that affects decisions about determiners and possibly quantifiers or pronouns. The choice of determiner cannot be specified in isolation by the view, since it must take into account the recent mentions of, and actions on, an entity, for example, &quot;Pick up a (indefinite) box&quot; and then, &quot;Good! You picked it (definite pronoun) up.&quot; The Precision subslot determines how precisely the object is to be described. It can indicate whether the class for describing the object should be its direct class (e.g., girl, teaspoon), its parent class (e.g., child, spoon) or the highest class permitted by the type constraint for the particular argument of the action rule (e.g., person, thing). Another option is the highest level distinguishing the item from everything else in the current situation. If the item is not alone in the class named, the output needs a modifier or else an indefinite determiner.</Paragraph>
<Paragraph position="6"> If the Reference subslot in a view has the value Other, the item is to be described in terms of other items in its class, e.g., &quot;the other X&quot; or &quot;the rest of the Xs&quot;, as opposed to the default case, a description of an object by its own properties. The Prominence subslot specifies whether its object should be made prominent or not, and if so, whether by topicalization - Topic - or by being questioned with a Wh word.</Paragraph>
<Paragraph position="7"> Acknowledgement. This work is supported under grant IRI-9020711 from the National Science Foundation.</Paragraph>
</Section>
</Paper>