<?xml version="1.0" standalone="yes"?>
<Paper uid="P92-1016">
  <Title>UNDERSTANDING NATURAL LANGUAGE INSTRUCTIONS: THE CASE OF PURPOSE CLAUSES</Title>
  <Section position="4" start_page="0" end_page="120" type="metho">
    <SectionTitle>
PURPOSE CLAUSES
</SectionTitle>
    <Paragraph position="0"> I am not the first to analyze purpose clauses; however, they have received attention almost exclusively from a syntactic point of view: see, for example, (Jones, 1985) and (Hegarty, 1990). Notice that I am not using the term purpose clause in the technical way it has been used in syntax, where it refers to infinitival to clauses adjoined to NPs. In contrast, the infinitival clauses I have concentrated on are adjoined to a matrix clause, and are termed rationale clauses in syntax; in fact, all the data I will discuss in this paper belong to a particular subclass of such clauses, subject-gap rationale clauses.</Paragraph>
    <Paragraph position="1"> As far as I know, very little attention has been paid to purpose clauses in the semantics literature: Jackendoff (1990) briefly analyzes expressions of purpose, goal, or rationale, normally encoded as an infinitival, in order to-phrase, or for-phrase. He represents them by means of a subordinating function FOR, which has the adjunct clause as an argument; in turn, FOR plus its argument is a restrictive modifier of the main clause. However, Jackendoff's semantic decomposition doesn't go beyond the construction of the logical form of a sentence, and he doesn't pursue the issue of what the relation between the actions described in the matrix and adjunct really is. The only other work that mentions purpose clauses in a computational setting is (Balkanski, 1991). However, she doesn't present any linguistic analysis of the data; as I will show, such analysis raises many interesting issues, such as the following (footnote 1): * It is fairly clear that S uses purpose clauses to explain to H the goal β to whose achievement the execution of α contributes. However, an important point that had been overlooked so far is that the goal β also constrains the interpretation of α, as I observed with respect to Ex. 1.b. Another example in point is: Ex. 2 Cut the square in half to create two triangles.</Paragraph>
    <Paragraph position="2"> The action to be performed is cutting the square in half. However, such action description is underspecified, in that there is an infinite number of ways of cutting a square in half: the goal create two triangles restricts the choice to cutting the square along one of the two diagonals.</Paragraph>
    <Paragraph position="3"> * Purpose clauses relate action descriptions at different levels of abstraction, such as a physical action and an abstract process, or two physical actions, but at different levels of granularity: Ex. 3 Heat on stove to simmer.</Paragraph>
    <Paragraph position="4"> * As far as what is described in purpose clauses, I have been implying that both matrix and purpose clauses describe an action, α and β respectively. There are rare cases - in fact, I found only one - in which one of the two clauses describes a state σ: Ex. 4 To be successfully covered, a wood wall must be flat and smooth.</Paragraph>
    <Paragraph position="5"> I haven't found any instances in which both matrix and purpose clauses describe a state. Intuitively, this makes sense because S uses a purpose clause to inform H of the purpose of a given action (footnote 2). * In most cases, the goal β describes a change in the world. However, in some cases: 1. The change is not in the world, but in H's knowledge. By executing α, H can change the state of his knowledge with respect to a certain proposition or to the value of a certain entity.</Paragraph>
    <Paragraph position="6"> [Footnote 1] I collected one hundred and one consecutive instances of purpose clauses from a how-to-do book on installing wall coverings, and from two craft magazines.</Paragraph>
    <Paragraph position="7"> [Footnote 2] There are clearly other ways of describing that a state is the goal of a certain action, for example by means of so/such that, but I won't deal with such data here.</Paragraph>
    <Paragraph position="8"> Ex. 5 You may want to hang a coordinating border around the room at the top of the walls. To determine the amount of border, measure the width (in feet) of all walls to be covered and divide by three.</Paragraph>
    <Paragraph position="9"> Since borders are sold by the yard, this will give you the number of yards needed.</Paragraph>
    <Paragraph position="10"> Many such examples involve verbs such as check, make sure etc. followed by a that-complement describing a state φ. The use of such verbs has the pragmatic effect that not only does H check whether φ holds, but, if φ doesn't hold, s/he will also do something so that φ comes to hold.</Paragraph>
    <Paragraph position="11"> Ex. 6 To attach the wires to the new switch, use the paper clip to move the spring type clip aside and slip the wire into place. Tug gently on each wire to make sure it's secure.</Paragraph>
    <Paragraph position="12"> 2. The purpose clause may inform H that the world should not change, namely, that a given event should be prevented from happening: Ex. 7 Tape raw edges of fabric to prevent threads from raveling as you work.</Paragraph>
    <Paragraph position="13"> * From a discourse processing point of view, interpreting a purpose clause may affect the discourse model, in particular by introducing new referents. This happens when the effect of α is to create a new object, and β identifies it. Verbs frequently used in this context are create, make, form etc.</Paragraph>
    <Paragraph position="14"> Ex. 8 Join the short ends of the hat band to form a circle. Similarly, in Ex. 2 the discourse referents for the triangles created by cutting the square in half, and in Ex. 5 the referent for amount of border are introduced.</Paragraph>
  </Section>
  <Section position="5" start_page="120" end_page="120" type="metho">
    <SectionTitle>
RELATIONS BETWEEN ACTIONS
</SectionTitle>
    <Paragraph position="0"> So far, I have mentioned that α contributes to achieving the goal β. The notion of contribution can be made more specific by examining naturally occurring purpose clauses. In the majority of cases, they express generation, and in the rest enablement. Grosz and Sidner (1990) also use contribute as a relation between actions, and they define it as a place holder for any relation ...</Paragraph>
    <Paragraph position="1"> that can hold between actions when one can be said to contribute (for example, by generating or enabling) to the performance of the other. However, they don't justify this in terms of naturally occurring data. Balkanski (1991) does mention that purpose clauses express generation or enablement, but she doesn't provide evidence to support this claim.</Paragraph>
  </Section>
  <Section position="6" start_page="120" end_page="121" type="metho">
    <SectionTitle>
GENERATION
</SectionTitle>
    <Paragraph position="0"> Generation is a relation between actions that has been extensively studied, first in philosophy (Goldman, 1970) and then in discourse analysis (Allen, 1984), (Pollack, 1986), (Grosz and Sidner, 1990), (Balkanski, 1990).</Paragraph>
    <Paragraph position="1"> According to Goldman, intuitively generation is the relation between actions conveyed by the preposition by in English - turning on the light by flipping the switch.</Paragraph>
    <Paragraph position="2"> More formally, we can say that an action α conditionally generates another action β iff (footnote 3): 1. α and β are simultaneous; 2. α is not part of doing β (as in the case of playing a C note as part of playing a C triad on a piano); 3. when α occurs, a set of conditions C holds, such that the joint occurrence of α and C implies the occurrence of β. In the case of the generation relation between flipping the switch and turning on the light, C will include that the wire, the switch and the bulb are working.</Paragraph>
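    <Paragraph> The three conditions above can be sketched in code. This is only an illustration under stated assumptions, not the formalization used in the paper: the `Action` class, its `start`, `end` and `parts` fields, and the function `conditionally_generates` are hypothetical names, and the conditions C are modelled as plain strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A hypothetical action occurrence with a time span."""
    name: str
    start: int
    end: int
    # Names of actions that are parts of doing this action.
    parts: frozenset = frozenset()


def conditionally_generates(alpha, beta, required_conditions, world):
    """Return True if alpha conditionally generates beta.

    required_conditions plays the role of C; world is the set of
    conditions that actually hold when alpha occurs.
    """
    simultaneous = (alpha.start, alpha.end) == (beta.start, beta.end)
    not_a_part = alpha.name not in beta.parts
    conditions_hold = set(required_conditions).issubset(world)
    return simultaneous and not_a_part and conditions_hold


flip = Action("flip the switch", start=0, end=1)
light = Action("turn on the light", start=0, end=1)
C = {"wire works", "switch works", "bulb works"}

print(conditionally_generates(flip, light, C, world=C))
print(conditionally_generates(flip, light, C, world={"wire works"}))
```

The second call fails only because C does not hold, which mirrors the role of the working wire, switch and bulb in the light example.</Paragraph>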
    <Paragraph position="3"> Although generation doesn't hold between α and β if α is part of a sequence of actions A to do β, generation may hold between the whole sequence A and β.</Paragraph>
    <Paragraph position="4"> Generation is a pervasive relation between action descriptions in naturally occurring data. However, it appears from my corpus that by clauses are used less frequently than purpose clauses to express generation (footnote 4): about 95% of my 101 purpose clauses express generation, while in the same corpus there are only 27 by clauses. It does look like generation in instructional text is mainly expressed by means of purpose clauses. They may express either a direct generation relation between α and β, or an indirect generation relation between α and β, where by indirect generation I mean that α belongs to a sequence of actions A which generates β.</Paragraph>
  </Section>
  <Section position="7" start_page="121" end_page="121" type="metho">
    <SectionTitle>
ENABLEMENT
</SectionTitle>
    <Paragraph position="0"> Following first Pollack (1986) and then Balkanski (1990), enablement holds between two actions α and β if and only if an occurrence of α brings about a set of conditions that are necessary (but not necessarily sufficient) for the subsequent performance of β.</Paragraph>
    <Paragraph position="1"> Only about 5% of my examples express enablement: Ex. 9 Unscrew the protective plate to expose the box.</Paragraph>
    <Paragraph position="2"> Unscrew the protective plate enables taking the plate off which generates exposing the box.</Paragraph>
  </Section>
  <Section position="8" start_page="121" end_page="122" type="metho">
    <SectionTitle>
GENERATION AND ENABLEMENT IN MODELING ACTIONS
</SectionTitle>
    <Paragraph position="0"> That purpose clauses do express generation and enablement is a welcome finding: these two relations have been proposed as necessary to model actions (Allen, 1984), (Pollack, 1986), (Grosz and Sidner, 1990), (Balkanski, 1990), but this proposal has not been justified by offering an extensive analysis of whether and how these relations are expressed in NL.</Paragraph>
    <Paragraph position="1">  A further motivation for using generation and enablement in modeling actions is that they allow us to draw conclusions about action execution as well - a particularly useful consequence given that my work is taking place in the framework of the Animation from Natural Language (AnimNL) project (Badler et al., 1990; Webber et al., 1991), in which the input instructions do have to be executed, namely, animated.</Paragraph>
    <Paragraph position="2"> As has already been observed by other researchers, if α generates β, two actions are described, but only α, the generator, needs to be performed. In Ex. 2, there is no creating action per se that has to be executed: the physical action to be performed is cutting, constrained by the goal as explained above.</Paragraph>
    <Paragraph position="3"> In contrast to generation, if α enables β, after executing α, β still needs to be executed: α has to temporally precede β, in the sense that α has to begin, but not necessarily end, before β. In Ex. 10, hold has to continue for the whole duration of fill: Ex. 10 Hold the cup under the spigot to fill it with coffee. Notice that, in the same way that the generatee affects the execution of the generator, so the enabled action affects the execution of the enabling action. Consider the difference in the interpretation of to in go to the mirror, depending upon whether the action to be enabled is seeing oneself or carrying the mirror somewhere else.</Paragraph>
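    <Paragraph> The execution consequences of the two relations can be summarized in a small sketch. It is an illustration only: the `Relation` enum and the functions `actions_to_execute` and `enabling_timing_ok` are hypothetical names, and start times are represented as plain integers by assumption.

```python
from enum import Enum


class Relation(Enum):
    GENERATION = "generation"
    ENABLEMENT = "enablement"


def actions_to_execute(relation, alpha, beta):
    """List the actions an agent actually has to perform."""
    if relation is Relation.GENERATION:
        # Performing the generator alpha counts as doing beta.
        return [alpha]
    # Enablement: alpha only sets up conditions, beta is still needed.
    return [alpha, beta]


def enabling_timing_ok(alpha_start, beta_start):
    """alpha must begin before beta; when alpha ends is unconstrained."""
    return beta_start > alpha_start


print(actions_to_execute(Relation.GENERATION,
                         "cut the square in half", "create two triangles"))
print(actions_to_execute(Relation.ENABLEMENT,
                         "hold the cup under the spigot", "fill the cup"))
print(enabling_timing_ok(alpha_start=0, beta_start=1))
```

Leaving the end of the enabling action unconstrained is what allows hold to overlap with fill in Ex. 10.</Paragraph>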
  </Section>
  <Section position="9" start_page="122" end_page="123" type="metho">
    <SectionTitle>
INFERENCE PROCESSES
</SectionTitle>
    <Paragraph position="0"> So far, I have been talking about the purpose clause constraining the interpretation of the matrix clause. I will now provide some details on how such constraints are computed. The inferences that I have identified so far as necessary to interpret purpose clauses can be described as: 1. Computing a more specific action description. 2. Computing assumptions that have to hold for a certain relation between actions to hold.</Paragraph>
    <Paragraph position="1"> Computing more specific action descriptions. In Ex. 2 - Cut the square in half to create two triangles - it is necessary to find a more specific action α1 which will achieve the goal specified by the purpose clause, as shown in Fig. 1.</Paragraph>
    <Paragraph position="2"> For Ex. 2 we have β = create two triangles, α = cut the square in half, α1 = cut the square in half along the diagonal. The reader will notice that the inputs to accommodation are linguistic expressions, while its outputs are predicate-argument structures: I have used the latter in Fig. 1 to indicate that accommodation infers relations between action types. However, as I will show later, the representation I adopt is not based on predicate-argument structures. Also notice that I am using Greek symbols for both linguistic expressions and action types: the context should be sufficient to disambiguate which one is meant.</Paragraph>
    <Paragraph position="3"> Computing assumptions. Let's consider: Ex. 11 Go into the other room to get the urn of coffee. Presumably, H doesn't have a particular plan that deals with getting an urn of coffee. S/he will have a generic plan about get x, which s/he will adapt to the instructions S gives him (footnote 5). In particular, H has to find the connection between go into the other room and get the urn of coffee. This connection requires reasoning about the effects of go with respect to the plan get x; notice that the (most direct) connection between these two actions requires the assumption that the referent of the urn of coffee is in the other room. Schematically, one could represent this kind of inference as in Fig. 2 - β is the goal, α the instruction to accommodate, Ak the actions belonging to the plan to achieve β, C the necessary assumptions. It could happen that these two kinds of inference need to be combined: however, no example I have found so far requires it.</Paragraph>
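    <Paragraph> The assumption-computing inference for Ex. 11 can be sketched as follows. This is a toy illustration, not the procedure used in AnimNL: the function name `compute_assumptions`, the tuple encoding of actions, and the decomposition of get x into a go step and a grasp step are all assumptions made for the example.

```python
def compute_assumptions(instruction, plan_steps, known_facts):
    """Connect an instruction to a step of a generic plan.

    Each plan step is a pair (action, argument), where the argument may
    be a symbolic term such as "location(urn of coffee)". Returns the
    equalities that must be assumed for the instruction to match a step,
    an empty list if nothing needs to be assumed, or None if no
    connection is found.
    """
    action, destination = instruction
    for step_action, step_arg in plan_steps:
        if step_action != action:
            continue
        known_value = known_facts.get(step_arg)
        if known_value is None:
            return [(step_arg, destination)]  # assumption needed
        if known_value == destination:
            return []                         # already known to match
    return None                               # no connection found


# Hypothetical instantiation of a generic plan "get x" for x = urn of coffee.
get_urn = [("go", "location(urn of coffee)"), ("grasp", "urn of coffee")]

print(compute_assumptions(("go", "other room"), get_urn, known_facts={}))
```

With no prior knowledge about where the urn is, the sketch returns the single assumption that the location of the urn of coffee is the other room.</Paragraph>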
    <Paragraph position="4"> [Footnote 5] Actually, H may have more than one plan for get x, in which case go into the other room may in fact help to select the plan the instructor has in mind.</Paragraph>
    <Paragraph position="5"> INTERPRETING Do α to do β. In this section, I will describe the algorithm that implements the two kinds of accommodation described in the previous section. Before doing that, I will make some remarks on the action representation I adopt and on the structure of the intentions - the plan graph - that my algorithm contributes to building.</Paragraph>
    <Paragraph position="6"> Action representation. To represent action types, I use a hybrid system (Brachman et al., 1983), whose primitives are taken from Jackendoff's Conceptual Structures (1990); relations between action types are represented in another module of the system, the action library.</Paragraph>
    <Paragraph position="7"> I'd like to spend a few words justifying the choice of a hybrid system: this choice is neither accidental, nor determined by the characteristics of the AnimNL project. Generally, in systems that deal with NL instructions, action types are represented as predicate-argument structures; the crucial assumption is then made that the logical form of an input instruction will exactly match one of these definitions. However, there is an infinite number of NL descriptions that correspond to a basic predicate-argument structure: just think of all the possible modifiers that can be added to a basic sentence containing only a verb and its arguments. Therefore it is necessary to have a flexible knowledge representation system that can help us understand the relation between the input description and the stored one. I claim that hybrid KR systems provide such flexibility, given their virtual lattice structure and the classification algorithm operating on the lattice: in the last section of this paper I will provide an example supporting my claim.</Paragraph>
    <Paragraph position="8"> Space doesn't allow me to deal with the reason why Conceptual Structures are relevant, namely, that they are useful to compute assumptions. For further details, the interested reader is referred to (Di Eugenio, 1992; Di Eugenio and White, 1992).</Paragraph>
    <Paragraph position="9"> Just a reminder to the reader that hybrid systems have two components. The first is the terminological box, or T-Box, where concepts are defined, and on which the classification algorithm works by computing subsumption relations between different concepts. The algorithm is crucial for adding new concepts to the KB: it computes the subsumption relations between the new concept and all the other concepts in the lattice, so that it can &quot;position&quot; the new concept in the right place in the lattice. The other component of a hybrid system is the assertional box, or A-Box, where assertions are stored, and which is equipped with a theorem-prover.</Paragraph>
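    <Paragraph> How classification positions a new concept in the lattice can be illustrated with a deliberately simplified sketch. It assumes that a concept is just a dictionary from roles to fillers and that one concept subsumes another when all of its role fillers recur in the other; real terminological systems use a much richer concept language, and the names `subsumes` and `classify` and the role names are hypothetical.

```python
def subsumes(general, specific):
    """general subsumes specific if all its role fillers recur in specific."""
    return all(specific.get(role) == filler
               for role, filler in general.items())


def classify(new_concept, t_box):
    """Return (parents, children) for new_concept in the lattice.

    parents are the most specific subsumers of the new concept,
    children are the most general concepts it subsumes.
    """
    subsumers = [n for n, c in t_box.items() if subsumes(c, new_concept)]
    subsumees = [n for n, c in t_box.items() if subsumes(new_concept, c)]
    parents = [p for p in subsumers
               if not any(q != p and subsumes(t_box[p], t_box[q])
                          for q in subsumers)]
    children = [c for c in subsumees
                if not any(q != c and subsumes(t_box[q], t_box[c])
                           for q in subsumees)]
    return parents, children


t_box = {
    "cut": {"action": "cut"},
    "cut square in half along diagonal": {
        "action": "cut", "object": "square",
        "extent": "in half", "location": "along(diagonal)",
    },
}
new = {"action": "cut", "object": "square", "extent": "in half"}

print(classify(new, t_box))
```

In this toy lattice the new description lands between cut and cut square in half along diagonal, without having been defined beforehand.</Paragraph>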
    <Paragraph position="10"> In my case, the T-Box contains knowledge about action types, while assertions about individual actions - instances of the types - are contained in the A-Box: such individuals correspond to the action descriptions contained in the input instructions (footnote 6). The action library contains simple plans relating actions; simple plans are either generation or enablement relations between pairs: the first member of the pair is either a single action or a sequence of actions, and the second member is an action. In case the first member of the pair is an individual action, I will talk about direct generation or enablement. For the moment, generation and enablement are represented in a way very similar to (Balkanski, 1990).</Paragraph>
    <Paragraph position="11"> The plan graph represents the structure of the intentions derived from the input instructions. It is composed of nodes that contain descriptions of actions, and arcs that denote relations between them. A node contains the Conceptual Structures representation of an action, augmented with the consequent state achieved after the execution of that action. The arcs represent, among others: temporal relations; generation; enablement.</Paragraph>
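    <Paragraph> The plan graph just described can be pictured with a minimal data structure. This is a sketch under assumptions: the class and method names are hypothetical, actions are plain strings standing in for Conceptual Structures terms, and the consequent states in the example are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    action: str            # stands in for a Conceptual Structures term
    consequent_state: str  # state achieved after executing the action


@dataclass
class PlanGraph:
    nodes: dict = field(default_factory=dict)
    arcs: list = field(default_factory=list)  # (source, relation, target)

    def add_node(self, key, action, consequent_state):
        self.nodes[key] = Node(action, consequent_state)

    def add_arc(self, source, relation, target):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must already be nodes")
        self.arcs.append((source, relation, target))

    def related(self, source, relation):
        """Keys of the nodes reached from source through relation arcs."""
        return [t for s, r, t in self.arcs if s == source and r == relation]


# Ex. 9: unscrewing enables taking the plate off, which generates exposing.
graph = PlanGraph()
graph.add_node("unscrew", "unscrew the protective plate", "plate unscrewed")
graph.add_node("remove", "take the plate off", "plate removed")
graph.add_node("expose", "expose the box", "box exposed")
graph.add_arc("unscrew", "enablement", "remove")
graph.add_arc("remove", "generation", "expose")

print(graph.related("unscrew", "enablement"))
```

Temporal relations could be stored as further arc labels in the same list.</Paragraph>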
    <Paragraph position="12"> The plan graph is built by an interpretation algorithm that works by keeping track of active nodes, which for the moment include the goal currently in focus and the nodes just added to the graph; it is manipulated by various inference processes, such as plan expansion, and plan recognition.</Paragraph>
    <Paragraph position="13"> My algorithm is described in Fig. 3 (footnote 7). Clearly the inferences I describe are possible only because I rely on the other AnimNL modules for 1) parsing the input and providing a logical form expressed in terms of Conceptual Structures primitives; 2) managing the discourse model, solving anaphora, performing temporal inferences etc. (Webber et al., 1991). [Footnote 6] Notice that these individuals are simply instances of generic concepts, and not necessarily action tokens, namely, nothing is asserted with regard to their happening in the world. [Footnote 7] As I mentioned earlier in the paper, the Greek symbols</Paragraph>
  </Section>
  <Section position="10" start_page="123" end_page="126" type="metho">
    <SectionTitle>
AN EXAMPLE OF THE ALGORITHM
</SectionTitle>
    <Paragraph position="0"> I will conclude by showing how step 4a in Fig. 3 takes advantage of the classification algorithm with which hybrid systems are equipped.</Paragraph>
    <Paragraph position="1"> Consider the T-Box, or better said, the portion of T-Box shown in Fig. 4 (footnote 8).</Paragraph>
    <Paragraph position="2"> Given Ex. 2 - Cut the square in half to create two triangles - as input, the individual action description cut (the) square in half will be asserted in the A-Box and recognized as an instance of α - the shaded concept cut (a) square in half - which is a descendant of cut and an abstraction of ω - cut (a) square in half along the diagonal, as shown in Fig. 5 (footnote 9). Notice that this does not imply that the concept cut (a) square in half is known beforehand: the classification process is able to recognize it as a virtual concept and to find the right place for it in the lattice (footnote 10). Given that α is an ancestor of ω, and that ω generates β - create two triangles, the fact that the action to be performed is actually ω and not α can be inferred. This implements step 4(a)ii.</Paragraph>
    <Paragraph position="3"> The classification process can also help to deal with cases in which α is in conflict with ω - step 4(a)iv. If α were cut (a) square along a perpendicular axis, a conflict with ω - cut (a) square in half along the diagonal - would be recognized. Given the T-Box in Fig. 4, the classification process would result in α being a sister to ω: my algorithm would try to unify them, but this would not be possible, because the role fillers of location on α and ω cannot be unified, being along(perpendicularaxis) and along(diagonal) respectively. I haven't yet addressed the issue of which strategies to adopt in case such a conflict is detected.</Paragraph>
    <Paragraph position="4"> Another point left for future work is what to do when step 2 yields more than one simple plan.</Paragraph>
    <Paragraph position="5"> The knowledge representation system I am using is BACK (Peltason et al., 1989); the algorithm is being implemented in QUINTUS PROLOG.</Paragraph>
    <Paragraph position="6"> [Footnote 7, continued] refer both to input descriptions and to action types.</Paragraph>
    <Paragraph position="7"> [Footnote 8] The reader may find that the representation in Fig. 4 is not very perspicuous, as it mixes linguistic expressions, such as along(diagonal), with conceptual knowledge about entities. Actually, roles and concepts are expressed in terms of Conceptual Structures primitives, which provide a uniform way of representing knowledge apparently belonging to different types. However, a T-Box expressed in terms of Conceptual Structures becomes very complex, so in Fig. 4 I adopted a more readable representation.</Paragraph>
    <Paragraph position="8"> Input: the Conceptual Structures logical forms for α and β, the current plan graph, and the list of active nodes. 1. Add to A-Box individuals corresponding to the two logical forms. Set flag ACCOM if they don't exactly match known concepts.</Paragraph>
    <Paragraph position="9"> 2. Retrieve from the action library the simple plan(s) associated with β - generation relations in which β is the generatee, enablement relations in which β is the enablee.</Paragraph>
    <Paragraph position="10"> 3. If ACCOM is not set: (a) If there is a direct generation or enablement relation between α and β, augment plan graph with the structure derived from it, after calling compute-assumptions.</Paragraph>
    <Paragraph position="11"> (b) If there is no such direct relation, recursively look for possible connections between α and the components γi of sequences that either generate or enable β. Augment plan graph, after calling compute-assumptions.</Paragraph>
    <Paragraph position="12"> 4. If ACCOM is set: (a) If there is ω such that ω directly generates or enables β, check whether: i. ω is an ancestor of α: take α as the intended action.</Paragraph>
    <Paragraph position="13"> ii. ω is a descendant of α: take ω as the intended action. iii. If ω and α are not ancestors of each other, but they can be unified - all the information they provide is compatible, as in the case of cut square in half along diagonal and cut square carefully - then their unification ω ∪ α is the action to be executed.</Paragraph>
    <Paragraph position="14"> iv. If ω and α are not ancestors of each other, and provide conflicting information - such as cut square along diagonal and cut square along perpendicular axis - then signal failure. (b) If there is no such ω, look for possible connections between α and the components γi of sequences that either generate or enable β, as in step 3b. Given that α is not known to the system, apply the inferences described in 4a to α and γi.</Paragraph>
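    <Paragraph> The four cases of step 4(a) can be sketched as a single decision function. This is an illustration under assumptions, not the implementation in BACK and Prolog mentioned above: action descriptions are modelled as dictionaries from roles to fillers, subsumption is reduced to inclusion of role fillers, failure is signalled by returning None, and the names `subsumes`, `unify` and `step_4a` are hypothetical.

```python
def subsumes(general, specific):
    """general is an ancestor of specific if its role fillers recur in it."""
    return all(specific.get(role) == filler
               for role, filler in general.items())


def unify(a, b):
    """Merge two descriptions; return None on conflicting role fillers."""
    merged = dict(a)
    for role, filler in b.items():
        if merged.get(role, filler) != filler:
            return None
        merged[role] = filler
    return merged


def step_4a(alpha, omega):
    """Pick the action to execute, following cases i-iv of step 4(a)."""
    if subsumes(omega, alpha):   # i. omega is an ancestor of alpha
        return alpha
    if subsumes(alpha, omega):   # ii. omega is a descendant of alpha
        return omega
    return unify(alpha, omega)   # iii. unification; iv. None signals failure


omega = {"action": "cut", "object": "square",
         "extent": "in half", "location": "along(diagonal)"}
alpha = {"action": "cut", "object": "square", "extent": "in half"}

print(step_4a(alpha, omega) == omega)
```

For Ex. 2 the sketch selects the more specific ω, as in step 4(a)ii; a description with a different location filler would fall through to case iv.</Paragraph>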
  </Section>
</Paper>