File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/75/t75-2017_metho.xml
Size: 20,515 bytes
Last Modified: 2025-10-06 14:11:12
<?xml version="1.0" standalone="yes"?> <Paper uid="T75-2017"> <Title>A FORMALISM FOR RELATING LEXICAL AND PRAGMATIC INFORMATION: ITS RELEVANCE TO RECOGNITION AND GENERATION*</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> A FORMALISM FOR RELATING LEXICAL AND PRAGMATIC INFORMATION: ITS RELEVANCE TO RECOGNITION AND GENERATION* </SectionTitle> <Paragraph position="0"/> </Section> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> I. INTRODUCTION </SectionTitle> <Paragraph position="0"> In this paper we shall report on an initial attempt to relate the representation problems of four areas to one another through the use of a uniform formal structure. The four areas we have been concerned with are: (1) interpretation of events (2) initiation of actions (3) understanding language (4) using language Finding such a representation would be extremely useful and very suggestive even though it would not by itself constitute a solution to the whole problem.</Paragraph> <Paragraph position="1"> Clearly, (1) and (2) are &quot;pragmatic&quot; in nature and are not limited to natural language processing, while (3) and (4) may be viewed as special cases of (1) and (2) respectively. One of our main goals is to show how both pragmatic and semantic issues may be approached in a formal framework. We have chosen to study the area of &quot;speech acts&quot; (conversational activities like &quot;request,&quot; &quot;command,&quot; &quot;promise,&quot; ...) as this area is especially rich in interactions among the four areas.</Paragraph> <Paragraph position="2"> Our goals can be divided into two categories: operational and methodological. On the operational side, we want to implement an actual system which would &quot;recognize&quot; and &quot;perform&quot; speech acts and which would use and understand the verbs of &quot;saying.&quot; The recognition that a particular speech act has occurred is to be made on the basis of context and not solely on explicit markers like a performative verb or a question mark. We also want a symmetric system which could generate, in the context of reversed roles, anything it could understand. Initially we would be satisfied for the input and output to be in an artificial language which we felt to be adequate to represent the underlying structures of English sentences (1).</Paragraph> <Paragraph position="3"> On the methodological side, we have two primary desiderata: uniformity of representation, and generality in the procedural component. We do not wish to write an intricate procedure for each speech act. We want to represent the speech acts in a structure with useful formal properties. (We settled on the lattice.) *This work was partially supported by NSF</Paragraph> </Section> <Section position="3" start_page="0" end_page="80" type="metho"> <SectionTitle> GRANT SOC 72-05465A01. </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="80" type="sub_section"> <SectionTitle> **Department of Computer and Information </SectionTitle> <Paragraph position="0"> Science, The Moore School of Electrical Engineering, University of Pennsylvania, Philadelphia, 19174 (1) Our representations are compatible with the output of the parser currently being designed and implemented by Ralph Weischedel for the computation of presuppositions and entailments.
We want the &quot;state of the system&quot; to be a mathematically tractable object as well.</Paragraph> <Paragraph position="1"> The heart of the procedural component is to consist of straightforward (algebraic) operations and relations (LUB, GLB, ≤) which could be related to certain cognitive and linguistic phenomena.</Paragraph> <Paragraph position="2"> A system designed along these lines is being implemented in LISP.</Paragraph> <Paragraph position="3"> II. RELATED RESEARCH (2) This work cuts across several areas in linguistics, natural language processing, and artificial intelligence and is related to work done on &quot;lexical factorization&quot; by certain generative semanticists and others. Here, as there, the attempt was to decompose the meanings of various predicates into combinations of a small group of &quot;core predicates&quot; or &quot;primitives.&quot; However, whereas in general the decomposition was allowed to be expressed in any suitable form (trees, dependency networks, ...), we shall decompose into a slightly extended predicate calculus in order to exploit the underlying Boolean algebra and ultimately to construct our derived lattice.</Paragraph> <Paragraph position="4"> At this point, we should mention two related pieces of work. First, the notion of using lattices for &quot;recognizing&quot; or &quot;characterizing&quot; events is an extension of some ideas of Tidhar \[T74\] (see also \[BT75\]), who applied the principle to visual recognition of shapes. Also, Smaby's work \[S74\] on presupposition makes considerable use of lattice constructions, after Dana Scott, in a somewhat related spirit.</Paragraph> <Paragraph position="5"> III. OVERVIEW OF THE SYSTEM In Figure 1 we present a block diagram of the system.</Paragraph> <Paragraph position="7"> The block which stands for the procedural component is labeled CONTROL; all the rest are data structures. The SCHEMATA block contains the lattice whose points consist of the lexical decompositions (definitions) and certain other elements, while the LEXICON contains the non-definitional information. (2) A more detailed review of related research will be included in the final version of this paper. Some examples are \[B75\], \[F71\], \[JM75\], \[J74\], \[KP75\], \[Sch73\], \[Sc72\], \[St74\], \[W72\].</Paragraph> <Paragraph position="8"> The LEXICON and SCHEMATA remain fixed during the course of the conversation. The &quot;state&quot; or &quot;instantaneous description&quot; of the system is to be found in the BELIEFS and GOALS, which are constantly being updated as the conversation progresses.</Paragraph> <Paragraph position="9"> In order to avoid confusion, we should point out that in our discussion of the system, &quot;beliefs&quot; and &quot;goals&quot; are meant as technical terms to be defined entirely by their function in the system. These terms are not to be confused with their corresponding lexical items.
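To fix ideas, here is a minimal Lisp sketch of the four data structures of Figure 1. The system is being implemented in LISP, but the names and representations below are our own illustrative assumptions, not the actual code.

;;; Hypothetical representations: propositions are s-expressions,
;;; e.g. (MAN JOHN) or (NOT (MARRIED JOHN)).
(defvar *lexicon* (make-hash-table :test #'equal)) ; non-definitional entries -- fixed
(defvar *schemata* '())  ; the lattice of decompositions and other points -- fixed
(defvar *beliefs* '())   ; half of the instantaneous description -- updated
(defvar *goals* '())     ; the other half -- updated as the conversation progresses

(defun adopt-belief (proposition)
  "Called by CONTROL as the conversation progresses; the closure and
consistency conditions introduced below must then be re-established."
  (pushnew proposition *beliefs* :test #'equal))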
We shall have more to say about &quot;goals&quot; later, but for now we will concentrate on &quot;beliefs.&quot;</Paragraph> <Paragraph position="11"> At any given time, the system has as its &quot;beliefs&quot; a set of propositions in a predicate calculus slightly modified, primarily to allow for sentence embeddings.</Paragraph> <Paragraph position="12"> This set has the following properties: (1) closure -- if a proposition is in the belief set, then all its direct consequences (i.e., those following from the definitions of the lexical items) are also in the belief set.</Paragraph> <Paragraph position="13"> (2) consistency -- the Boolean product of the propositions in the belief set cannot be the element &quot;false.&quot;</Paragraph> <Paragraph position="14"> In order to briefly illustrate these restrictions, consider the definition: bachelor(x) => man(x) & -married(x), and the following sets: (1) {bachelor(John), man(John)} (2) {bachelor(John), -married(John), -man(John)} (3) {bachelor(John), man(John), -married(John)}</Paragraph> <Paragraph position="16"> Set (1) is not closed, and set (2) is not consistent. Set (3) is closed and consistent and is thus a valid belief set. Note that the direct consequence relation defines a partial order over the propositions. The addition of propositions which are direct consequences of a proposition containing a defined predicate we call EXPANSION. There is another operation which is something of an inverse of EXPANSION: given a valid set of beliefs, this operation augments the belief set with the least summarizing expression(s) having as consequences any two-or-more element subset of the original beliefs. This operation we call SYNTHESIS. For instance, given the set {man(John), -married(John)}, the performance of SYNTHESIS would yield the set {bachelor(John), man(John), -married(John)}.</Paragraph> <Paragraph position="17"> In this example, the original set corresponded exactly to the clauses of the definition, but in general this would not be the case; other beliefs might also be entailed by the added proposition(s), and these would also have to be added. (Closure and consistency must still be preserved.) The next section deals with how such operations can be defined and what their implications are for a flexible understanding system.</Paragraph> <Paragraph position="18"> IV. BOOLEAN ALGEBRAS AND LATTICES We begin by giving a brief exposition of Boolean algebras as representing information states, to be followed by an explanation of how, by constructing a lattice substructure, we can formalize the notion of matching a pattern on incomplete information. The lattice will supply an internal criterion for deciding when there is enough information for a match.</Paragraph> <Paragraph position="19"> Assume we are given a finite set of (primitive) predicates, each of known degree. Assume further that these predicates are to be applied to a finite set of constants. A predicate of degree n applied to n constants is an atomic proposition, and the negation symbol attached to an unnegated atomic proposition also yields an atomic proposition. We can think of all atomic sentences, their conjunctions and disjunctions, together with a &quot;greatest&quot; element * and a &quot;least&quot; element 0, as forming a Boolean algebra, Bool.
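Before developing the algebra, the two belief-set conditions and the EXPANSION operation just described can be made concrete. The following Lisp fragment is a minimal sketch under assumed representations: a definition table handling only the one-variable case, with propositions as s-expressions; the table, the variable convention, and all function names are our own hypothetical choices.

;;; Each entry pairs a defined predicate pattern with its direct consequences.
(defvar *definitions*
  '(((bachelor ?x) . ((man ?x) (not (married ?x))))))

(defun direct-consequences (prop)
  "Direct consequences of PROP, e.g. (BACHELOR JOHN) => ((MAN JOHN) (NOT (MARRIED JOHN)))."
  (let ((entry (assoc (first prop) *definitions* :key #'first)))
    (when entry
      (mapcar (lambda (c) (subst (second prop) '?x c)) (cdr entry)))))

(defun expansion (beliefs)
  "Close BELIEFS under direct consequences (the closure condition)."
  (let ((set (copy-list beliefs)) (grew t))
    (loop while grew
          do (setf grew nil)
             (dolist (p set)
               (dolist (c (direct-consequences p))
                 (unless (member c set :test #'equal)
                   (push c set)
                   (setf grew t)))))
    set))

(defun consistentp (beliefs)
  "No proposition and its negation may both be present; otherwise the
Boolean product of the belief set would be the element \"false\"."
  (notany (lambda (p) (member (list 'not p) beliefs :test #'equal))
          beliefs))

;; (expansion '((bachelor john)))
;;   => ((not (married john)) (man john) (bachelor john))       ; set (3), closed
;; (consistentp (expansion '((bachelor john) (not (man john)))))
;;   => NIL                                                     ; cf. set (2)

SYNTHESIS would run the same table in the opposite direction, adding a defined predicate once all of its clauses are believed. We return now to the algebra Bool.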
In this algebra every element (except * and 0) is written as a sum-of-products of atomic propositions.</Paragraph> <Paragraph position="20"> We define the &quot;less-than-or-equal&quot; relation (≤) as follows: (1) for all x in Bool, x ≤ *. (2) for all x in Bool, 0 ≤ x. (3) If x is a product term x1 x2 ... xn and y is a product term y1 y2 ... ym, then x ≤ y iff for every xi there is a yj such that xi is identical to yj (i.e., the literals of x are a subset of the literals of y). (4) If s is a sum-of-products term s1 + s2 + ... + sn and t is a sum-of-products term t1 + t2 + ... + tm, then s ≤ t iff for every tj there is an si such that si ≤ tj.</Paragraph> <Paragraph position="23"> Following Dana Scott \[Sc72\], we identify the meet (⊓) with disjunction of elements and the join (⊔) with conjunction. With this convention we get the interpretation that as we go &quot;upward&quot; in the structure we get elements containing more information. The maximal element, *, is &quot;overdetermined&quot; in the sense that it contains &quot;too much&quot; information; it is self-contradictory.</Paragraph> <Paragraph position="24"> Conversely, the lower elements in the structure contain less information, with the minimal element, 0, containing no information at all. These notions are presented graphically in Figure 2.</Paragraph> <Paragraph position="26"/> </Section> </Section> <Section position="4" start_page="80" end_page="81" type="metho"> <SectionTitle> Figure 2 [labels: CONJUNCTION OF CONDITIONS, MORE INFORMATION] </SectionTitle> <Paragraph position="0"> Bool is an algebra of instantiated propositions, and as such it would not be of direct use in &quot;pattern matching.&quot; By adopting certain conventions having to do with variables and their substitutions, we can define T, the Boolean algebra of predicates (or uninstantiated logical forms), which, of course, would become Bool if constants were to replace the variables. It is from this structure T that we construct the lattice of schemata.</Paragraph> <Paragraph position="1"> The construction of this lattice proceeds as follows. We select from the Boolean algebra T those points which correspond to combinations of conditions which we wish to have serve as &quot;paradigms&quot; or &quot;schemata.&quot; The choice of these points has to do with the empirical question of what clusters of properties and relations are of cognitive significance, which are EXPANSIONs of lexical items, and so on.</Paragraph> <Paragraph position="2"> Any arbitrary set of points drawn from T can be completed to a lattice L by adding additional points from T such that for any two points x1 and x2 in L, x1 ⊔ x2 will also be in L. While this is the general procedure, we have been working primarily with lattices that have no elements -- other than 0 -- that are strictly less than the elements corresponding to atomic predicates.</Paragraph> <Paragraph position="3"> We write A(x) if x is an element of T and x corresponds to an atomic predicate. We write A'(x) if there exists a y such that A(y) and y ≤ x. That is, A'(x) holds if x is &quot;at least atomic.&quot;</Paragraph> <Paragraph position="4"> The ≤ relation is inherited directly from T, as is the meet operation ⊓ (3).</Paragraph> <Paragraph position="5"> However, the join operation in L (written ⊔_L) differs in that, intuitively, one may get out more than was put in. That is, in T, if t1 and t2 are product terms and t1 ⊔ t2 = t, then for any element t' such that A(t'), if t' ≤ t, then either t' ≤ t1 or t' ≤ t2. However, in L this is not always the case.
For example, in Figure 3, where S is the join of A and B in L, we have C ≤ S and D ≤ S, while C and D are not comparable to either A or B. Thus, while in T we could only move our information state &quot;forward,&quot; in L we can move forward and reasonably extend our information beyond what was strictly given. (3) In the case of a lattice in which, for all elements x other than 0, A(x) is true, the following modification is necessary: If</Paragraph> <Paragraph position="8"> Figure 3 Intuitively speaking, we have absorbed the non-paradigmatic information states to paradigm points; ⊔_L corresponds to &quot;jumping to a conclusion&quot; -- but only to the least conclusion which is needed to explain the givens. The criteria for how much to extend are in the structure itself.</Paragraph> <Paragraph position="9"> The actual computation of x ⊔_L y is not difficult, given that we have ≤ and ⊔ from T. One method follows from the observation that the least upper bound is the greatest lower bound of all upper bounds, and that x ⊔ y ≤ x ⊔_L y. By this method one first computes t, the least upper bound in T. (This is straightforward, as T is a Boolean algebra.) Set r to *. Then for each element x of L for which t ≤ x, set r to r ⊓ x. When we exhaust all such x, the value of r will be the least upper bound. Of course, other more efficient methods for computing the l.u.b. also exist.</Paragraph> <Paragraph position="10"> The mechanism for event interpretation operates in the following manner. The least upper bound is taken of the points in the lattice which, under variable substitution, correspond to the propositions in the belief set and propositions in some input set. Any matched schemata (and their consequences) are added to the belief set. If the least upper bound taken in this way turns out to be *, one of two things has occurred. Either the belief set contained a proposition which contradicted an input proposition (the belief set, one should recall, could never be self-contradictory), or there is no single schema which encompasses all the propositional information. In the former case, a control decision must be made on how to integrate the new material into the belief set. In the latter case, we use the operation &quot;generalized LUB,&quot; which returns a set of points, each of which is a l.u.b. for a subset of the propositions.</Paragraph> <Paragraph position="12"> V. LINGUISTIC RELEVANCE As was noted before, an attempt was made to correlate the schemata with lexical decompositions of English words, especially the verbs of &quot;saying.&quot; It can be seen that definitional direct consequence (a type of entailment) corresponds precisely to the ≤ relation. That is, the fact that a sentence using the defined predicate bachelor has man as its direct consequence implies that the point in L into which man is mapped is less-than-or-equal-to (≤) the point into which bachelor is mapped. If we label points in the lattice with items from the lexicon, we get structures similar to the one shown in Figure 4.
Detailed information about the arguments of each predicate has been left out for the sake of readability.</Paragraph> <Paragraph position="14"> [Figure 4: a lattice labeled with lexical items; legible labels include REQUEST, SAY, KNOW]</Paragraph> </Section> <Section position="5" start_page="81" end_page="82" type="metho"> <Paragraph position="0"> The reason for embedding lexical items in the lattice is that the l.u.b. operation can be used to choose appropriate words to describe a situation (given as a &quot;belief set&quot;). That is, we want the act of word selection to be identified with an operation that is naturally suggested by the formal structure. The selection of groups of words is identified with the &quot;generalized LUB.&quot; One interesting challenge emanating from this approach was to find a way in which well-known semantic properties of lexical items, such as induced presuppositions, could be integrated into the framework. For this purpose we introduced a new connective, @, whose behavior is illustrated in Figure 5.</Paragraph> <Paragraph position="2"> Figure 5 If a is taken to be the presupposition and b the assertion, then the two negation rewritings correspond to the usual understanding of presupposition. However, both can be expressed as points in the Boolean algebra. Furthermore, if S is a sentence rewritten as a @ b, then neg(S) ≠ not(S) (since ~a + ~b ≠ a & ~b). Also, if A(a) and A(b) (i.e., if a and b are atomic), then S and neg(S) are higher in the lattice than the atomic sentences, but not(S) is lower.</Paragraph> <Paragraph position="3"> Recalling that moving &quot;upward&quot; in the structure is related to more specific information, some light is cast on the function of presupposition as allowing the general direction of information to be preserved even under negation of a sentence containing a complex predicate. If there were no presuppositional convention, we would move downward in information, since we know only that some component in the complex is false. With presuppositions, however, we know exactly which component is to be negated, so we keep the conjunction of clauses and hence move &quot;upward.&quot; VI. THE INITIATION OF EVENTS Under the appropriate interpretation of the schemata we can represent how goals are set, changed, and accomplished. The essential notion is that a schema can represent a conjunction of pre-conditions, actions, and post-conditions. In this circumstance, if the &quot;belief set&quot; and the &quot;goal set&quot; satisfy enough pre- and post-conditions respectively for a particular schema to be matched by the l.u.b. operation, then the action may be taken. Of course, in the case of complete information (perfect match) the use of the schemata reduces to conditional expressions and as such is sufficient to represent any sequence of actions -- or to perform any computation. What is more interesting, however, is how the lattice provides a model of &quot;intelligent&quot; or &quot;appropriate&quot; choice of actions in the case of incomplete information. In this context, too, the &quot;generalized LUB&quot; plays a role, namely that of selecting several compatible actions to be performed.</Paragraph> </Section>
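The l.u.b. computation of Section IV, and its use for matching schemata, can likewise be sketched. In the fragment below -- again a hypothetical Lisp rendering, not the authors' code -- a lattice point is a list of literals, so that ≤ is literal-subset (clause (3) of Section IV), the join in T is set union, and the meet is set intersection; the toy lattice and its extra ADULT clause are invented purely for illustration.

;;; x <= y iff the literals of x are a subset of the literals of y.
(defun leq (x y) (subsetp x y :test #'equal))

(defun lub-l (x y lattice)
  "Join of X and Y in the schema lattice L, per Section IV: compute t,
the join in T (set union); then take the meet (intersection) of all
points of L above t.  :TOP stands for the overdetermined element *."
  (let* ((tee (union x y :test #'equal))
         (uppers (remove-if-not (lambda (p) (leq tee p)) lattice)))
    (if (null uppers)
        :top   ; contradiction, or no single schema covers the givens
        (reduce (lambda (a b) (intersection a b :test #'equal))
                uppers))))

;; A toy lattice: two atomic points and one schema above them.  The
;; schema carries an invented third clause, ADULT, so that the join in
;; L yields more information than was put in, as described in Section IV.
(defvar *toy-lattice*
  '(((man john))
    ((not (married john)))
    ((man john) (not (married john)) (adult john))))

;; Jumping to the least conclusion needed to explain the givens:
;; (lub-l '((man john)) '((not (married john))) *toy-lattice*)
;;   => ((man john) (not (married john)) (adult john))

When LUB-L yields :TOP because no single schema covers the propositions, a generalized LUB in this style would partition the literals and return one least upper bound per compatible subset.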
What has not been shown here is the control structure which sequences the operations of interpretation and initiation of events (including linguistic events). A theoretically satisfying strategy has not yet been settled upon, though we have been exploring the implications of several candidate strategies. These strategies, together with the formal operations described above, are being implemented in LISP, and preliminary results suggest that such a lattice-structured system is feasible and very promising.</Paragraph> </Section> </Paper>