<?xml version="1.0" standalone="yes"?> <Paper uid="W98-1412"> <Title>Abductive Reasoning for Syntactic Realization</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> ABDUCTIVE REASONING FOR SYNTACTIC REALIZATION* </SectionTitle> <Paragraph position="0"> klabunde@novelli.gs.uni-heidelberg.de</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> Abductive reasoning is used in a bidirectional framework for syntactic realization and semantic interpretation. The use of the framework is illustrated in a case study of sentence generation, where different syntactic forms are generated depending on the status of discourse information.</Paragraph> <Paragraph position="1"> Examples are given involving three different syntactic constructions in German root clauses.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 1 Pragmatics in Natural Language Generation </SectionTitle> <Paragraph position="0"> The computational treatment of pragmatics in natural language generation is often, directly or indirectly, oriented around the Gricean maxims \[Grice 75\]. Their effects emerge from the pragmatic model of the generation system so that the generated texts satisfy these maxims. The texts should be a true characterization of a state of affairs, they should be as informative as possible, relevant, and perspicuous. While the first three maxims are related to what is said, the last maxim is related to how it is said. The category of perspicuity principles includes constraints on avoiding obscurity and ambiguity, or being brief and orderly.</Paragraph> <Paragraph position="1"> It is anything but clear how these principles should be interpreted precisely.
Several attempts have been made to remedy this in computational work on generating texts that best satisfy these maxims, especially with respect to the generation of referring expressions (e.g. \[Dale et al. 95\]).</Paragraph> <Paragraph position="2"> However, there is more to pragmatics than satisfying Gricean maxims. In particular, the category of perspicuity principles does not usually cover the important fact that texts are tailored to a specific addressee, not only in content, i.e., with respect to her or his informational needs, but also in the linguistic form, i.e., word order, syntactic constructions, the choice of lexical items, and eventually prosodic information. This tailoring of the linguistic form to the listener is termed &quot;information structuring&quot;. In generating texts, information structuring requires, among other things, the use of some listener model, which may include information about the listener's knowledge, goals, properties, etc.</Paragraph> <Paragraph position="3"> Linguistic approaches to describing the principles of information structuring have sometimes characterized information structure as an instruction to the listener about how to construct a model of the communicated state of affairs \[Prince 81\]. In AI and Computational Linguistics, tailoring the message to the user very often comprises solely content planning, which only indirectly determines the linguistic output.</Paragraph> <Paragraph position="4"> For example, systems tailor the information &quot;density&quot; to the user (e.g. \[Paris 93\]), or they drive the dialogue depending on an estimation of what the user might be interested in (e.g. \[Jameson et al. 94\]). Realizing texts
by determining the information structure of the respective sentences, which again is a reflex of addressee orientation, has not yet received its due attention.</Paragraph> <Paragraph position="5"> *The authors would like to thank Bob Kasper, Nathan Vaillette, Shravan Vasishth, and two anonymous referees for helpful comments and suggestions. All remaining mistakes are, of course, our own.</Paragraph> <Paragraph position="7"/> </Section> <Section position="4" start_page="0" end_page="110" type="metho"> <SectionTitle> 2 The Topic/Comment Structure in Information Structuring </SectionTitle> <Paragraph position="0"> The notion of information structure comprises at least two separate notions of how the information of a sentence may be structured, viz. the topic/comment structure and the focus/background structure \[Vallduvi 92, Lambrecht 94\].1 In order to motivate these structuring mechanisms, consider the following simple example.</Paragraph> <Paragraph position="1"> Suppose the purpose of a generation system is to describe a spatial scenario. One of the sentences might be (1) Behind the town hall is a BAKERY.</Paragraph> <Paragraph position="2"> with &quot;bakery&quot; the prosodically most prominent constituent (the focus exponent). In this sentence, the prepositional phrase &quot;behind the town hall&quot; functions as topic and the noun phrase &quot;a bakery&quot; is in focus. We will ignore aspects of focus and its role in language generation, especially since selecting the focus exponent is better understood as being part of utterance planning, and limit our attention to the topic/comment structure only. The topic provides familiar discourse referents whose properties are further illuminated by the sentence; the relation between these discourse referents and the sentential predication is also referred to as an aboutness-relation.
Many languages possess special topicalization constructions or morphological markers to single out the topic in a sentence. In German (and probably English as well), referring phrases provide topic referents, and the clause-initial position is their preferred position. Thus clause-initial positioning is the most important topic-relevant feature in generation. 2 The same propositional content expressed by (1) can be realized with different information structures and, therefore, different sentence forms, as the following English examples demonstrate: (2) A bakery is behind the town hall.</Paragraph> <Paragraph position="3"> (3) Behind the town hall, there is a bakery.</Paragraph> <Paragraph position="4"> (4) As for the town hall, behind it is a bakery.</Paragraph> <Paragraph position="5"> Discourse referents functioning as topics must be identifiable for the listener. This is the reason why topics are usually packaged as definite noun phrases, or as prepositional phrases that contain definite noun phrases. Topic candidates will be selected from the set of discourse referents that the listener knows according to a topic acceptance scale. \[Lambrecht 94\] proposes the following scale:</Paragraph> <Paragraph position="6"> (5) active > accessible (textually, situationally, or inferentially) > unused > brand-new anchored > brand-new unanchored Active referents are those that are currently lit up; they are in the center of attention. They are the most acceptable topics because the listener's mental effort needed for processing the respective sentence is minimal as compared with the effort needed to identify and anchor an unfamiliar or inactive topic referent.
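The acceptance scale in (5), collapsed into the three regions used in the remainder of the paper, can be pictured as a simple ranking function. The following Python sketch is purely illustrative and not part of the implemented system; all names are our own:

```python
# Illustrative ranking of topic candidates along Lambrecht's acceptance
# scale, collapsed into the three regions adopted in the text:
# active referents, accessible referents, and inaccessible referents.
ACCEPTANCE = {
    "active": 0,          # currently in the center of attention
    "accessible": 1,      # textually, situationally, or inferentially
    "inaccessible": 2,    # unused or brand-new referents
}

def best_topic(candidates):
    """Pick the referent highest on the acceptance scale.

    candidates: list of (referent, status) pairs,
    where status is a key of ACCEPTANCE.
    """
    return min(candidates, key=lambda c: ACCEPTANCE[c[1]])[0]

print(best_topic([("lamp", "accessible"),
                  ("showcase", "inaccessible"),
                  ("couch", "active")]))
```

With an active referent among the candidates, the function returns it; otherwise the highest accessible one wins, mirroring the preference stated above.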
We consider the candidates below the accessible referents to be inappropriate as topics in most instances, and we limit our attention to a scale with three regions: active referents, accessible referents, and inaccessible referents.</Paragraph> <Paragraph position="7"> To summarize, the first task in generating texts with sentences with appropriate topic/comment structures is to determine for each sentence the topic discourse referent. This referent should be identifiable for the listener and as high on the topic acceptance scale as possible. The phrase expressing the topic should be placed in clause-initial position. However, these are only guidelines, not fixed rules. Hence, we need a mechanism to handle this kind of uncertainty. 1 Depending on one's theoretical background and/or affiliation with different schools, terminology differs considerably. 2 \[Altmann 81, 150\] gives some counterexamples to this default. These examples are prosodically marked, however.</Paragraph> <Paragraph position="8"> This is of course not the whole story of topic-hood. In addition to selecting topic referents, we have to solve the problem of how one and the same topic/comment structure can be realized by different syntactic structures. German examples resembling the previous three ones are (6)-(8): In the first clause the topic is realized as the subject in clause-initial position. The second clause exhibits a left dislocation for the topic, and the third one uses a so-called hanging topic.</Paragraph> <Paragraph position="9"> We assume that the functions of these three syntactic forms are more or less identical for German and English. All three examples express the same propositional content, viz. the localization of a uniquely identifiable showcase with respect to a uniquely identifiable lamp.
Furthermore, all three examples exhibit the same topic/comment structure: &quot;the showcase&quot; functions as topic, i.e., the anchor for the proposition, and the rest of the clause comments on certain aspects of the showcase. However, these three forms are not mutually interchangeable in each imaginable context, because they invite different pragmatic inferences. The subject realization is neutral with respect to topic accessibility. There is a strong correlation between the grammatical function of subject and the information structural notion of topic. The subject is the unmarked topic.</Paragraph> <Paragraph position="10"> Left dislocation constructions can indicate a topic shift because the syntactically autonomous position of the detached noun phrase signals a change in the status of its discourse referent from being inactive to active \[Lambrecht 94\]. Additionally, left dislocations must satisfy a presupposition condition, namely to support the existence of another individual not having the property expressed by the matrix clause \[Wiltschko 95\]. The discourse referent is in some way related to a previously established set which the referent is a member of. This resembles the presuppositions restrictive relative clauses establish. As for a hanging topic, it also indicates a topic shift. It introduces a new topic of the discourse from a set of discourse referents that have already been established in the discourse. The common property of shifting the discourse topic implies that hanging topics and left dislocations are not mutually exclusive. A distinction on pragmatic grounds is complicated by the fact that the various set phrases usable for the hanging topic can have different discourse functions and that left dislocations can be interpreted as special hanging topics.
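The presupposition condition on left dislocations just stated can be rendered operationally. The following hypothetical Python sketch (names are our own; Section 3.4 expresses the same condition in Prolog, without reference to negative properties) checks that some salient object other than the dislocated referent is located in a different region:

```python
# Hypothetical check for the presupposition condition on left
# dislocation: some salient object other than the dislocated referent
# must lack the property predicated by the clause. For the locative
# verbs treated in this paper, the property is "located in region".
# Objects are modeled as (name, location) pairs.
def licenses_left_dislocation(referent, region, salient_objects):
    return any(name != referent and loc != region
               for name, loc in salient_objects)

# The couch c, located in some other region q, licenses dislocating
# the showcase s, which is located in region r.
ok = licenses_left_dislocation("s", "r", [("c", "q")])
```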
However, the main difference between left dislocations and hanging topics with the set phrase was das X betrifft (&quot;as for the X&quot;) seems to be: left dislocations must satisfy the presupposition condition and they establish a topic shift by means of changing the status of a discourse referent, whereas hanging topics establish a topic shift by means of selecting a discourse referent from a previously established set of referents. 3 Despite their overlapping discourse functions, we confine ourselves to the distinctive pragmatic properties of both constructions for their generation.</Paragraph> <Paragraph position="11"> Hence, the second problem that needs to be solved is to correlate the syntactic form with the status of discourse referents with respect to their being active or accessible, as well as with other discourse information and factual information pertaining to the presupposition conditions. How can we incorporate</Paragraph> <Paragraph position="13"> this informal characterization of topic, topic acceptability, and syntactic constructions into a unified and formally precise mechanism for a natural language generation system? We propose an abductive setting in the spirit of \[Hobbs et al. 93\] as a framework for integrating the diverse knowledge sources involved in the generation and interpretation of sentential information structure. The basic idea is to view generating a single proposition as finding the best proof for why a sentence and its information structure is congruent with the listener model. In the process of finding this proof, the sentence is generated by incrementally instantiating unbound variables.</Paragraph> <Paragraph position="14"> Our basic scenario is the generation of spatial descriptions. The mechanism for content planning is not the subject of this paper (cf. \[Jansche et al. 96, Meyer-Klabunde 96, Porzel et al. in press\]). For spatial
descriptions, content planning comprises for each proposition the selection of a reference object from the set of objects, the selection of a primary object, the selection of a point of view, and the computation of a spatial relation between both objects depending on the chosen point of view. For present purposes we assume that the propositional content of a sentence has already been established. What remains to be done is to construct a pragmatically appropriate sentence that conveys the new and informative part of this propositional information to the listener. It is for this syntactic/pragmatic realization process that we use the abductive framework. Ultimately, we aim to incorporate the abductive reasoning mechanism directly into the content planner so as to achieve a uniform framework.</Paragraph> </Section> <Section position="5" start_page="110" end_page="114" type="metho"> <SectionTitle> 3 Generation by Abduction </SectionTitle> <Paragraph position="0"> Abductive reasoning is reasoning about the best explanation for a given observation. To make precise what counts as a good explanation, one introduces a preference criterion by which alternative explanations can be compared. A preferred explanation for an observation might be the least specific one, the most specific one, the one with the lowest proof costs, etc. Abductive explanation is classically characterized as follows (cf. \[Mayer et al. 96\]): a knowledge base K, the usual consequence relation ⊨, and an observation E to be explained, such that K ⊭ E, are given. A statement H is taken as the best explanation of E in K iff: 1. K ∪ {H} ⊨ E; and 2. H is &quot;better&quot; than any other statement in the set {H' | K ∪ {H'} ⊨ E}, according to the preference criterion.</Paragraph> <Paragraph position="1"> We use a generalized version of what \[Stickel 90, 236\] calls predicate specific abduction, where only elements from a distinguished set of literals may be assumed.
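As a rough illustration of abduction restricted to a declared set of assumables, consider the following propositional Python sketch. The toy knowledge base and all names are our own; the actual system is a Prolog meta-interpreter over first-order literals:

```python
# A minimal propositional sketch of abduction with a declared set of
# assumables: a goal is proved from Horn rules and facts, and a literal
# may be assumed only if it belongs to the assumable set. Toy data;
# accessible_r stands in for accessible(r), etc.
FACTS = {"accessible_l"}
RULES = {"accessible_r": ["accessible_l"]}   # accessible(r) :- accessible(l).

def abduce(goal, assumable, proved=None, assumed=None):
    proved = [] if proved is None else proved
    assumed = [] if assumed is None else assumed
    if goal in FACTS:
        proved.append(goal)
    elif goal in RULES:
        proved.append(goal)
        for sub in RULES[goal]:
            abduce(sub, assumable, proved, assumed)
    elif goal in assumable:
        assumed.append(goal)          # abductive step
    else:
        raise ValueError("no proof for " + goal)
    return proved, assumed

pr, As = abduce("accessible_r", {"showcase_s"})
```

Here accessible_r is proved outright, so nothing needs to be assumed; a goal that is neither a fact nor a rule head succeeds only by consuming an assumable.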
What counts as the best explanation will be based on the (preferably minimal) number of assumptions made.</Paragraph> <Paragraph position="2"> Abduction has been used in natural language processing for interpretation tasks such as metonymy resolution, understanding vague expressions, or plan recognition. Recently, abductive reasoning has also been proposed for use in generation, partly for planning \[Lascarides et al. 92, Thomason et al. 96\], and as a framework for both the interpretation and the generation of discourse \[Thomason et al. 97\]. The basic idea behind these approaches is to find the best way to obtain a communicative goal state by modifying the conversational record, which roughly corresponds to our listener model, with applicable operators. The plan is the set of hypotheses discovered by an abductive proof of the proposition that the goal state has been achieved.</Paragraph> <Paragraph position="3"> What remains open in these approaches is to make precise the relation between the planned propositional content of an utterance and an appropriate sentence form. Only very simple example sentences</Paragraph> <Paragraph position="5"> could be generated because the local pragmatics of the sentence form does not play a role in the previous approaches. We are bridging this pragmatic gap between content planning and surface realization by abductive mechanisms.</Paragraph> <Section position="1" start_page="111" end_page="112" type="sub_section"> <SectionTitle> 3.1 The Abductive Component </SectionTitle> <Paragraph position="0"> For our purposes it is helpful to view abductive proofs as essentially relational. An abductive proof determines the relation between a knowledge base, an observation, a specification of what assumptions can be made, and proved and assumed literals that jointly provide an explanation for the initial observation.
The prototypes we have implemented in Prolog make this relation available explicitly, and great care was taken to ensure that queries such as (9), where not all arguments are instantiated, are handled correctly by generating a manageable subset of all possible solutions. (9) ?- abduce(Goal, Assumable, Proved, Assumed).</Paragraph> <Paragraph position="1"> In the above query, Goal is the observation to be proved by the abductive meta-interpreter, Assumable is a set of literals that may be assumed, and Proved and Assumed are multisets of literals that were used or assumed, respectively, during the abductive proof of Goal. Interpretation mode corresponds to queries where the goal is instantiated and everything can be assumed in principle, as in (10); during generation the meta-interpreter is invoked with the goal at least partially uninstantiated, while the set of assumable literals is specified, as in (11). (10) ?- abduce(sentence(\[die,vitrine,steht,rechts,von,der,lampe\]), _, Pr, As).</Paragraph> <Paragraph position="2"> (11) ?- abduce(sentence(S), \[showcase(s),lamp(l),loc(s,r),right_of(r,l)\], Pr, As).</Paragraph> <Paragraph position="3"> From the fact that queries like (11) are accepted it is clear that the abduction scheme we use is somewhat more general than predicate specific abduction: we supply information as to what literals may be assumed, whereas predicate specific abduction would only specify the functors and arities of those literals. It is well worth noting that on our approach generation is not simply the inverse of interpretation. If that were the case, one would call the abductive meta-interpreter with the goal instantiated, deriving the assumed literals during interpretation, while for generation the opposite instantiation pattern would be used.
But for the latter case this amounts to requiring that all literals must be assumed in the proof, which is clearly too strong since some of them might be derivable from the knowledge base. Instead we only specify which literals may be assumed, leaving open the possibility that some of them are provable from the knowledge base.</Paragraph> <Paragraph position="4"> Also note that since we do not use weighted abduction, the problem of assigning different assumption costs for generation and interpretation (cf. \[Thomason et al. 97\]) is avoided. On the other hand, what should we use as a preference criterion? A sequence of several criteria is used. First, proofs are preferred for the number of provable literals used, the more the better. In the cases we consider, there seems to be a loose correspondence between this criterion and the Gricean maxims of relevance and quality. Second, proofs are preferred compared to other proofs if they involve fewer assumptions. The number of assumptions made is determined by the cardinality of the set that is the reduction of the multiset found during an abductive proof. Third, everything else being equal, we prefer proofs with the highest amount of assumption re-use. This is determined by the difference between the cardinalities of the multiset of assumptions and of its corresponding set.
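The three criteria combine lexicographically, in exactly the sequence just described. The sketch below is illustrative Python with our own names, not the implementation:

```python
# The three preference criteria as a lexicographic sort key over
# candidate proofs: (i) more proved literals, (ii) fewer distinct
# assumptions, (iii) more assumption re-use. A proof is modeled as a
# (proved, assumed) pair of lists standing in for multisets.
def preference_key(proof):
    proved, assumed = proof
    distinct = set(assumed)
    reuse = len(assumed) - len(distinct)   # factored assumptions
    # Negate the "bigger is better" components so that min() picks
    # the preferred proof.
    return (-len(proved), len(distinct), -reuse)

proofs = [(["a", "b"], ["h", "h"]),   # 2 proved, 1 assumption, re-used
          (["a", "b"], ["h", "g"]),   # 2 proved, 2 distinct assumptions
          (["a"], ["h"])]             # only 1 proved literal
best = min(proofs, key=preference_key)
```

The first candidate wins: it ties on proved literals, needs only one distinct assumption, and re-uses it, which is the ordering the example proof in Section 3.5 relies on.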
The relevant idea, that an assumption becomes more plausible if it is used to explain more than one thing, is essentially the same as the one behind the factoring rule of \[Stickel 90\].</Paragraph> <Paragraph position="6"> Now we are in a position to consider some examples involving the interaction of discourse pragmatics and syntax in German root clauses.</Paragraph> </Section> <Section position="2" start_page="112" end_page="112" type="sub_section"> <SectionTitle> 3.2 Generating Phrases in Initial Position </SectionTitle> <Paragraph position="0"> In a language like German with relatively free word order, any argument of the verbal head of a sentence may appear first, depending on the relevance for the discourse. For the spatial scenarios we consider, there is almost always a choice between several noun phrases or prepositional phrases that can be arranged in almost any order. As seen before, elements referring to familiar entities usually precede phrases denoting things not mentioned before.</Paragraph> <Paragraph position="1"> Consider the case of locative verbs such as stehen ('to stand'), sich befinden ('to be located'), etc. We use the conventional Prolog translations of extended phrase structure rules to generate sentences headed by these verbs:</Paragraph> <Paragraph position="3"> np(syn(nom,_), X, P0,P1), % a nominative NP with discourse referent X loc_vp(X, R, P1,P2), % verb locating X inside the region R pp(R, P2,P). % a PP denoting a region R</Paragraph> <Paragraph position="5"> pp(R, P2,P).</Paragraph> <Paragraph position="6"> Suppose we want to generate the sentence Rechts von der Lampe steht eine Vitrine, a variation of (6). The propositional content of this sentence, lamp(l), right_of(r,l), loc(s,r), and showcase(s), must be assumable, and accessible(l) must be derivable from the listener model. Backward-chaining on v2_sentence(S, \[\]) is not possible using rule (12) since accessible(s) is not provable and cannot be assumed.
Rule (13) is applicable, but so is the weaker (14). Proofs involving these two rules will be equivalent except for the presence or absence of sub-proofs of accessible(r), which is derivable from accessible(l). But since proofs using more proved literals are preferred, the best abductive proof will result in a sentence with an initial definite PP preceding both the verb and the indefinite subject NP.</Paragraph> </Section> <Section position="3" start_page="112" end_page="113" type="sub_section"> <SectionTitle> 3.3 Generating Hanging Topics </SectionTitle> <Paragraph position="0"> To model the &quot;topic shift&quot; signaled by hanging topics, we need some way to represent the currently active discourse referent. This is achieved by introducing two predicates, active(X) and activate(X), which test whether a given discourse referent is active or declare a discourse referent as active, respectively. For reasons of simplicity, we present these predicates as though they depend on a state external to the rules; in the implemented prototypes, each predicate is actually equipped with two additional variables that are used to drag along, test, and update the discourse state, in order to ensure a simple declarative semantics. In any proof, the literal activate(X) cannot be proved and has to be assumed, whereas active(X) must be</Paragraph> <Paragraph position="2"> resolved exactly once with the closest matching literal of the form activate(X). Thus the active referent is identified with the last activated one.</Paragraph> <Paragraph position="3"> The rule for a sentence with a hanging topic can be seen in (15).
Here it is not sufficient that the discourse referent associated with the noun phrase be inferentially accessible; a stronger condition is imposed: the discourse referent must be taken from a set of thematic referents presumably established in a superordinate planning stage.</Paragraph> <Paragraph position="5"> Since a sentence with a hanging topic is used to re-activate an inactive discourse referent, and since an NP may be realized as a pronoun if its discourse referent is active, sentences of this type usually contain a pronoun, rather than a full NP, that refers back to the hanging topic, as in (8) above.</Paragraph> </Section> <Section position="4" start_page="113" end_page="114" type="sub_section"> <SectionTitle> 3.4 Generating Left Dislocations </SectionTitle> <Paragraph position="0"> Left dislocation constructions involve a semantics beyond the first-order theories used so far. This construction type presupposes that some salient object other than the discourse referent of the dislocated constituent lacks the property predicated by the sentence. Since we are dealing with highly specific rules for sentences with locative verbs, it is possible to express these conditions without reference to negative properties. All we have to do is to find some salient object distinct from the discourse referent of the dislocated NP, and a region where it is located distinct from the region in which the head verb locates the object denoted by the NP. In addition to this, the familiar discourse referent of the dislocated NP is made active.
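The active/activate bookkeeping introduced in Section 3.3, and re-used here for left dislocations, amounts to a small state-threading discipline: activate(X) records a referent, and active(X) resolves against the most recently activated one. A minimal sketch, assuming the discourse state is an explicit list (the prototypes instead thread two extra variables through each Prolog predicate):

```python
# Sketch of the discourse-state threading behind active/activate:
# activating a referent pushes it onto the state, and the active
# referent is always the last one activated.
def activate(state, x):
    return state + [x]        # new state with x activated

def active(state):
    if not state:
        raise ValueError("no referent has been activated")
    return state[-1]          # the last activated referent is active

s0 = []
s1 = activate(s0, "c")        # the couch is under discussion
s2 = activate(s1, "l")        # a hanging topic re-activates the lamp
```

After the second activation, active yields the lamp, which is exactly the "topic shift" effect the hanging-topic and left-dislocation rules exploit.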
The resulting rule is displayed in (16):</Paragraph> <Paragraph position="2"> activate(X), pron(syn(nom,Gender), P1,P2), loc_vp(X, R1, P2,P3), pp(R1, P3,P).</Paragraph> </Section> <Section position="5" start_page="114" end_page="114" type="sub_section"> <SectionTitle> 3.5 An Example Proof </SectionTitle> <Paragraph position="0"> Finally, we consider in some detail an example proof that illustrates several of the techniques used in generating the syntactic forms discussed previously. The sentence we want to derive should express that a showcase is located to the right of the lamp; additionally, we know that the immediately preceding discourse was about a different object, and some time ago we had explicitly mentioned the lamp.</Paragraph> <Paragraph position="1"> (17) facts: active(c), couch(c), thematic_referent(l), lamp(l), etc.</Paragraph> <Paragraph position="2"> assumable: showcase(s), lamp(l), loc(s,r), right_of(r,l), activate(_) to prove: sentence(S, \[\]) sentence(S, \[\]) active(A) % resolution with fact, A bound to c thematic_referent(Y) % resolution with fact, Y bound to l distinct_objects(c, l) % provable from knowledge base 'C'(S, was, P1) % provable from knowledge base % S bound to \[was|P1\]</Paragraph> <Paragraph position="4"> thematic_referent(l) % resolution with fact 'C'(P1, die, P2) % provable from knowledge base % P1 bound to \[die|P2\] n(syn(acc,fem), l, P2,P3) lamp(l) % resolution with fact 'C'(P2, lampe, P3) % provable from knowledge base % P2 bound to \[lampe|P3\] 'C'(P3, betrifft, P4) % provable from knowledge base % P3 bound to \[betrifft|P4\] activate(l) % can only be assumed</Paragraph> <Paragraph position="6"> thematic_referent(l) % resolution with a fact pp(r, P4,P6).</Paragraph> <Paragraph position="7"> active(W) activate(W) % factoring with a previous assumption, W bound to l right_of(r,l) %
factoring with a previous assumption 'C'(P4, rechts, P5) % provable from knowledge base % P4 bound to \[rechts|P5\] 'C'(P5, davon, P6) % provable from knowledge base % P5 bound to \[davon|P6\] loc_vp(X, r, P6,P7) % R bound to r 'C'(P6, steht, P7) % provable from knowledge base % P6 bound to \[steht|P7\] loc(X, r) % can only be assumed, X bound to s np(syn(nom,_), s, P7, \[\]) det(syn(nom,Gender), P7,P8) % Gender bound to fem 'C'(P7, eine, P8) % provable from knowledge base % P7 bound to \[eine|P8\] n(syn(nom,fem), s, P8,\[\]) showcase(s) % can only be assumed 'C'(P8, vitrine, \[\]) % provable from knowledge base % P8 bound to \[vitrine\]</Paragraph> <Paragraph position="8"> result: S bound to \[was, die, lampe, betrifft, rechts, davon, steht, eine, vitrine\] assumed: activate(l), right_of(r,l), activate(l), right_of(r,l), loc(s,r), showcase(s), including two factored assumptions. Our preference criteria favor this proof over competing ones, since many goals could be proved, few had to be assumed, and assumptions could be re-used.</Paragraph> </Section> </Section> </Paper>