<?xml version="1.0" standalone="yes"?> <Paper uid="P92-1034"> <Title>USING CLASSIFICATION TO GENERATE TEXT</Title> <Section position="3" start_page="0" end_page="265" type="metho"> <SectionTitle> IDAS and I1 IDAS </SectionTitle> <Paragraph position="0"> IDAS is a natural-language generation system that generates on-line documentation and help messages for users of complex equipment. It supports user-tailoring and has a hypertext-like interface that allows users to pose follow-up questions. *E-mail address is E.Reiter@ed.ac.uk. †E-mail address is C.Mellish@ed.ac.uk.</Paragraph> <Paragraph position="1"> The input to IDAS is a point in question space, which specifies a basic question type (e.g., What-is-it), a component the question is being asked about (e.g., Computer23), the user's task (e.g., Replace-Part), the user's expertise-level (e.g., Skilled), and the discourse in-focus list. The generation process in IDAS uses the three stages described in [Grosz et al., 1986]:</Paragraph> </Section> <Section position="4" start_page="265" end_page="269" type="metho"> <SectionTitle> * Content Determination: </SectionTitle> <Paragraph position="0"> A content-determination rule is chosen based on the inputs; this rule specifies what information from the KB should be communicated to the user, and what overall format the response should use.</Paragraph> <Paragraph position="1"> * Text Planning: An expression in the ISI Sentence Planning Language (SPL) [Kasper, 1989] is formed from the information specified in the content-determination rule.</Paragraph> <Paragraph position="2"> * Surface Realisation: The SPL is converted into a surface form, i.e., actual words interspersed with text-formatting commands.</Paragraph> <Paragraph position="3"> I1 is the knowledge representation system used in IDAS to represent domain knowledge, grammar rules, lexicons, user tasks, user-expertise models, and content-determination rules. 
The I1 system includes: * an automatic classifier; * a default-inheritance system that inherits properties from superclass to subclass, using Touretzky's [1986] minimal inferential distance principle to resolve conflicts; * various support tools, such as a graphical browser and editor.</Paragraph> <Paragraph position="4"> An I1 knowledge base (KB) consists of classes, roles, and user-expertise models. User-expertise models are represented as KB overlays, in a similar fashion to the FN system [Reiter, 1990]. Roles are either definitional or assertional; only definitional roles are used in the classification process. Roles can be defined as having one filler or an arbitrary number of fillers, i.e., as having an inherent 'number restriction' of one or infinity.</Paragraph> <Paragraph position="5"> An I1 class definition consists of at least one explicitly specified parent class, primitive? and individual? flags, value restrictions for definitional roles, and value specifications for assertional roles. I1 does not support the more complex definitional constructs of KL-ONE, such as structural descriptions. The language for specifying assertional role values is richer than that for specifying definitional role value restrictions, and allows, for example: measurements that specify a quantity and a unit; references that specify the value of a role in terms of a KL-ONE type role chain; and templates that specify a parametrized class definition as a role value. The general design goal of I1 is to use a very simple definitional language, so that classification is computationally fast, but a rich assertional language, so that complex things can be stated about entities in the knowledge base.</Paragraph> <Paragraph position="6"> An example I1 class definition is: (define-class open-door :parent open :type defined :prop</Paragraph> <Paragraph position="8"> This defines the class Open-Door to be a defined (non-primitive and non-individual) child of the class Open. 
Actor and Actee are definitional roles, so the values given for them in the above definition are treated as definitional value restrictions; i.e., an Open-Door entity is any Open entity whose Actor role has a filler subsumed by Animate-Object, and whose Actee role has a filler subsumed by Door.</Paragraph> <Paragraph position="9"> Decomposition is an assertional role, whose value is a list of three templates. Each template defines a class whose ancestor is an action (Grasp, Turn, Pull) that has the same Actor as the Open-Door action and that has an Actee that is the filler of the Part role of the Actee of the Open-Door action which is subsumed by Handle (i.e., (handle part) is a differentiation of Part onto Handle).</Paragraph> <Paragraph position="10"> For example, if Open-12 was defined as an Open action with role fillers Actor:Sam and Actee:Door-6, then Open-12 would be classified beneath Open-Door by the classifier on the basis of its Actor and Actee values. If an inquiry was issued for the value of Decomposition for Open-12, the above definition from Open-Door would be inherited, and, if Door-6 had Handle-6 as one of its fillers for Part, the templates would be expanded into a list of three actions, (Grasp-12 Turn-12 Pull-12), each of which had an Actor of Sam and an Actee of Handle-6.</Paragraph> <Section position="1" start_page="265" end_page="266" type="sub_section"> <SectionTitle> Using Classification in Generation Content Determination </SectionTitle> <Paragraph position="0"> The input to IDAS is a point in question space, which specifies a basic question, component, user-task, user-expertise model, and discourse in-focus list. The first three members of this tuple are used to pick a content-determination rule, which specifies the information the generated response should communicate. 
This is done by forming a rule-instance with fillers that specify the basic-question, component, and user-task; classifying this rule-instance into a taxonomy of content-rule classes, and reading off inherited values for various attributive roles. A (simplified) example of a content-rule class definition is: (define-class what-operations-rule :parent content-rule :type defined :prop</Paragraph> <Paragraph position="2"> roles that specify which queries a content rule applies to; What-Operations-Rule is used for &quot;What&quot; questions issued under an Operations task (for any component). Rule-Rolegroup specifies the role fillers of the target component that the response should communicate to the user; What-Operations-Rule specifies that the manufacturer, model-number, and colour of the target component should be communicated to the user.</Paragraph> <Paragraph position="3"> Rule-Function specifies a Lisp text-planning function that is called with these role fillers in order to generate SPL. 
Content-rule class definitions can also contain attributive roles that specify a human-readable title for the query; follow-up queries that will be presented as hypertext clickable buttons in the response window; objects to be added to the discourse in-focus list; and a testing function that determines if a query is answerable.</Paragraph> <Paragraph position="4"> Content-determination in IDAS is therefore done entirely by classification and feature inheritance; once the rule-instance has been formed from the input query, the classifier is used to find the most specific content-rule which applies to the rule-instance, and the inheritance mechanism is then used to obtain a specification for the KB information that the response should communicate, the text-planning function to be used, and other relevant information.</Paragraph> <Paragraph position="5"> IDAS's content-determination system is primarily designed to allow human domain experts to relatively easily specify the desired contents of short (paragraph or smaller) responses. As such, it is quite different from systems that depend on deeper plan-based reasoning (e.g., [Wahlster et al., 1991; Moore and Paris, 1989]). Authorability is stressed in IDAS because we believe this is the best way to achieve IDAS's goal of fairly broad, but not necessarily deep, domain coverage; short responses are stressed because IDAS's hypertext interface should allow users to dynamically choose the paragraphs they wish to read, i.e., perform their own high-level text-planning [Reiter et al., 1992].</Paragraph> </Section> <Section position="2" start_page="266" end_page="267" type="sub_section"> <SectionTitle> Text Planning </SectionTitle> <Paragraph position="0"> Text planning is the only part of the generation process that is not entirely done by classification in IDAS. The job of IDAS's text-planning system is to produce an SPL expression that communicates the information specified by the content-determination system. 
This involves, in particular: * Determining how many sentences to use, and what information each sentence should communicate (text structuring).</Paragraph> <Paragraph position="1"> * Generating referring expressions that identify domain entities to the user.</Paragraph> <Paragraph position="2"> * Choosing lexical units (words) to express domain concepts to the user.</Paragraph> <Paragraph position="3"> Classification is currently used only in the lexical-choice portion of the text-planning process, and even there it only performs part of this task.</Paragraph> <Paragraph position="4"> Text structuring in IDAS is currently done in a fairly trivial way; this could perhaps be implemented with classification, but this would not demonstrate anything interesting about the capabilities of classification-based generation. More sophisticated text-structuring techniques have been discussed by, among others, Mann and Moore [1981], who used a hill-climbing algorithm based on an explicit preference function. We have not to date investigated whether classification could be used to implement this or other such text-structuring algorithms.</Paragraph> <Paragraph position="5"> Referring expressions in IDAS are generated by the algorithm described in [Reiter and Dale, 1992]. This algorithm is most naturally stated iteratively in a conventional programming language; there does not seem to be much point in attempting to re-express it in terms of classification.</Paragraph> <Paragraph position="6"> Lexical choice in IDAS is based on the ideas presented in [Reiter, 1991]. When an entity needs to be lexicalized, it is classified into the main domain taxonomy, and all ancestors of the class that have lexical realisations in the current user-expertise model are retrieved. 
Classes that are too general to fulfill the system's communicative goal are rejected, and preference criteria (largely based on lexical preferences recorded in the user-expertise model) are then used to choose between the remaining lexicalizable ancestors.</Paragraph> <Paragraph position="7"> For example, to lexicalize the action (Activate with role fillers Actor:Sam and Actee:Toggle-Switch-23) under the Skilled user-expertise model, the classifier is called to place this action in the taxonomy. In the current IDAS knowledge base, this action would have two realisable ancestors that are sufficiently informative to meet an instructional communicative goal, 1 Activate (realisation &quot;activate&quot;) and (Activate with role filler Actee:Switch) (realisation &quot;flip&quot;). Preference criteria would pick the second ancestor, because it is marked as basic-level [Rosch, 1978] in the Skilled user-expertise model. Hence, if &quot;the switch&quot; is a valid referring expression for Toggle-Switch-23, the entire action will be realised as &quot;Flip the switch&quot;.</Paragraph> <Paragraph position="8"> In short, lexical-choice in IDAS uses classification to produce a set of possible lexicalizations, but other considerations are used to choose the most appropriate member of this set. 
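The three lexical-choice steps described above (retrieve realisable ancestors, reject over-general ones, apply preference criteria) can be sketched roughly as follows. This is only an illustrative Python miniature, not IDAS or I1 code: the dictionaries PARENT, REALISATION, TOO_GENERAL and BASIC_LEVEL, the class names, and the function names are all hypothetical, and the taxonomy is assumed to be a simple chain with classification already done.

```python
# Hypothetical miniature of the lexical-choice step; none of these names come from IDAS.
# The entity is assumed to have been classified already, so PARENT gives its ancestors.
PARENT = {
    "activate-toggle-switch-23": "activate-switch",
    "activate-switch": "activate",
    "activate": "action",
    "action": None,
}
# Realisations recorded in an assumed Skilled user-expertise model.
REALISATION = {
    "action": "perform an action on",
    "activate": "activate",
    "activate-switch": "flip",
}
TOO_GENERAL = {"action"}           # cannot meet an instructional communicative goal
BASIC_LEVEL = {"activate-switch"}  # marked basic-level in the assumed Skilled model


def ancestors(cls):
    """Return the proper ancestors of cls, most specific first."""
    chain = []
    node = PARENT[cls]
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain


def lexicalize(cls):
    """Choose a realisation for an already-classified entity, or None."""
    # Step 1 (classification): collect every ancestor that has a realisation.
    candidates = [a for a in ancestors(cls) if a in REALISATION]
    # Step 2: reject ancestors that are too general for the communicative goal.
    candidates = [a for a in candidates if a not in TOO_GENERAL]
    if not candidates:
        return None
    # Step 3 (preference): prefer basic-level classes, else keep taxonomy order.
    preferred = [a for a in candidates if a in BASIC_LEVEL] or candidates
    return REALISATION[preferred[0]]


print(lexicalize("activate-toggle-switch-23"))  # prints: flip
```

In this sketch only step 1 corresponds to what the classifier contributes; steps 2 and 3 stand in for the communicative-goal filter and the user-model preferences, which lie outside classification.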
The lexical-choice system could be made entirely classification-based if it was acceptable to always use the most specific realisable class that subsumed an entity, but ignoring communicative goals and the user's preferences in this way can cause inappropriate text to be generated [Reiter, 1991].</Paragraph> <Paragraph position="9"> In general, it may be the case that an entirely classification-based approach is not appropriate for tasks which require taking into consideration complex pragmatic criteria, such as the user's lexical preferences or the current discourse context (classification may still be usefully used to perform part of these tasks, however, as is the case in IDAS's lexical-choice module). It is not clear to the authors how the user's lexical preferences or the discourse context could even be encoded in a manner that would make them easily accessible to a classifier-based generation algorithm, although perhaps this simply means that more research needs to be done on this issue.</Paragraph> <Paragraph position="10"> cestor class that is too general to meet the communicative goal; if the user is simply told &quot;Perform an action on the switch&quot;, he will not know that he is supposed to activate the switch.</Paragraph> </Section> <Section position="3" start_page="267" end_page="268" type="sub_section"> <SectionTitle> Surface Realisation </SectionTitle> <Paragraph position="0"> Surface realisation is performed entirely by classification in IDAS. The SPL input to the surface realisation system is interpreted as an I1 class definition, and is classified beneath an upper model [Bateman et al., 1990]. The upper model distinguishes, for example, between Relational and Nonrelational propositions, and Animate and Inanimate objects. 
2 A new class is then created whose parent is the desired grammatical unit (typically Complete-Phrase), and which has the SPL class as a filler for the definitional Semantics role.</Paragraph> <Paragraph position="1"> This class is classified, and the realisation of the sentence is obtained by requesting the value of its Realisation role (an attributive role).</Paragraph> <Paragraph position="2"> A simplified example of an I1 class that defines or another attributive role value, this can be specified by creating a child of Sentence that uses I1's default inheritance mechanism to selectively override the relevant role fillers. For example, This defines a new class Imperative that applies to Sentences whose Semantics filler is classified beneath Command in the upper model (Command is a child of Predication). This class inherits the values of the Number and Subject fillers from Sentence, but specifies a new filler for Realisation, which is just the Realisation of the Predicate of the class. In other words, the above class informs the generation system of the grammatical fact that imperative sentences do not contain surface subjects. The classification system places classes beneath their most specific parent in the taxonomy, so to-be-realised classes always inherit realisation information from the most specific grammatical-unit class that applies to them.</Paragraph> <Paragraph position="3"> The Role of Conflict Resolution In general terms, a classification system can be thought of as supporting a pattern-matching process, in which the definitional role fillers of a class represent the pattern (e.g. (semantics command) in Imperative), and the attributive roles (e.g., Realisation) specify some sort of action. 
In other words, a classification system is in essence a way of encoding pattern-action rules of the form: $\alpha_i \rightarrow \beta_i$</Paragraph> <Paragraph position="5"> If several classes subsume an input, then classification systems use the attributive roles specified by (or inherited by) the most specific subsuming class; in production rule terminology, this means that if several $\alpha_i$'s match an input, only the $\beta_i$ associated with the most specific matching $\alpha_i$ is triggered. In other words, classification systems use the conflict resolution principle of always choosing the most specific matching pattern-action rule.</Paragraph> <Paragraph position="6"> This conflict-resolution principle is used in different ways by different parts of IDAS. The content-determination system uses it as a preference mechanism; if several content-determination rules subsume an input query, any of these rules can be used to generate a response, but presumably the most appropriate response will be generated by the most specific subsuming rule. The lexical-choice system, in contrast, effectively ignores the 'prefer most specific' principle, and instead uses its own preference criteria to choose among the lexemes that subsume an entity. The surface-generation system is different yet again, in that it uses the conflict-resolution mechanism to exclude inapplicable grammar rules. If a particular term is classified beneath Imperative, for example, it also must be subsumed by Sentence, but using the Realisation specified in Sentence to realise this term would result in text that is incorrect, not just stylistically inferior.</Paragraph> <Paragraph position="7"> The 'use most specific matching rule' conflict-resolution principle is thus just a tool that can be used by the system designer. 
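A minimal sketch of this conflict-resolution behaviour, assuming a tiny hand-written taxonomy, is given here in Python. All names (PARENT, PATTERN, ACTION, UPPER, subsumes, depth, select_action) are hypothetical illustrations rather than I1 constructs, the action strings are placeholders, and 'most specific' is approximated by depth in a single-chain taxonomy, which a real classifier would not need to assume.

```python
# Hypothetical sketch of 'most specific matching rule' selection; names are illustrative.
# Grammar-unit taxonomy (a single chain here, which is a simplifying assumption).
PARENT = {"imperative": "sentence", "sentence": "complete-phrase", "complete-phrase": None}
# Pattern: the definitional restriction each class places on its Semantics filler.
PATTERN = {"complete-phrase": "thing", "sentence": "predication", "imperative": "command"}
# Action: an attributive Realisation recipe (placeholder strings).
ACTION = {"sentence": "subject + predicate", "imperative": "predicate only"}
# Assumed upper-model fragment: Command is a child of Predication.
UPPER = {"command": "predication", "predication": "thing", "thing": None}


def subsumes(general, specific, parents):
    """True if general equals specific or is one of its ancestors."""
    node = specific
    while node is not None:
        if node == general:
            return True
        node = parents.get(node)
    return False


def depth(cls):
    """Number of ancestors of a grammar class; deeper means more specific."""
    count = 0
    while PARENT[cls] is not None:
        cls = PARENT[cls]
        count += 1
    return count


def select_action(semantics):
    """Return the action of the most specific class whose pattern matches."""
    matching = [c for c in PATTERN if subsumes(PATTERN[c], semantics, UPPER)]
    if not matching:
        return None
    # Start at the most specific matching class and walk towards the root
    # until an action is found, which mimics inheritance of attributive roles.
    node = max(matching, key=depth)
    while node is not None and node not in ACTION:
        node = PARENT[node]
    return None if node is None else ACTION[node]
```

Under these assumptions, select_action("command") returns the Imperative action even though Sentence also matches, while select_action("predication") falls back to the Sentence action; this is the most-specific principle in miniature.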
In some cases the principle can be used to implement preferences (as in IDAS's content-determination system); in some cases it can be used to exclude incorrect rules which would cause an error if they were used (as in IDAS's surface-generation system); and in some cases it needs to be overridden by a more appropriate choice mechanism (as in IDAS's lexical-choice system).</Paragraph> </Section> <Section position="4" start_page="268" end_page="268" type="sub_section"> <SectionTitle> Classification vs. Other Approaches </SectionTitle> <Paragraph position="0"> Perhaps the most popular alternative approaches to generation are unification (especially functional unification) and systemic grammars. As with classification, the unification and systemic approaches can be applied to all phases of the generation process [McKeown et al., 1990; Patten, 1988]. 3 However, most of the published work on unification and systemic systems deals with surface realisation, so it is easiest to focus on this task when making a comparison with classification systems.</Paragraph> <Paragraph position="1"> Like classification, unification and systemic systems can be thought of as supporting a recursive pattern-matching process. All three frameworks allow grammar rules to be written declaratively.</Paragraph> <Paragraph position="2"> They also all support unrestricted recursion, i.e., they all allow a grammar rule to specify that a constituent of the input should be recursively processed by the grammar (IDAS does this with I1's template mechanism). In particular, this means that all three approaches are Turing-equivalent.</Paragraph> <Paragraph position="3"> There are differences in how patterns and actions are specified in the three formalisms, but it is probably fair to say that all three approaches are sufficiently flexible to be able to encode most desirable grammars. 
The choice between them must therefore be made on the basis of which is easiest to incorporate into a real NL generation system.</Paragraph> <Paragraph position="4"> Footnote 3: Although it is unclear whether unification or systemic systems can do any better at the text-planning tasks that are difficult for classification systems, such as generating referring expressions.</Paragraph> <Paragraph position="5"> We believe that classification has a significant advantage here because many generation systems already include a classifier to support reasoning on a domain knowledge base; hence, using classification for generation means the same knowledge representation (KR) system can be used to support both domain and linguistic knowledge. Thus, IDAS uses only one KR system -- I1 -- whereas systems such as COMET (unification) [McKeown et al., 1990] and PENMAN (systemic) [Penman Natural Language Group, 1989] use two different KR systems: a classifier-based system for domain knowledge, and a unification or systemic system for grammatical knowledge.</Paragraph> </Section> <Section position="5" start_page="268" end_page="269" type="sub_section"> <SectionTitle> Unification Systems </SectionTitle> <Paragraph position="0"> The most popular unification formalism for generation up to now has probably been functional unification (FUG) [Kay, 1979]. FUG systems work by searching for patterns (alternations) in the grammar that unify with the system's input (i.e., unification is used for pattern-matching); inheriting syntactic (output) feature values from the grammar patterns (the actions); and recursively processing members of the constituent set (the recursion). 
That is, pattern-action rules of the above kind are encoded as something like: $(\alpha_1 \wedge \beta_1) \vee (\alpha_2 \wedge \beta_2) \vee \ldots$</Paragraph> <Paragraph position="1"> If a unification system is based on a typed feature logic, then its grammar can include classification-like subsumption tests [Elhadad, 1990], and thus be as expressive in specifying patterns as a classification system.</Paragraph> <Paragraph position="2"> An initial formal comparison of unification with classification is given in the Appendix. Perhaps the most important practical differences are: * Classification grammars cannot be used bidirectionally, while unification grammars can [Shieber, 1988].</Paragraph> <Paragraph position="3"> * Unification systems produce (at least in principle) all surface forms that agree (unify) with the semantic input; classification systems produce a single surface form output.</Paragraph> <Paragraph position="4"> These differences are in a sense a result of the fact that unification grammars represent general mappings between semantic and surface forms (and hence can be used bidirectionally, and produce all compatible surface forms), while classification systems generate a single surface form from a semantic input. In McDonald's [1983] terminology, classification-based generation systems deterministically and indelibly make choices about alternate surface-form constructs as the choices arise, with no backtracking; 4 unification-based systems, in contrast, produce the set of all syntactically correct surface-forms that are compatible with the semantic input. 5 (Footnote 4: McDonald claims, incidentally, that indelible decision-making is more plausible than backtracking from a psycholinguistic perspective.)</Paragraph> <Paragraph position="5"> In practice, all generation systems must possess a 'preference filter' of some kind that chooses a single output surface-form from the set of possibilities. 
In unification approaches, choosing a particular surface form to output tends to be regarded (at least theoretically) as a separate task from generating the set of syntactically and semantically correct surface forms; in classification approaches, in contrast, the process of making choices between possible surface forms is interwoven with the main generation algorithm.</Paragraph> <Paragraph position="6"> Systemic approaches Systemic grammars [Halliday, 1985] are another popular formalism for generation systems. Systemic systems vary substantially in the input language they accept; we will here focus on the NIGEL system [Mann, 1983], since it uses the same input language (SPL) as IDAS's surface realisation system. 6 Other systemic systems (e.g., [Patten, 1988]) tend to use systemic features as their input language (i.e., they don't have an equivalent of NIGEL's chooser mechanism), which makes comparisons more difficult.</Paragraph> <Paragraph position="7"> NIGEL works by traversing a network of systems, each with an associated chooser. The choosers determine features by performing tests on the semantic input. Choosers can be arbitrary Lisp code, which means that NIGEL can in principle use more general 'patterns' in its rules than IDAS can; in practice it is not clear to what extent this extra expressive power is used in NIGEL, since many choosers seem to be based on subsumption tests between semantic components and the system's upper model. In any case, once a set of features has been chosen, these features trigger gates and their associated realisation rules; these rules assert information about the output text. 
From the pattern-matching perspective, choosers and gates provide the patterns $\alpha_i$ of rules, while realisation rules specify the actions $\beta_i$ to be performed on the output text.</Paragraph> <Paragraph position="8"> Like classification systems (but unlike unification systems), systemic generation systems are, in McDonald's terminology, deterministic and indelible choice-makers; NIGEL makes choices about alternative surface-form constructs as they arise during the generation process, and does not backtrack. Systemic generation systems are thus probably closer to classification systems than unification systems are; indeed, in a sense the biggest difference between systemic and classification systems is that systemic systems use a notation and inference system that was developed by the linguistic community, while classification systems use a notation and inference system that was developed by the AI community.</Paragraph> <Paragraph position="9"> Footnote 5: Of course these differences are in a sense more theoretical than practical, since one can design a unification system to only return a single surface form instead of a set of surface forms, and one can include backtracking-like mechanisms in a classification-based system.</Paragraph> <Paragraph position="10"> Footnote 6: Strictly speaking, SPL is an input language to PENMAN, not NIGEL; we will here ignore the difference between PENMAN and NIGEL.</Paragraph> <Paragraph position="11"> Other Related Work Rösner [1986] describes a generation system that uses object-oriented techniques. SPL-like input specifications are converted into objects, and then realised by activating their To-Realise methods.</Paragraph> <Paragraph position="12"> Rösner does not use a declarative grammar; his grammar rules are implicitly encoded in his Lisp methods. 
He also does not use classification as an inference technique (his taxonomy is hand-built).</Paragraph> <Paragraph position="13"> DATR [Evans and Gazdar, 1989] is a system that declaratively represents morphological rules, using a representation that in some ways is similar to I1. In particular, DATR allows default inheritance and supports role-chain-like constructs. DATR does not include a classifier, and also has no equivalent of I1's template mechanism for specifying recursion.</Paragraph> <Paragraph position="14"> PSI-KLONE [Brachman and Schmolze, 1985, appendix] is an NL understanding system that makes some use of classification, in particular to map surface cases onto semantic cases. Syntactic forms are classified into an appropriate taxonomy, and by virtue of their position inherit semantic rules that state which semantic cases (e.g., Actee) correspond to which surface cases (e.g., Object).</Paragraph> <Paragraph position="15"> Conclusion In summary, classification can be used to perform much of the necessary processing in natural-language generation, including content-determination, surface-realisation, and part of text-planning. Classification-based generation allows a single knowledge representation system to be used for both domain and linguistic knowledge; this means that a classification-based generation system can have a significantly simpler overall architecture than a unification or systemic generation system, and thus be easier to build and maintain.</Paragraph> </Section> </Section> </Paper>