<?xml version="1.0" standalone="yes"?> <Paper uid="C88-2088"> <Title>STRATEGIES FOR EFFECTIVE PARAPHRASING</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> STRATEGIES FOR EFFECTIVE PARAPHRASING </SectionTitle> <Paragraph position="0"/> </Section> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> ABSTRACT </SectionTitle> <Paragraph position="0"> In this paper we present a new dimension to paraphrasing text in which characteristics of the original text motivate strategies for effective paraphrasing. Our system combines two existing robust components: the IRUS-II natural language understanding system and the SPOKESMAN generation system. We describe the architecture of the system and enhancements made to these components to facilitate paraphrasing. We particularly look at how levels of representation in these two systems are used by specialists in the paraphraser which define potential problems and paraphrasing strategies. Finally, we look at the role of paraphrasing in a cooperative dialog system. We will focus here on paraphrasing in the context of natural language interfaces and particularly on how multiple interpretations introduced by various kinds of ambiguity can be contrasted in paraphrases using both sentence structure and highlighting and formatting the text itself.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 1. INTRODUCTION 1 </SectionTitle> <Paragraph position="0"> While technically paraphrasing is simply the task of restating the meaning of a text in a different form, it is crucial to consider the purpose of the paraphrase in order to motivate particular strategies for changing the text. 
If the point of the paraphrase is to clarify the original text, as in a natural language (NL) interface to a database (DB) or expert system application, then disambiguating the query and choosing more precise lexical items (perhaps closer to the structure of the actual DB, expert system, or other underlying application) are essential strategies. If the point is to summarize information, then strategies for evaluating the relative importance of the information presented in the text are necessary. If the point is merely to restate the text differently than the original, perhaps merely to exercise the system, then one must use strategies which consider what structures and lexical items were actually found by the parser.</Paragraph> <Paragraph position="1"> Our motivation for work on strategies for effective paraphrasing comes from the recent availability of NL interfaces as commercial products. As the underlying systems that a NL interface must interact with increase in number and sophistication, the range of NL interactions will increase as well. Paraphrasers developed in the past (e.g. McKeown's Co-op and BBN's Parlance(TM) NL Interface) were all limited in that each used only a single strategy for paraphrasing regardless of what problems may have been present in the original query. (We discuss these systems in detail in Section 6.) Our approach is to develop a variety of strategies which may be employed in different situations. We introduce a new dimension to paraphrasing text in which characteristics of the original text plus the overall context (including the goal of the system) motivate strategies for effective paraphrasing.</Paragraph> <Paragraph position="2"> Our focus here will be on paraphrasing ambiguous queries in an interactive dialog system, where contrasting multiple interpretations is essential. In order to ground our discussion, we first look briefly at a range of ambiguity types. 
We then provide an overview of the architecture and description of the two major components: the IRUS-II(TM) understanding system and the Spokesman generation system. We look closely at the aspects of these systems that we augmented for the paraphrasing task and provide a detailed example of how the system appreciates multiple interpretations and uses that information to govern decision making in generation. Next we discuss the role of paraphrasing in a cooperative dialog system, and in the final section we contrast our approach with other work in paraphrasing.</Paragraph> <Paragraph position="3"> 1 We would like to thank Lance Ramshaw for his invaluable help in understanding the inner workings of RUS and suggestions of where it could be augmented for our purposes, and Dawn MacLaughlin for her implementation of Parrot, the initial version of our paraphraser. We would also like to thank Ralph Weischedel, Damaris Ayuso, and David McDonald for their helpful comments on drafts of this paper and Lyn Bates for early inspirations.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 2. PROBLEMS AND STRATEGIES </SectionTitle> <Paragraph position="0"> Ambiguity is one of the more difficult problems to detect and correct. In this section we look at three kinds of ambiguity: lexical, structural and contextual, and discuss potential strategies a paraphraser might use to eliminate the ambiguity.</Paragraph> <Paragraph position="1"> 1) LEXICAL AMBIGUITIES are introduced when a lexical item can refer to more than one thing. In the following example &quot;Manhattan&quot; can refer to either the borough of New York City or the ship: What is the latitude and longitude of Manhattan? The paraphraser must appreciate the ambiguity of that noun phrase, decide how to disambiguate it, and decide how much of the context to include in the paraphrase. 
One strategy would be to repeat the entire query, disambiguating the noun phrase by using the type and name of the object: Do you mean what is the latitude and longitude of the city Manhattan or what is the latitude and longitude of the ship Manhattan? However, if the query is long, the result could be quite cumbersome. A different strategy, highlighting and formatting the text to contrast the differences, can serve to direct the user's attention to the part that is ambiguous: Do you mean list the latitude and longitude of the city Manhattan or the ship Manhattan? 2) STRUCTURAL AMBIGUITIES are caused when there are multiple parses for a sentence. Conjunction is a typical source of structural ambiguity. Modifiers of conjoined NPs may distribute over each NP or modify only the closest NP. Consider, for example, the following query: Display the forested hills and rivers.</Paragraph> <Paragraph position="2"> This query has only one interpretation in which the premodifier &quot;forested&quot; modifies only the noun &quot;hills&quot;. In contrast, the following query has two interpretations: Display the C1 carriers and frigates. In one interpretation, the premodifier &quot;C1&quot; may apply only to the noun &quot;carriers&quot;; in the other, &quot;C1&quot; applies to both &quot;carriers&quot; and &quot;frigates&quot;. Each interpretation requires a different paraphrase strategy. In the case where the premodifier distributes, the ambiguity may be eliminated by repeating the modifier: Display the C1 carriers and C1 frigates. 
When it does not distribute, there are three potential strategies: --changing the order of the conjuncts: Display the frigates and C1 carriers.</Paragraph> <Paragraph position="3"> --introducing explicit quantifiers: Display the C1 carriers and all the frigates.</Paragraph> <Paragraph position="4"> --moving premodifiers to postmodifiers: Display the carriers which are C1 and the frigates.</Paragraph> </Section> <Section position="5" start_page="0" end_page="431" type="metho"> <SectionTitle> 3) CONTEXTUAL AMBIGUITIES are introduced when the query is </SectionTitle> <Paragraph position="0"> underspecified for the underlying system it is working with. For example, if the context includes a map and the possibility of natural language or table output, the query Which carriers are C1? could mean either list or display.</Paragraph> <Paragraph position="1"> This work was supported by the Strategic Computing Program, DARPA contract number N000014-85-C-00016.</Paragraph> </Section> <Section position="6" start_page="431" end_page="434" type="metho"> <SectionTitle> 3. ARCHITECTURE </SectionTitle> <Paragraph position="0"> As the examples above illustrate, the information needed to notice problems such as ambiguity in a query is quite varied, and the strategies needed to generate a motivated paraphrase must be employed at various levels in the generation process. A distinguishing feature of our system is that it works in cooperation with existing understanding and generation components and allows the paraphraser access to multiple levels of their processing. 
This multilevel design allows the understanding system to appreciate ambiguities and vagueness at lexical, structural, and contextual levels, and the generation system to affect the text's organization, syntactic structure, lexical items and even to format and highlight the final text.</Paragraph> <Paragraph position="1"> Figure 1 shows an overview of the architecture of the system.</Paragraph> <Paragraph position="2"> In this section, we first describe the understanding and generation systems independently, focusing on how the Problem Recognizers and Paraphrasing Strategies have been incorporated into the components. We then look at the paraphraser itself and how it evolved.</Paragraph> <Section position="1" start_page="431" end_page="431" type="sub_section"> <SectionTitle> 3.1 THE UNDERSTANDING COMPONENT: </SectionTitle> <Paragraph position="0"> IRUS-II(TM) IRUS-II(TM) (Weischedel, et al. 1987) is a robust NL understanding system that interfaces to a variety of underlying systems, such as DB management systems, expert systems and other application programs. It is capable of handling a very wide range of English constructions including ill-formed ones. IRUS-II has two major processing levels which distinguish the linguistic processing from the details of the particular underlying systems it is used with. The first level, the &quot;Front End&quot;, integrates syntactic and semantic processing. The major domain-independent &quot;Front End&quot; modules include a parser and associated grammar of English, a semantic interpreter, and a subsystem for resolving anaphora and ellipsis. These modules simultaneously parse an English text into a syntactic structural description and construct a formal semantic representation of its meaning in a higher order intensional logic language called the World Model Language (WML). The syntactic processor is the RUS Parser/Grammar which is based on the ATN formalism. 
Constants in the WML are concepts and predicates from a hierarchical domain model represented in NIKL (Moser 1983).</Paragraph> <Paragraph position="1"> The more domain-dependent modules of the Front End are the lexicon, domain model, and a set of semantic Interpretation Rules (IRules). The lexicon contains information about parts of speech, and syntactic and morphological features needed for parsing, and word and phrase substitutes (such as abbreviations). An IRule defines, for a word or (semantic) class of words, the semantically acceptable English phrases that can occur having that word as a head of the phrase, and in addition defines the semantic interpretation of an accepted phrase. Thus, when the parser proposes (i.e., TRANSMITs) an intermediate syntactic phrase structure, the semantic interpreter uses the IRules that are associated with the head of that phrase to determine whether the proposed structure is interpretable and to specify its interpretation. Since semantic processing is integrated with syntactic processing, the IRules serve to block a semantically anomalous phrase as soon as it is proposed by the parser. The semantic representation of a phrase is constructed only when the phrase is believed complete.</Paragraph> <Paragraph position="2"> The task of the &quot;Back End&quot; component of IRUS-II is to take a WML expression and compute the correct command or set of commands to one or more underlying systems in order to obtain the result requested by the user. 
This problem is decomposed into the following steps: * The WML expression is simplified and then gradually translated into the Application System Interface Language (ASIL).</Paragraph> <Paragraph position="3"> * The particular underlying system or systems that need to be accessed are identified.</Paragraph> <Paragraph position="4"> * The ASIL is transformed into underlying system(s) code to execute the query.</Paragraph> <Paragraph position="5"> While the constants in WML and ASIL are domain-dependent, the constants in ASIL-to-code translation system(s) code are both domain dependent and underlying-system dependent.</Paragraph> <Paragraph position="6"> In this section, we briefly describe how various kinds of ambiguities are currently handled in IRUS-II. There are at least the following kinds of ambiguities that may occur in natural language: semantic ambiguity (lexical, phrasal, referring expressions), structural ambiguity, quantifier scope ambiguity and collective reading ambiguity. In cases of semantic ambiguity, multiple WMLs are generated from the same syntactic parse path. For example, when a word (e.g., &quot;Manhattan&quot;) belongs to more than one semantic class in the domain model (e.g., CITY, VESSEL), two WMLs are generated from the same syntactic parse path, each referring to a different semantic class. Similarly, premodified nouns (e.g., &quot;Hawaii ships&quot;) generate multiple WMLs, each created as a result of multiple IRules assigning several interpretations to the relation between the elements (e.g., &quot;Ships whose home port is Hawaii&quot;, &quot;Ships whose destination is Hawaii&quot;, or &quot;Ships whose current location is Hawaii&quot;).</Paragraph> <Paragraph position="7"> Structural ambiguities are caused by multiple syntactic interpretations and result in alternative parse paths in the RUS parser/grammar. 
IRUS-II identifies these ambiguities by sequentially attempting to parse the text, with each attempt following a different parse path. Note in these cases each syntactic parse path may also have multiple semantic interpretations.</Paragraph> <Paragraph position="8"> 3.1.3 Enhancements to IRUS-II for Effective Paraphrasing Though IRUS-II produces multiple interpretations (WMLs) for a variety of ambiguous sentences, it was not originally designed with the intent of paraphrasing those interpretations. While each individual WML could be paraphrased separately, a more useful approach would be to combine closely related interpretations into a single paraphrase that highlights the contrasts between the interpretations. The need to keep associations between multiple interpretations motivated the following enhancements to the IRUS-II system: * Predefined ambiguity specialists that detect and annotate potential problems presented by the input text are &quot;distributed&quot; in the parser/grammar and the semantic interpreter. For example, when the parser TRANSMITs the phrase &quot;Manhattan&quot; to the semantic interpreter as a head of a NP, two semantic classes, CITY and VESSEL, will be associated with that NP. At this point, the Lexical Ambiguity Specialist records the lexical item &quot;Manhattan&quot; as the ambiguity source and the two different classes.</Paragraph> <Paragraph position="9"> * After recording the potential ambiguity source, each ambiguity specialist monitors a predefined sequence of TRANSMITs associated with that source, and records the different intermediate WML expressions resulting from these TRANSMITs. For example, the Lexical Ambiguity Specialist monitors the TRANSMITs of &quot;Manhattan&quot; as a head noun of the NP. In this case, there will be two applicable IRules, one defining &quot;Manhattan&quot; as a CITY and the other defining &quot;Manhattan&quot; as a VESSEL. 
Both interpretations are semantically acceptable, resulting in two intermediate WMLs, which are then recorded by the specialist. Upon completion of the input text, two WMLs will be created and this record will be used to annotate them with their respective differences that resulted from a common ambiguity source.</Paragraph> <Paragraph position="10"> We look at the details of the specialists on one particular example in Section 4. The Spokesman generation system also has two major components: a text planner and a linguistic realization component, MUMBLE-86 (Meteer et al. 1987). Both components are built within the framework of &quot;multilevel, description directed control&quot; (McDonald 1983). In this framework, decisions are organized into levels according to the kind of reference knowledge brought to bear (e.g. event or argument structure, syntactic structure, morphology). At each level, a representation of the utterance is constructed which both captures the decisions made so far and constrains the future decision making. The representation at each level also serves as the control for the mapping to the next level of representation.</Paragraph> <Paragraph position="11"> The text planner must establish what information the utterance is to include and what wording and organization it must have in order to ensure that the information is understood with the intended perspectives. The intermediate level of representation in this component is the text structure, which is a tree-like representation of the organization of discourse level constituents. The structure is populated with model level objects (i.e. from the applications program) and &quot;discourse objects&quot; (compositional objects created for the particular utterance) and the relations between these objects. 
The text structure is extended incrementally in two ways: 1) expanding nodes whose contents are composite objects by using predefined templates associated with the object types (such as expanding an &quot;event&quot; object by making its arguments subnodes); 2) adding units into the structure at new nodes. The units may be selected from an already positioned composite unit or they may be individuals handed to the orchestrator by an independently driven selection process.</Paragraph> <Paragraph position="12"> Once the text structure is complete, it is traversed depth first beginning with the root node. At each node, the mapping process chooses the linguistic resource (lexical item, syntactic relation such as restrictive modifier, etc.) that is to realize the object which is the content of that node. Templates associated with these objects define the set of possibilities and provide procedures for building its portion of the next level of representation, the &quot;message level&quot;, which is the input specification for the linguistic realization component, MUMBLE-86.</Paragraph> <Paragraph position="13"> The input specification to MUMBLE-86 specifies what is to be said and constrains how it is to be said. MUMBLE-86 handles the realization of the elements in the input specification (e.g. choosing between the ships are assigned, which are assigned, or assigned depending on whether the linguistic context requires a full clause, postmodifier, or premodifier), the positioning of elements in the text (e.g. choosing where to place an adverbial phrase), and the necessary morphological operations (e.g. subject-verb agreement). In order to make these decisions, MUMBLE-86 maintains an explicit representation of the linguistic context in the form of an annotated surface structure. Labels on positions provide both syntactic constraints for choosing the appropriate phrase and a definition of which links may be broken to add more structure. 
This structure is traversed depth first as it is built, guiding the further realization of embedded elements and the attachment of new elements. When a word is reached by the traversal process, it is sent to the morphology process, which uses the linguistic context to execute the appropriate morphological operations. Then the word is passed to the word stream to be output and the traversal process continues through the surface structure.</Paragraph> </Section> <Section position="2" start_page="431" end_page="431" type="sub_section"> <SectionTitle> 3.3 Parrot and Polly </SectionTitle> <Paragraph position="0"> Our first implementation of the paraphraser was simply a parrot which used the output of the parser (the WML) as input to the generator. The text planner in this case consists of a set of translation functions which build text structure and populate it with composite objects built from WML subexpressions and the constants in the WML (concepts and roles from IRUS-II's hierarchical domain model). The translation to text structure uses both explicit and implicit information from the WML. The first operator in a WML represents the speech act of the utterance. For example, BRING-ABOUT indicates explicitly that the matrix clause should be a command and implicitly that it should be in the present tense and the agent is the system. The IOTA operator indicates that the reference is definite and POWER indicates it is plural.</Paragraph> <Paragraph position="1"> A second set of templates map these objects to the input specification for the linguistic component, determining the choice of lexical heads, argument structures, and attachment relations (such as restrictive-modifier or clausal-adjunct).</Paragraph> <Paragraph position="2"> Interestingly, PARROT turned out to be a conceptual parrot, rather than a verbatim one. For example, the phrase the bridge on the river is interpreted as the following WML expression. 
The domain model predicate CROSS represents the role between bridge and river since IRUS interprets &quot;on&quot; in this particular context in terms of the CROSS relation: (IOTA JX124 BRIDGE (CROSS JX124 (IOTA JX236 RIVER))) This is &quot;parroted&quot; as the bridge which crosses the river. While in some cases this direct translation of the WML produces an acceptable phrase, in other cases the results are less desirable. For example, named objects are represented by an expression of the form (IOTA var type (NAME var name)), which, translated directly, would produce the river which is named Hudson. Such phrases make the generated text unnecessarily cumbersome. Our solution in PARROT was to implement an optimization at the point when the complex object is built and placed in the text structure that uses the name as the head of the complex object rather than the type.</Paragraph> <Paragraph position="3"> (Mellish, 1987, discusses similar optimizations in generating from plans.) While PARROT allowed us to establish a link from text in to text out, it is clear this approach is insufficient to do more sophisticated paraphrasing. POLLY, as we call our &quot;smart&quot; paraphraser, takes advantage of the extra information provided by IRUS-II in order to control the decision making in generation.</Paragraph> <Paragraph position="4"> One of the most common places in which the system must choose carefully which realization to use is when the input is ambiguous and the paraphrase must contrast the two meanings. 
For example, if a semantic ambiguity is caused by an ambiguous name, as in Where is Diego Garcia (where Diego Garcia is both a submarine and a port), the type information must be included in the paraphrase: Do you mean where is the port Diego Garcia or the submarine Diego Garcia?</Paragraph> <Paragraph position="5"> Note, with the optimization of PARROT described above, this sentence could not be disambiguated.</Paragraph> <Paragraph position="6"> In order to generate this paraphrase contrasting the two interpretations, the system needs to know what part is ambiguous at two different points in the generation process: in the text planner when selecting the information to include (both the type and the name) and at the final stage when the text is being output (to change the font). Our use of explicit active representations allows the system to mark the contrast only once, at the highest level, the text structure. This constraint is then passed through the levels and can affect decisions at any of the lower levels. Thus the system makes use of the information provided by the understanding system when it is available and ensures it will still be available when needed and won't be considered in parts of the utterance where it is not relevant. 4. Paraphrasing Syntactic Ambiguities - an Example To elucidate the description above, we will return to an earlier example of a query with an ambiguous conjunction construction: Display all carriers and frigates in the Indian Ocean. This sentence has two possible interpretations: 1) Display all carriers in the Indian Ocean and all frigates in the Indian Ocean.</Paragraph> <Paragraph position="7"> 2) Display all frigates in the Indian Ocean and all the carriers. 
In this example we show (1) how the Problem Recognizers discover that there are two interpretations and what the particular differences are; and (2) how the Paraphrasing Strategies use that information in the translation to text structure and the generation of the paraphrase.</Paragraph> </Section> <Section position="3" start_page="431" end_page="434" type="sub_section"> <SectionTitle> 4.1 Phase 1: The Problem Recognizers </SectionTitle> <Paragraph position="0"> As we discussed earlier, problem recognizing specialists have been embedded in the understanding system. Here we look at the NP Conjunction Ambiguity specialist and the two parse paths that correspond to the parses resulting from a NP conjunction ambiguity (see Figure 2 below).</Paragraph> <Paragraph position="1"> The first task of this specialist is to annotate the parse path when a NP conjunction is encountered by the parser. In IRUS-II, when the RUS parser has completed the processing of the first NP the frigates and the conjunction word and, it attempts (among other alternatives) to parse the next phrase as a NP. At this point the Conjunction Ambiguity Specialist annotates that parse path with a NP-CONJUNCTION-AMBIGUITY tag (depicted in Figure 2 with * at the first NPLIST/ state in both parse paths 1 and 2). This annotation will allow the different interpretations that may result from this NP conjunction to be grouped later according to their common ambiguity source. (Note that if we were not using an ATN, appropriate annotations could still be made using structure building rules associated with the grammar rules). The paraphraser can then organize its paraphrases according to a group of related ambiguous interpretations. 
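As a rough illustration of this grouping step, the following Python sketch collects interpretations that carry the same ambiguity annotation. It is not the IRUS-II implementation: the record fields ('wml', 'tag', 'source'), the function name group_by_source, and the string stand-ins for WML expressions are all assumptions made for the example.

```python
from collections import defaultdict


def group_by_source(interpretations):
    """Group interpretations that share an ambiguity annotation.

    Each interpretation is a dict with hypothetical keys:
      'wml'    - a string standing in for a WML expression
      'tag'    - the ambiguity tag recorded by a specialist
      'source' - the stretch of input text that triggered the tag
    """
    groups = defaultdict(list)
    for interp in interpretations:
        key = (interp["tag"], interp["source"])
        groups[key].append(interp["wml"])
    return dict(groups)


# Two readings of "Display all carriers and frigates in the Indian Ocean",
# both annotated with the same (assumed) tag and source.
readings = [
    {"wml": "carriers-in-IO and frigates-in-IO",
     "tag": "NP-CONJUNCTION-AMBIGUITY",
     "source": "carriers and frigates"},
    {"wml": "carriers and frigates-in-IO",
     "tag": "NP-CONJUNCTION-AMBIGUITY",
     "source": "carriers and frigates"},
]
grouped = group_by_source(readings)
```

Because both readings share one key, they end up in a single group, which is the unit a combined, contrastive paraphrase would be built from.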
As previously stated, presenting closely related interpretations simultaneously is more effective than presenting randomly generated paraphrases that correspond to arbitrary parse paths.</Paragraph> <Paragraph position="2"> The second task of the NP Conjunction Ambiguity specialist is to monitor those TRANSMITs to the semantic interpreter that may result in multiple interpretations (WMLs) from the same source of ambiguity. Thus, starting from when the possible ambiguity has been noticed, this specialist will monitor the TRANSMITs to all the modifiers of the NPs. In our example, the NP Conjunction Ambiguity specialist monitors the TRANSMITs of the prepositional phrase (PP) in the Indian Ocean to all NPs annotated with the NP-CONJUNCTION-AMBIGUITY tag (TRANSMITs are illustrated with **), which include the TRANSMITs of that PP as a postmodifier to each of the conjoined NPs (parse path 1) as well as to only the second NP (parse path 2). Since the PP in the Indian Ocean is semantically acceptable as a postmodifier in both parse paths, two intermediate WMLs are created: restriction. The NP Conjunction Ambiguity specialist annotates those intermediate WMLs, and the parser proceeds to complete the processing of the input text. In our example, two final WMLs are generated, one for each of the two SETOF expressions that originated from the same NP-CONJUNCTION-AMBIGUITY source: WML 1: (BRING-ABOUT</Paragraph> </Section> </Section> <Section position="7" start_page="434" end_page="434" type="metho"> <SectionTitle> ((INTENSION </SectionTitle> <Paragraph position="0"> More complex sentences that contain postmodified NP conjunctions may have additional interpretations. For instance, the sentence The carriers were destroyed by frigates and subs in the Indian Ocean may have a third interpretation in which the PP in the Indian Ocean modifies the whole clause. 
Another more complex example is: The carriers were destroyed by 3 frigates and subs in the Indian Ocean, in which ambiguity specialists for NP conjunction, PP clause attachment and quantifier scoping will interact. This kind of interaction among specialists is a topic for our current research on effective paraphrasing.</Paragraph> <Paragraph position="1"> 4.2 Phase 2: Translating from WML to Text Structure Once the Problem Recognizers have annotated the WML, the text planner takes over to translate the intensional logic expression into the hierarchical text structure which organizes the objects and relations specified. In this example, since the input was ambiguous and there are two WMLs, there are two possible strategies for paraphrasing which apply at this step: (1) Paraphrase each interpretation separately (as discussed in Section 2).</Paragraph> <Paragraph position="2"> (2) Combine them into a single paraphrase using formatting and highlighting to contrast the differences: Display the carriers in the Indian Ocean and the frigates in the Indian Ocean or the carriers in the Indian Ocean and all the frigates.</Paragraph> <Paragraph position="3"> We will focus here on the second strategy, that which combines the interpretations. The text planner will begin by translating one of the WMLs and when it reaches the subexpression that is annotated as being ambiguous, it will build a text structure object representing the disjunction of those subexpressions.</Paragraph> <Paragraph position="4"> As discussed in Section 3.2, the translation to text structure uses both explicit and implicit information from the WML. In this case, the translation of the first operator, BRING-ABOUT, builds a complex-event object marked as a command in the present tense and the agent is set to *you*. 
The domain model concept DISPLAY provides the matrix verb (see text structure in Figure 3).</Paragraph> <Paragraph position="5"> When the translation reaches the SETOF expression, a COORDINATE-RELATION object is built containing both subexpressions with the relation DISJUNCTION. It is also annotated &quot;emphasize-contrast&quot; to guide the later decision making. As this node and its children are expanded, the annotation is passed down. When the translation reaches the individual conjuncts in the expression, it uses the annotation to decide how to expand the text structure for that object. In the case where the modifier distributes, the annotation blocks any optimization that may lead to an ambiguity, and ensures both conjuncts will be modified; in the case where it does not distribute, there are two possible strategies to eliminate the ambiguity: 2 1) Manipulating the order of the conjuncts in the text structure: --If only one of the conjuncts is modified and the modifier is realizable as a premodifier, then that conjunct should be placed second.</Paragraph> <Paragraph position="6"> --If only one of the conjuncts is modified and the modifier is realizable as a postmodifier, then that conjunct should be placed first.</Paragraph> <Paragraph position="7"> In this case, the paraphrase would be: Display the frigates in the Indian Ocean and carriers.</Paragraph> <Paragraph position="8"> 2) Adding a quantifier, such as &quot;all&quot;, to the conjunct without modification by adding an adjunct DO to the second conjunct, which would result in the paraphrase: Display all the carriers and the frigates in the Indian Ocean.</Paragraph> <Paragraph position="9"> We use a combination of these strategies. 
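The ordering rule in strategy 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Spokesman text planner: the Conjunct class, its fields, and the function order_conjuncts are hypothetical names, and the sketch covers only the case in which exactly one of two conjuncts is modified.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Conjunct:
    """A conjunct with at most one modifier (hypothetical structure)."""
    head: str                        # e.g. "frigates"
    modifier: Optional[str] = None   # e.g. "C1" or "in the Indian Ocean"
    position: Optional[str] = None   # "pre" or "post"; None if unmodified

    def render(self) -> str:
        if self.modifier is None:
            return self.head
        if self.position == "pre":
            return f"{self.modifier} {self.head}"
        return f"{self.head} {self.modifier}"


def order_conjuncts(a: Conjunct, b: Conjunct):
    """Order two conjuncts so the modifier cannot be read as distributing."""
    modified = [c for c in (a, b) if c.modifier is not None]
    if len(modified) != 1:
        return a, b                  # the rule only covers one modified conjunct
    m = modified[0]
    other = b if m is a else a
    # A premodified conjunct goes second; a postmodified one goes first.
    return (other, m) if m.position == "pre" else (m, other)


first, second = order_conjuncts(
    Conjunct("carriers"),
    Conjunct("frigates", "in the Indian Ocean", "post"),
)
phrase = f"Display the {first.render()} and {second.render()}."
```

With the postmodified conjunct placed first, the sketch yields the paraphrase given above, Display the frigates in the Indian Ocean and carriers. Supplying a premodified conjunct such as C1 carriers instead places it second.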
Figure 3 shows the partial text structure built for this expression (see footnote 3).</Paragraph> <Paragraph position="10"> Footnote 2: Note that in this task of paraphrasing queries, where it is crucial that the paraphrase be unambiguous, these are strategies the generator should apply regardless of whether the original was ambiguous or not, as ambiguity may have been introduced into a conjunction by some other strategy, such as lexical choice.</Paragraph> </Section> <Section position="8" start_page="434" end_page="435" type="metho"> <SectionTitle> Footnote 3: Objects labeled DO in the diagram indicate discourse objects which have been </SectionTitle> <Paragraph position="0"> created for this utterance. Objects labeled DM are objects from the domain model. The creation of discourse objects allows objects to be annotated with their roles and other information not contained in the domain model (tense, number) and introduces objects which can be referred back to anaphorically with pronouns (e.g. &quot;they&quot; for the DO dominating the conjuncts).</Paragraph> <Paragraph position="1"> Once this level is complete, it is traversed and the linguistic resources, such as the lexical heads and major syntactic categories, are chosen and represented in the input specification to the linguistic realization component, MUMBLE-86, which produces the final text.</Paragraph> </Section> <Section position="9" start_page="435" end_page="435" type="metho"> <SectionTitle> 5. USING THE PARAPHRASER IN A COOPERATIVE DIALOG SYSTEM </SectionTitle> <Paragraph position="0"> The work presented here has focused on developing strategies for paraphrasing in order to resolve ambiguity. However, in an actual NL dialog system, choosing when and how to use this capability can be based on other considerations. 
In this section we address some practical issues and some related work we have done in the integration of our paraphraser into a Man-Machine Interface.</Paragraph> <Paragraph position="1"> The presentation of a paraphrase can be useful even in cases where no ambiguity has been detected, as it allows the user to verify that the system's interpretation does not differ from the intended interpretation. This is particularly useful for new users who need to be reassured of the system's performance. This feature should be under the user's control, though, since frequent users of the system may only want to see paraphrases when the system finds multiple interpretations.</Paragraph> <Paragraph position="2"> Paraphrasing can also be incorporated in cooperative responses in order to make any presuppositions explicit. Consider the following exchange: U: Display all the carriers.</Paragraph> <Paragraph position="3"> S: &lt;icons displayed on map&gt; U: Which are within 500 miles of Hawaii? S: Carriers Midway, Coral Sea, and Saratoga.</Paragraph> <Paragraph position="4"> U: Which have the highest readiness ratings? S: Of the carriers within 500 miles of Hawaii, Midway and Saratoga are C1.</Paragraph> <Paragraph position="5"> Incorporating elided elements from previous queries in the response makes clear which set is being considered for the current answer. Another sort of paraphrase, which we term &quot;diagnostic responses&quot;, can be used when the system is unable to find any interpretation of the user's query, due to ill-formedness, novel use of language, or simply inadequate information in the underlying program. As in paraphrasing, the generator uses structures built by the understanding component to generate a focused response. For example, a metaphorical use of &quot;commander&quot; to refer to ships, as in the following query, will violate the semantic restrictions on the arguments to the verb &quot;assign&quot;. 
When IRUS-II fails to find a semantic interpretation, it saves its state, which can then be used by the generator to produce an appropriate response: U: Which commanders are assigned to SPA 2? S: I don't understand how commanders can be assigned.</Paragraph> </Section> <Section position="10" start_page="435" end_page="435" type="metho"> <SectionTitle> 6. COMPARISON WITH OTHER WORK </SectionTitle> <Paragraph position="0"> A similar approach to ours is McKeown's Co-op system (McKeown, 1983). It too functions in an interactive environment.</Paragraph> <Paragraph position="1"> However, it is limited in several ways: 1) Since the system it worked with was limited to data base queries, it could only paraphrase questions. This is not only a limitation in functionality, but affects the linguistic competence as well: the input had to be simple WH-questions with SVO structure, with no complex sentences or complicated adjuncts.</Paragraph> <Paragraph position="2"> 2) It had only one strategy to change the text: given and new (see footnote 4), which fronted noun phrases with relative clauses or prepositional phrases that appeared in the later parts of the sentence (essentially the verb phrase). For example, Which programmers worked on oceanography projects in 1972? would be paraphrased: Assuming that there were oceanography projects in 1972, which programmers worked on those projects? 
3) Since its only strategy involved complex noun phrases, if there were no complex noun phrases in the query, it would be &quot;paraphrased&quot; exactly as the original.</Paragraph> </Section> <Section position="11" start_page="435" end_page="436" type="metho"> <SectionTitle> Footnote 4: A related problem is that its notion of given and new was very simplistic: it </SectionTitle> <Paragraph position="0"> is purely based on syntactic criteria of the incoming sentence and does not consider other criteria such as definiteness or context.</Paragraph> <Paragraph position="1"> Lowden and de Roeck (1985) also address the problem of paraphrasing in the context of data base query. However, while they assume some parse of a query has taken place, the work focuses entirely on the generation portion of the problem. In fact, they define paraphrasing as providing a &quot;mapping between an underlying formal representation and an NL text.&quot; They discuss in detail how text formatting can improve clarity and a solid underlying linguistic framework (in their case lexical functional grammar) can ensure grammaticality. However, while they state that a paraphrase should be unambiguous, they do not address how to recognize when a query is ambiguous or how to generate an unambiguous query.</Paragraph> <Paragraph position="2"> The BBN Parlance(TM) NL Interface is one of the most robust NL interfaces in existence. Its paraphraser integrates both the system's conceptual and procedural understanding of NL queries. This approach is based on the observation that users need to be shown the conceptual denotation of a word or phrase (e.g., &quot;clerical employee&quot;) with its denotation in the underlying database system (e.g., an employee whose EEO category is 3 or an employee whose job title is &quot;secretary&quot;). Thus, the Parlance paraphrases incorporate references to specific fields and values in the underlying data base system. 
So, while the text can be cumbersome, it has the advantage of more directly capturing what the system understood. Due to efficiency considerations and limitations on the space for output, the Parlance paraphraser presents the paraphrases one at a time, allowing the user to confirm or reject the current interpretation, rather than presenting all paraphrases at the same time. The system allows the user to refer back to previously presented interpretations, but as is the case with the other paraphrasers, related interpretations are not contrasted.</Paragraph> </Section> </Paper>