<?xml version="1.0" standalone="yes"?>
<Paper uid="W98-1414">
  <Title>Institut für Wissens- und Sprachverarbeitung - Abstract</Title>
  <Section position="3" start_page="128" end_page="128" type="metho">
    <SectionTitle>
2 Discourse markers in NLG
</SectionTitle>
    <Paragraph position="0"> We follow Moser and Moore [1995] in assuming that three distinct though interrelated decisions have to be made when generating discourse markers: whether to place a marker or not (marker occurrence), where to place a marker (marker placement), and finally, which marker to use (marker selection). Research on connectives in the context of NLG has focused on the selection of markers to produce coherent and cohesive multi-sentential text. Studies fall into two distinct groups: First, some studies are concerned with identifying the characteristic properties of a small set of similar markers, and determining the reasons behind choosing a particular marker from this set in a given context; examples are the markers since and because [Elhadad and McKeown 1990], or the temporal markers before and while [Dorr and Gaasterland 1995]. Second, a number of studies take particular (RST-)relations as a starting-point, and examine how these relations are signalled on the linguistic surface; examples are the PURPOSE, RESULT and PRECONDITION relations [Vander Linden 1994], the CONCESSION relation [Grote et al. 1997], and the subject-matter relations occurring in a technical domain [Rösner and Stede 1992, Delin et al. 1996]. However, these are all isolated studies, geared towards a particular application. There is at present no overall framework that supports informed and motivated marker generation for more than a small set of markers and relations.</Paragraph>
    <Paragraph position="1"> The broadest overview of discourse markers to our knowledge is the descriptive work of Knott and Mellish [1996], but it does not specifically address the NLG perspective. Moser and Moore [1995] and DiEugenio et al. [1997] also take a broader view on marker production in that they try to determine general factors that influence the use of markers in text, and in that they consider more than pairs of propositions. However, they are largely concerned with marker occurrence and placement, not with marker selection.</Paragraph>
  </Section>
  <Section position="4" start_page="128" end_page="128" type="metho">
    <SectionTitle>
3 Sentence planning
</SectionTitle>
    <Paragraph position="0"> The traditional split of NLG systems into a content determination/what-to-say component and a realization/how-to-say component has in recent years been supplemented by an intermediate stage: sentence planning, sometimes called micro-planning (e.g., Rambow and Korelsky [1992]). The primary motivation for this step is to relieve the text planner from language-specific knowledge, and to relieve the realization component from any planning or decision-making that potentially</Paragraph>
    <Paragraph position="1"> affects the meaning of the utterance. Hence, better control of the overall generation process is gained. We do not elaborate the advantages further here; see, for example, [Panaget 1994].</Paragraph>
    <Paragraph position="2"> What are the specific decisions to be made by the sentence planner? We think it is important to separate the formative decisions from the motivations that lead to the particular choices. Following Wanner and Hovy [1996], a sentence planner has to make the following decisions: fine-grained discourse structuring, including discourse marker choice; sentence grouping and sentence content determination; clause-internal structuring; choice of referring expressions; lexical choice. Two groups of considerations are important for these tasks: First, the motivating factors such as stylistic choices, semantic relations, intentions, theme development, focusing, and discourse history. Second, the interactions with other decisions, because different formative decisions may realize the same motivation. In contrast to present NLG systems, which treat marker choice as a mere consequence of other sentence-level decisions, we think that sentence planning interactions ought to be respected for discourse markers, too, as the following examples illustrate: * Ordering of related clauses (cause-effect vs. effect-cause)</Paragraph>
    <Paragraph position="3"> Because he was unhappy, he asked to be transferred. vs. He asked to be transferred, for he was unhappy. vs. *For he was unhappy, he asked to be transferred.</Paragraph>
  </Section>
  <Section position="5" start_page="128" end_page="128" type="metho">
    <SectionTitle>
* Aggregation
</SectionTitle>
    <Paragraph position="0"> He has quarrelled with the chairman. He resigned from his post. vs. He has quarrelled with the chairman and resigned from his post.</Paragraph>
    <Paragraph position="1"> He will not attend unless he finishes his paper. vs. He will attend if he finishes his paper. Due to these interdependencies, any fixed order of making decisions in sentence planning will impose limitations on the expressiveness of the system. Accordingly, we advocate a flexible order of decision-making, as it can be realized in a blackboard-based architecture such as proposed by DIOGENES [Nirenburg et al. 1989] and HealthDoc [Wanner and Hovy 1996]. Moreover, the individual modules or knowledge sources should rely on declarative representations as much as possible; otherwise the control process becomes extremely complicated. And one of the declarative sources of information, we feel, should be a lexicon that assembles specifically the information associated with discourse markers.</Paragraph>
  </Section>
  <Section position="6" start_page="128" end_page="132" type="metho">
    <SectionTitle>
4 The discourse marker lexicon
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="128" end_page="128" type="sub_section">
      <SectionTitle>
4.1 Discourse markers as lexical entities
</SectionTitle>
      <Paragraph position="0"> The traditional distinction between content words and function words (or open-class and closed-class items) relies on the stipulation that the former have their &quot;own&quot; meaning independent of the</Paragraph>
      <Paragraph position="2"> context in which they are used, whereas the latter assume meaning only in context. Then, content words are assigned to the realm of the lexicon, whereas function words are treated as a part of grammar. For dealing with discourse markers, we do not regard this distinction as particularly helpful, though. These words can carry a wide variety of semantic and pragmatic overtones, which render the task of selecting a marker meaning-driven, as opposed to a mere consequence of structural decisions.</Paragraph>
      <Paragraph position="3"> Furthermore, notice that a number of lexical relations customarily used to assign structure to the universe of &quot;open class&quot; lexical items can be applied to discourse markers as well. For instance, the German words obzwar and obschon (both more formal variants of obwohl = although) are at least very close to being synonyms. As for plesionyms (near-synonyms), although and though, according to Martin [1992], differ in formality, and although and even though differ in terms of emphasis. If and unless can be seen as antonyms, as they both express conditionality, but with opposite polarity. Some markers are more specific than others, and thus display hyponymy. E.g., but can signal a general CONTRAST or a more specific CONCESSION. Finally, other than being more or less specific, some markers can signal quite different relations; e.g., while can be used for TEMPORAL CO-OCCURRENCE, and also for CONTRAST. Hence, the marker is polysemous.</Paragraph>
      <Paragraph position="4"> For these reasons, discourse markers should be described by a dedicated lexicon that provides a classification of their syntactic, semantic and pragmatic features and characterizes the relationships between similar markers. This will be a lexicon whose main grouping criterion is function rather than grammatical category; not surprisingly, this is motivated by the production perspective, where the parameters governing the generation decisions play the central role.</Paragraph>
    </Section>
    <Section position="2" start_page="128" end_page="131" type="sub_section">
      <SectionTitle>
4.2 Methodology
</SectionTitle>
      <Paragraph position="0"> Methodological considerations pertain to the two tasks of determining the set of words we regard as discourse markers, and determining the lexical entries for these words.</Paragraph>
      <Paragraph position="1"> Finding the &quot;right&quot; set of discourse markers is not an easy task, since the common lexicographic practice of having syntactic behaviour as the criterion for inclusion does not apply. Knott and Mellish [1996] provide an apt summary of the situation. Their 'test for relational phrases' is a good start, but it is geared towards the English language (we are investigating German as well), and furthermore it catches only items relating clauses; in &quot;Despite the heavy rain, we went for a walk&quot; it would not detect a cue phrase. To identify more markers, we worked with traditional dictionaries and with grammars like Quirk et al. [1972] and Helbig and Buscha [1991]. The resulting set of markers is further validated by investigating coherence relations and their possible realizations; here, we can draw on our earlier work [Rösner and Stede 1992, Grote et al. 1997].</Paragraph>
      <Paragraph position="2"> As for the shape of the lexical entries, there are two tasks: First, determining the distinguishing features and classifying markers according to these features, and second, finding appropriate computational representations. At present, we are mostly concerned with the first step, but in section 5, we make an initial proposal for representations.</Paragraph>
      <Paragraph position="3"> Regarding the set of features, our goal can be characterized as finding a synthesis of two different perspectives on marker description, between which there has been little overlap in the research literature: Text linguistics considers markers as a means to signal coherence, and provides us with insights on the semantic and pragmatic properties of marker classes, hence approaches the matter &quot;top-down&quot;. On the other hand, grammars and style guides provide syntactic, semantic and stylistic properties of individual markers, thus look &quot;bottom-up&quot;.</Paragraph>
      <Paragraph position="4"> Specifying the distinctions within sets of similar markers can be quite subtle. In addition to drawing on our earlier work cited above, we employ techniques such as paraphrasing, Knott's substitution test [Knott and Mellish 1996], analysis of typical distributions using corpora, and contrastive studies. Extracting features in this way seems justified since at this stage we are, unlike DiEugenio et al. [1997], not concerned with the predictive power of individual features but rather with decomposing markers into features that are relevant for integrating marker choice into sentence planning. [Table 1: column headings feature, unless, for, however, even though, notwithstanding]</Paragraph>
    </Section>
    <Section position="3" start_page="131" end_page="132" type="sub_section">
      <SectionTitle>
4.3 The shape of the lexicon
</SectionTitle>
      <Paragraph position="0"> The initial set of features we have thus obtained can be grouped in the traditional categories: Syntactic features are the part-of-speech of a marker and the type of connection it establishes (prepositions link constituents within a clause; conjunctions build a paratactic or hypotactic structure, but some can also function as intersentential linkers). The scope of a marker is the complexity of the segments it can combine (complex subtree or simple propositions). The linear ordering of the conjuncts can differ from marker to marker (e.g., with the connective for, the subordinate clause is always postponed), as can the marker's position within the segment (e.g., prepositions always occur at the beginning of a segment; adverbs like however can occur in front-, mid- and end-position).</Paragraph>
      <Paragraph position="1"> Semantic features are foremost the semantic relation established (e.g. causal or temporal link). Some markers show a particular behaviour towards negation, which is related to polarity (e.g., if versus unless). Further, we observe that certain markers impose what we term a functional ordering; for instance, for requires the order effect-cause.</Paragraph>
      <Paragraph position="2"> Pragmatic features include the discourse relation expressed by the marker and the type of illocutionary acts it conjoins (e.g., German weil links propositions, denn links judgements). Some markers differ in terms of presuppositions and the assignment of given/new (e.g., because versus since). Stylistic features represent dimensions like formality and emphasis.</Paragraph>
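      <Paragraph> The feature groups above can be pictured as fields of a lexicon record. The following Python sketch is only an illustration under stated assumptions: the class MarkerEntry, the function candidates and all attribute names are hypothetical, and apart from the ordering restriction on for and the positions of however, which the text mentions, the feature values are assumed for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarkerEntry:
    """One discourse marker, described by some of the feature groups above."""
    lexeme: str
    part_of_speech: str          # syntactic feature
    positions: tuple             # syntactic: allowed positions within the segment
    relation: str                # semantic relation signalled
    functional_ordering: str     # semantic: "effect-cause", "cause-effect" or "any"
    formality: str = "neutral"   # stylistic feature (values assumed)


# Illustrative entries; only the ordering restriction on "for" and the
# positions of "however" come from the text, the rest is assumed.
LEXICON = [
    MarkerEntry("because", "conjunction", ("front",), "causal", "any"),
    MarkerEntry("for", "conjunction", ("front",), "causal", "effect-cause"),
    MarkerEntry("however", "adverb", ("front", "mid", "end"), "contrast", "any"),
]


def candidates(relation, ordering):
    """Return lexemes that signal `relation` and permit the requested ordering."""
    return [
        entry.lexeme
        for entry in LEXICON
        if entry.relation == relation
        and entry.functional_ordering in ("any", ordering)
    ]
```

      With these assumed entries, asking for a causal marker in cause-effect order leaves only because, which mirrors the unacceptable ordering with for in the clause-ordering example of section 3.</Paragraph>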
      <Paragraph position="3"> To illustrate how these features discriminate between markers, table 1 gives five preliminary sample entries. N is a shorthand for nucleus in the RST sense, S for satellite. Notice that table</Paragraph>
    </Section>
  </Section>
  <Section position="7" start_page="132" end_page="132" type="metho">
    <SectionTitle>
5 The discourse marker lexicon in sentence planning
</SectionTitle>
    <Paragraph position="0"> Having outlined the discourse marker lexicon as a general resource, we now turn to the question of using it in sentence planning. Even though the lexicon is still under development, we will illustrate with several prototypical representations how a sentence planner can exploit the various realization options offered by the lexicon.</Paragraph>
    <Paragraph position="1"> We assume the following framework: a discourse structure tree loosely based on RST [Mann and Thompson 1988] serves as input to the sentence planner. RS-trees comprise a set of propositions as leaf nodes; the internal nodes represent coherence relations holding between the daughter nodes. The tree is encoded in the description logic LOOM [MacGregor and Bates 1987], and the propositions are represented following the ontology used in the MOOSE system [Stede 1996]. The nature of these representations need not concern us here, but it is important that they are all &quot;grounded&quot; in a knowledge base so that type checking via subsumption can take place.</Paragraph>
    <Paragraph position="2"> The output of the sentence planning module is a sequence of lexicalised sentence-semantic specifications (SemSpecs), based on SPL [Kasper 1989]. Accordingly, sentence planning in this framework amounts to linearizing a discourse representation tree. As front-end sentence generator, we use KPML [Bateman 1997]. A sample input structure from the domain of maintenance manuals is given in figure 1; figure 2 shows one possible realization. Numbers in the tree correspond to text segments, and each segment corresponds to one underlying proposition.</Paragraph>
    <Paragraph position="3"> [Wait]1 until [the engine is cool]2, then [turn the radiator cap clockwise]3 until [it stops]4. [DO</Paragraph>
  </Section>
  <Section position="8" start_page="132" end_page="135" type="metho">
    <SectionTitle>
NOT PRESS DOWN WHILE TURNING THE CAP]5. After [any remaining pressure has
</SectionTitle>
    <Paragraph position="0"> been relieved]6, [remove the cap]7 by [pressing down]8 and [again turning it counterclockwise]9.</Paragraph>
    <Paragraph position="1"> [Add enough coolant]10 to [fill the radiator]11, and [reinstall the cap]12. [Be sure to tighten it securely]13. [Fill the reserve tank up to the max mark]14 with [the engine cold]15.</Paragraph>
    <Paragraph position="2"> Figure 2: One linguistic realization of the RST-tree.</Paragraph>
    <Paragraph position="3"> 5.1 The &quot;generation view&quot; of the discourse marker lexicon. From the production perspective, the lexical features are to be classified with respect to when and where they come into play in the generation process; this amounts to one particular &quot;view&quot; on the information coded in the lexicon. We propose these categories: * Applicability conditions: The necessary conditions that need to be present in the input representation for the marker to be a candidate. Chiefly, this is the semantic/discourse relation to be expressed, and also (if applicable) features pertaining to presuppositions and intentions.</Paragraph>
    <Paragraph position="4">  * Combinability conditions: The constraints that the marker imposes on its neighbouring linguistic constituents (the 'syntagmatic' dimension). These are syntactic constraints on sub-categorization and semantic type constraints, which interact with other realization decisions in sentence planning.</Paragraph>
    <Paragraph position="5"> * Distinguishing features: If preferential choice dimensions, such as style, brevity, etc., are attended to in the system, then these features serve to distinguish markers that are otherwise (nearly) synonymous (the 'paradigmatic' dimension).</Paragraph>
    <Paragraph position="6">  For encoding this information, we adopt the framework used in the lexicalization approach of the MOOSE sentence generator [Stede 1996]. Here, lexicon entries consist of (inter alia) the three zones denotation, partial SemSpec (PSemSpec), and stylistic features. The denotation is the part to be matched (qua subsumption) against the input representation; it may contain type restrictions. The PSemSpec is an SPL-like template that includes a :lex annotation with the actual lexeme and possibly variables that are replaced by other PSemSpecs in the course of the lexicalization process. Also, any realization directives needed by the front-end generator are stated here. Stylistic features are used for preferential choice between words that would all be applicable in a particular context. When generalizing this framework to include discourse markers (and hence allowing for producing complex sentences), the denotation of a marker would be an RST relation 2 with variables for the relata, possibly enriched with type constraints. For relations with a nucleus and a satellite, we always write them in this order, hence (RELATION NUCLEUS SATELLITE). As a simple case, consider the subordinating conjunction until, which we take to be a marker of the relation UNTIL 3, a straightforward case indeed. Its denotation is (UNTIL X (STATE Y)), meaning that it can be used to verbalize any UNTIL node whose satellite is of type STATE, according to the ontology or domain model in the knowledge base.</Paragraph>
    <Paragraph position="7"> The variables used in the denotation also appear in the PSemSpec of until, so that partial SemSpecs can be combined correctly. Here, the nucleus of the UNTIL relation becomes the domain of the rst-until relation as defined in the KPML Upper Model, 4 and the satellite is mapped to range, which we further constrain to be a relational-process (in Upper Model terms). Furthermore, we add :theme X to ensure that the nucleus is ordered before the satellite (to avoid until Y, X). The complete lexicon entry together with a few more examples is given in figure 3. Figure 3: The denotations and PSemSpecs for the subordinating conjunctions until marking UNTIL and after, if, then, unless marking PRECONDITION, and for the preposition with in its function as marker for the relation PRECONDITION.</Paragraph>
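    <Paragraph> The matching of a denotation such as (UNTIL X (STATE Y)) against an input node, followed by filling the variables of the PSemSpec, can be sketched as follows. This is a minimal Python illustration, not the LOOM/MOOSE implementation: the toy ontology, the node format, the identifiers wait-1 and engine-cool-2, and the functions subsumes, match, instantiate and render are all assumptions made for the example, and the relational-process constraint on the range is left out.

```python
# Toy ontology (assumed): each type points to its parent type.
ONTOLOGY = {
    "cool-state": "STATE",
    "wait-activity": "ACTIVITY",
    "STATE": "SITUATION",
    "ACTIVITY": "SITUATION",
}


def subsumes(general, specific):
    """True if `general` equals `specific` or is one of its ancestors."""
    while specific is not None:
        if specific == general:
            return True
        specific = ONTOLOGY.get(specific)
    return False


# Entry for "until": denotation (UNTIL X (STATE Y)) plus a PSemSpec template.
UNTIL_ENTRY = {
    "denotation": ("UNTIL", "X", ("STATE", "Y")),
    "psemspec": {"type": "rst-until", "lex": "until",
                 "domain": "X", "range": "Y", "theme": "X"},
}


def match(denotation, node):
    """Return variable bindings if the node fits the denotation, else None."""
    relation, nucleus_pattern, satellite_pattern = denotation
    if relation != node["relation"]:
        return None
    bindings = {}
    pairs = ((nucleus_pattern, node["nucleus"]),
             (satellite_pattern, node["satellite"]))
    for pattern, part in pairs:
        if isinstance(pattern, tuple):      # typed variable, e.g. (STATE Y)
            required_type, variable = pattern
            if not subsumes(required_type, part["type"]):
                return None
        else:                               # untyped variable, e.g. X
            variable = pattern
        bindings[variable] = part["id"]
    return bindings


def instantiate(psemspec, bindings):
    """Replace variables in the template by the bound node identifiers."""
    return {role: bindings.get(value, value) for role, value in psemspec.items()}


def render(spec):
    """Print the filled template in an SPL-like notation."""
    roles = " ".join(":{} {}".format(role, value)
                     for role, value in spec.items() if role != "type")
    return "(r / {} {})".format(spec["type"], roles)
```

    For segments 1 and 2 of the sample text (Wait until the engine is cool), the satellite is a state, so the entry applies; a satellite typed as an activity would be rejected by the type restriction.</Paragraph>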
    <Section position="1" start_page="133" end_page="135" type="sub_section">
      <SectionTitle>
5.2 Marker choice
</SectionTitle>
      <Paragraph position="0"> In MOOSE, lexical options constitute the search space for building a lexicalized semantic sentence specification. Now, we generalize this idea to discourse trees: For propositional nodes MOOSE calculates all possible lexical options; for coherence relation nodes, the list of options realizing the node is taken from the discourse marker lexicon by matching the node against the applicability conditions of the lexicon entries. Thus, the entire discourse tree is annotated with verbalization options, which together constitute the search space for sentence planning.</Paragraph>
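      <Paragraph> Annotating the discourse tree with verbalization options can be sketched as a simple recursive walk. The Python below is a hedged illustration: the tree format, the function annotate and the table MARKERS_BY_RELATION are hypothetical, the leaf options stand in for what a lexicalizer such as MOOSE would compute, and matching is reduced to a lookup by relation name rather than full subsumption.

```python
# Markers indexed by the relation in their applicability condition.
MARKERS_BY_RELATION = {
    "PRECONDITION": ["after", "if", "only if", "then", "unless", "when", "with"],
    "UNTIL": ["until"],
}


def annotate(node, lexical_options):
    """Attach verbalization options to every node of the discourse tree.

    Relation nodes receive the markers whose applicability condition names
    the relation; leaf nodes receive the lexical options supplied for their
    proposition id (a stand-in for the lexicalizer's output).
    """
    if "relation" in node:
        node["options"] = list(MARKERS_BY_RELATION.get(node["relation"], []))
        annotate(node["nucleus"], lexical_options)
        annotate(node["satellite"], lexical_options)
    else:
        node["options"] = list(lexical_options.get(node["id"], []))
    return node


# Propositions 14 and 15 of the sample text, joined by PRECONDITION.
tree = {
    "relation": "PRECONDITION",
    "nucleus": {"id": 14},
    "satellite": {"id": 15},
}
annotated = annotate(tree, {14: ["fill"], 15: ["be cool", "cool down"]})
```

      The annotated tree then carries the whole search space: seven candidate markers at the relation node and the assumed lexemes at the two leaves.</Paragraph>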
      <Paragraph position="1"> 2 The relations used in denotations effectively constitute the interface between the lexicon and the text planner producing the discourse tree. At present we use RST, but we regard this only as an interim solution. For the purposes of this paper, the precise inventory of relations used is not critical.</Paragraph>
      <Paragraph position="2">  To illustrate this approach, consider the propositions 14 and 15 in the sample text: Fill the reserve tank with the engine cold. Here, the PRECONDITION relation is signalled by the intraclausal linker with (see the lexicon entry above). Other realizations of this RS subtree are ([Vander Linden 1994] offers a similar range):  1. If the engine is cold, fill the reserve tank up to the max mark.</Paragraph>
      <Paragraph position="3"> 2. When the engine is cold, fill the reserve tank up to the max mark.</Paragraph>
      <Paragraph position="4"> 3. Fill the reserve tank up to the max mark, only if the engine is cold.</Paragraph>
      <Paragraph position="5"> 4. After the engine has cooled down, fill the reserve tank to the upper mark.</Paragraph>
      <Paragraph position="6"> 5. Do not fill the reserve tank (up to the max mark) unless the engine is cold.</Paragraph>
      <Paragraph position="7"> 6. Make sure that the engine is cold. Then, fill the reserve tank up to the max mark.</Paragraph>
      <Paragraph position="8">  To arrive at variant formulations of this kind, depending on different parameters and/or context, our first step is to set up the search space of verbalization options. While MOOSE performs this step for propositions, we will here focus on the coherence relation nodes. In our example, the marker lexicon yields a set of markers that match the applicability condition (PRECONDITION X Y): after, if, only if, then, unless, when and with. These are annotated at the node, as shown in figure 4, where the leaf nodes are annotated with (shorthands for) some of the lexical options found by MOOSE. From this search space, different decisions made by sentence planning &quot;expert&quot; modules lead to different verbalizations. For instance, assume that the sentence-structuring expert calls for a hypotactic structure; this is satisfied by PSemSpecs of the form: (r / rst-precondition :domain X :range Y), hence by the markers if, only if, unless and when. If the clause-ordering expert calls for the order satellite-nucleus, unless is ruled out as it requires the nucleus to be stated first (see the lexicon entry below). The remaining choice between only if, if and when is left to fine-grained discrimination (e.g., only if is more emphatic), which we do not elaborate here.</Paragraph>
      <Paragraph position="9"> Alternatively, assume that the sentence-delimitation expert posits that the relation be expressed in two separate sentences. As a consequence, the ordering is satellite-nucleus. These constraints are satisfied only by the marker then (example 6). The sequence of PSemSpecs associated with the marker then further constrains the other sentence planning decisions (see figure 3).</Paragraph>
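      <Paragraph> The way expert decisions narrow the set of candidate markers can be sketched as constraint filtering. The Python below is a hedged illustration only: the table MARKERS, the feature names structure and order, and the function filter_markers are hypothetical. The structure values and the restrictions on unless and then restate the discussion above, while the value both for if, only if and when is an assumption where the text is silent.

```python
# Combinability features of some PRECONDITION markers (partly assumed).
MARKERS = {
    "if":      {"structure": "hypotactic",    "order": "both"},
    "only if": {"structure": "hypotactic",    "order": "both"},
    "unless":  {"structure": "hypotactic",    "order": "nucleus-satellite"},
    "when":    {"structure": "hypotactic",    "order": "both"},
    "then":    {"structure": "two-sentences", "order": "satellite-nucleus"},
}


def filter_markers(candidates, structure=None, order=None):
    """Keep the candidates compatible with the decisions made so far.

    A constraint left as None has not been decided yet and is ignored, so
    the experts may post their decisions in any order.
    """
    kept = []
    for marker in candidates:
        features = MARKERS[marker]
        if structure is not None and features["structure"] != structure:
            continue
        if order is not None and features["order"] not in ("both", order):
            continue
        kept.append(marker)
    return kept
```

      Because each constraint only removes candidates, applying the structure decision before the ordering decision, or the other way round, leaves the same set, which is the property that a flexible order of decision-making relies on.</Paragraph>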
      <Paragraph position="10"> Now, it might also be the case that the lexicalization expert (e.g., MOOSE) calls for verbalizing the result of the cooling process (proposition [15]) only and proposes the lexeme be cool. In that case, the marker after is out, as it requires the satellite to be realized as a subordinate clause with a process of type activity (see lexicon entry of AFTER in figure 3). Alternatively, if the lexeme chosen is cool down, markers such as with are not available, as its PSemSpec allows for combining with a property-ascription only. Finally, if some other expert decides to use a negation with the nucleus, unless is selected as marker since it expects a negative polarity in the nucleus; its (partial) lexicon entry is shown in figure 3.</Paragraph>
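      <Paragraph> The interaction between lexeme choice, polarity and marker choice can be made concrete with a small check of combinability conditions. The following Python sketch rests on assumptions: the tables PROCESS_TYPE and REQUIREMENTS and the function available are hypothetical, the process-type requirements of after and with and the negative polarity expected by unless restate the paragraph above, and the positive-polarity requirement on the other markers is assumed here so that the meaning of the relation is preserved.

```python
# Process type contributed by each candidate lexeme for proposition 15.
PROCESS_TYPE = {"be cool": "property-ascription", "cool down": "activity"}

# Combinability conditions per marker; None means "no restriction".
REQUIREMENTS = {
    "after":  {"satellite_process": "activity",
               "nucleus_polarity": "positive"},
    "if":     {"satellite_process": None,
               "nucleus_polarity": "positive"},
    "unless": {"satellite_process": None,
               "nucleus_polarity": "negative"},
    "with":   {"satellite_process": "property-ascription",
               "nucleus_polarity": "positive"},
}


def available(satellite_lexeme, nucleus_polarity):
    """Return the markers compatible with the lexeme and polarity chosen."""
    process = PROCESS_TYPE[satellite_lexeme]
    kept = []
    for marker, needs in REQUIREMENTS.items():
        if needs["satellite_process"] not in (None, process):
            continue
        if needs["nucleus_polarity"] != nucleus_polarity:
            continue
        kept.append(marker)
    return kept
```

      Under these assumptions, choosing be cool rules out after, choosing cool down rules out with, and negating the nucleus leaves unless as the only candidate.</Paragraph>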
      <Paragraph position="11"> Selecting unless in turn restricts the options for other sentence planning tasks, since its PSemSpec states that a hypotactic structure with the subordinate clause in sentence-final position is needed (due to the :theme X line). In short, decisions can be propagated in both directions: from other formative decisions to marker choice, and from marker choice to other decisions. Imagine that the process of tree linearization be driven by the overall goal of producing concise text; in this case, the flexibility in ordering decisions allows for producing short text by choosing with and letting the other decisions follow.</Paragraph>
      <Paragraph position="12"> We have characterized a constraint-based mechanism that does not impose a strict order on making decisions in linearizing the discourse tree. Various ways of implementing such a scheme can be imagined; one is the blackboard-based approach suggested by Wanner and Hovy [1996], another is the &quot;Hunter-Gatherer&quot; search paradigm introduced by Beale [1996].</Paragraph>
    </Section>
  </Section>
</Paper>