<?xml version="1.0" standalone="yes"?> <Paper uid="C88-2149"> <Title>Issues in Word Choice</Title> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> 2. Overview of FIG </SectionTitle>
<Paragraph position="0"> Before discussing the issues, I briefly present &quot;FIG,&quot; my generator. This is necessary because FIG handles many issues in ways which are not discussed elsewhere in the literature.</Paragraph>
<Paragraph position="1"> FIG, short for &quot;Flexible Incremental Generator,&quot; was designed to be useful for both machine translation and cognitive modeling. It is based on the idea that speaking is a process of choosing words one after another. It has been incorporated into a prototype Japanese-to-English machine translation system. An example of its output is: (1) &quot;One day the old man went to the hills to gather wood, and the ...&quot; [Footnote: Thanks to Terry Regier, Dan Jurafsky, and Robert Wilensky. This work was supported in part by a Sloan Foundation grant to the Berkeley Cognitive Science Program and by the Defense Advanced Research Projects Agency (DoD), Arpa Order No. 4871, monitored by Space and Naval Warfare Systems Command under Contract N00039-]</Paragraph>
<Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> Representation Characteristics: </SectionTitle>
<Paragraph position="0"> A single semantic network represents world knowledge and language knowledge. FIG uses a variant of Cognitive Representation Theory /Wilensky 1987/. The key characteristic for generation is that this representation is a semantic network which includes language knowledge, after Jacobs /Jacobs 1985b/. In particular, the network includes nodes for concepts, words, syntactic features, constructions, and constituents of constructions. (Node names are henceforth set in bold and preceded by a single quote.) The links among nodes represent associations in world knowledge and language knowledge. In particular, there are links from concepts to words that express them.</Paragraph>
<Paragraph position="1"> The energy level of a node represents its relevance at each point in time. A &quot;relevant&quot; word is one which could form part of the output, a &quot;relevant&quot; construction is one which could provide an appropriate structure for the output, and a &quot;relevant&quot; concept is one which is associated with the meaning to express. So that activation levels represent the current relevance of nodes, there is an update mechanism. After a word is output, this mechanism: zeroes the energy of the word just emitted, zeroes the energy of that portion of the input which has been conveyed, and, for each construction, zeroes the energy of constituents which have been completed. Energy flow across links represents evidence for the relevance of a node. The energy level at each node is given by the sum of the energies reaching it from other nodes.</Paragraph>
<Paragraph position="2"> To see how FIG chooses a word, suppose the input includes nodes like 'woman, 'old, 'live, and 'day, and that syntactic considerations are currently activating verbs. Then '&quot;live&quot; will have the highest activation: it will have more energy than any other verb, since it also receives energy from the input, and more energy than any other word suggested by the input, since it also receives energy from 'verb. Thus, FIG will emit &quot;live&quot; next. One can say that FIG is equally syntax-directed and semantics-directed.</Paragraph>
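To make the selection step concrete, here is a minimal sketch in Python of choice by activation summation. The network, the node names, and the weights are all invented for illustration; FIG's actual knowledge base and activation algorithm are richer than this.

    # A toy network: source node -> {target node: link weight}.
    # Word nodes are written with embedded quotes, e.g. '"live"'.
    links = {
        "woman": {'"woman"': 1.0},
        "old":   {'"old"': 1.0},
        "live":  {'"live"': 1.0},
        "day":   {'"day"': 1.0},
        "verb":  {'"live"': 0.5, '"go"': 0.5},  # a feature activates all verbs
    }
    words = {'"woman"', '"old"', '"live"', '"day"', '"go"'}

    def energies(active):
        """Each node's energy is the sum of the energies reaching it."""
        e = {}
        for node, act in active.items():
            for target, weight in links.get(node, {}).items():
                e[target] = e.get(target, 0.0) + act * weight
        return e

    # The input nodes, plus the currently active syntactic feature 'verb.
    state = {"woman": 1.0, "old": 1.0, "live": 1.0, "day": 1.0, "verb": 1.0}
    e = energies(state)
    print(max(words & e.keys(), key=e.get))  # '"live"' wins: input plus 'verb
    state["live"] = 0.0  # the update mechanism then zeroes the conveyed input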
<Paragraph position="3"> This brief discussion omits aspects of FIG of no direct relevance to word choice. Much more could be said about the exact activation algorithm, the representation of constructions, the role of link weights, the use of instantiation for utterances involving more than one occurrence of a word, and so on.</Paragraph> </Section> </Section>
<Section position="3" start_page="0" end_page="727" type="metho"> <SectionTitle> 3. Basic Issues </SectionTitle>
<Paragraph position="0"> Each of the following points is illustrated by examples of output which a generator should not produce.</Paragraph>
<Paragraph position="1"> Issue 1: How are appropriate words chosen? A generator must choose words appropriate for the input that it is called on to express. There is not much interesting to say about the simple case, in which a word is &quot;appropriate&quot; if it refers to some &quot;concept&quot; of the input meaning. This simple case can be handled with a dictionary mechanism to look up the word for a concept, or with a mechanism to traverse the link from a node to the word.</Paragraph>
<Paragraph position="2"> However, &quot;appropriateness&quot; is not always so simple. Three complications are discussed at length below, but first, it should be noted that many researchers avoid these complications. They do this by considering them to be problems of &quot;concept choice,&quot; not word choice. This leads Thompson, for example, to postulate a pre-processor, a &quot;strategic component,&quot; whose output only includes concepts which map easily to words /Thompson 1977/.</Paragraph>
<Paragraph position="3"> Complication 1: The relation between a word and the input can be complex. For example, diverse facts about the input rule out: (2) &quot;drink soup&quot; if the soup is eaten with a spoon rather than by sipping from the bowl (versus &quot;eat soup&quot;) (3) &quot;she went to the river&quot; if it was narrow, fast-moving, low-volume, etc. (versus &quot;she went to the stream&quot;) (4) &quot;he went to the hills&quot; if the distance traveled was short, and he planned to stay in the hills for a while and move around there (versus &quot;he went into the hills&quot;) (5) &quot;he met her in the bus&quot; if the bus was running in scheduled service /Fillmore 1985/ (versus &quot;he met her on the bus&quot;) How can a generator choose words which depend on more than one element of the input? There are several answers: Goldman /Goldman 1974/ analyzes words with complex meanings as having a core meaning plus conditions on use. For example, he considers INGEST to be the core meaning of the word &quot;drink.&quot; This reduces the problem of word choice to the problem of choosing among the various words associated with a core element. Goldman's BABEL chooses by testing nearby nodes. For example, for INGEST it tests whether the object of ingestion is liquid in order to decide whether to use &quot;drink.&quot; As Danlos points out, the organization of tests into discrimination networks &quot;is bound to be arbitrary&quot; /Danlos 1987/. After finding candidate words (explained below), Hovy's PAULINE /Hovy 1987/ matches the meaning of the word to the input to determine if the word is appropriate.</Paragraph>
<Paragraph position="4"> In FIG words with complex meanings are simply suggested by more than one factor. For example, if the input includes nodes like 'liquid and 'ingest, then '&quot;drink&quot; receives activation from both of them. This gives it high cumulative activation, which makes it likely to be chosen.</Paragraph>
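In the same sketch style as before (all nodes and weights are invented), the cumulative treatment of Complication 1 needs no decision tree; a word with a complex meaning just has links from several nodes:

    # A word with a complex meaning is linked from several nodes, so it
    # accumulates evidence; no ordered discrimination tests are needed.
    links = {
        "ingest": {'"eat"': 1.0, '"drink"': 1.0},
        "liquid": {'"drink"': 1.0},
        "solid":  {'"eat"': 1.0},
    }

    def choose(input_nodes):
        e = {}
        for node in input_nodes:
            for word, weight in links.get(node, {}).items():
                e[word] = e.get(word, 0.0) + weight
        return max(e, key=e.get)

    print(choose({"ingest", "liquid"}))  # '"drink"' -- two sources of energy
    print(choose({"ingest", "solid"}))   # '"eat"'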
<Paragraph position="5"> Complication 2: The relation between a word and the input can be tenuous. For example, the words of a paraphrase can be &quot;appropriate&quot; even if they do not directly correspond to any element of the input. Hovy gives the example (6) &quot;In the primary on 20 February, Carter got 20515 votes. Kennedy got 21850.&quot; and comments, &quot;if we want good text from our generators, we have to give them the ability to recognize that 'beat' or 'lose' or 'narrow lead' can be used instead of just the straightforward sentences.&quot; How can a generator choose such words? Hovy's PAULINE finds a set of &quot;candidate&quot; topics by considering concepts related to the input nodes and also concepts which serve rhetorical goals. These topics then map to words.</Paragraph>
<Paragraph position="6"> Jacobs' KING /Jacobs 1985b/ &quot;searches&quot; through world knowledge to find words. The search process only crosses links of certain types, which ensures that it only reaches words with equivalent meaning, such as &quot;buy&quot; for commercial-transaction.</Paragraph>
<Paragraph position="7"> In FIG a word can receive energy via the links of world knowledge, even if it is not directly linked to a node in the input. This is simply a case of priming by memory associations. Activation attenuates every time it crosses a link, which ensures a bias in favor of words which are &quot;nearer&quot; to the nodes of the input.</Paragraph>
<Paragraph position="8"> Complication 3: The input to a generator can include more than meaning. Consider: (7) The stream was the place where the old woman went.</Paragraph>
<Paragraph position="9"> This utterance is strange, unless the stream is to be highlighted. In general, word choice can depend on the relative importance of the portions of the input and on the way the input is &quot;framed&quot; /Fillmore 1985/. How can these factors affect word choice? No existing generator seems to consider these factors when choosing words. However, certain architectures seem more open to such factors. Generators with &quot;open&quot; architectures include Jacobs' PHRED /Jacobs 1985a/, which allows hashing on any factor for lexical access; and FIG, in which any factor can be a source of activation.</Paragraph>
<Paragraph position="10"> Issue 2: How is conciseness ensured? A generator should not produce (8) &quot;a peach located at the surface of the water and supported by the water&quot; (versus &quot;a floating peach&quot;). KING's knowledge consists of a taxonomy of concepts /Jacobs 1985b/, so it can simply choose the most &quot;specific&quot; word. Hovy's PAULINE chooses the word whose meaning configuration is &quot;largest,&quot; that is, the one whose meaning subsumes as much of the input as possible.</Paragraph>
<Paragraph position="11"> FIG handles this rule without additional mechanism: words with &quot;large&quot; meanings become highly activated simply because they get energy from many nodes of the input. Thus, FIG has an intrinsic bias to use the most specific word possible. For example, if nodes like 'verb, 'motion, 'transitive-action, and 'initially-scattered are activated, then energy spreads to '&quot;get&quot; and to '&quot;gather&quot;. However, &quot;gather&quot; gets rated as more appropriate, since it receives energy from one more source than '&quot;get&quot; does, namely from 'initially-scattered.</Paragraph>
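A sketch of the attenuated spreading that underlies Complication 2: with a decay factor on each link crossing, a paraphrase word such as &quot;beat&quot; can be primed through world knowledge while directly linked words keep an advantage. The network, decay rate, and hop bound are all illustrative.

    # Multi-hop spreading with attenuation: each link crossing keeps only
    # a fraction (DECAY) of the energy, biasing choice toward "nearer" words.
    links = {
        "vote-totals": {'"votes"': 1.0, "defeat": 1.0},  # world knowledge
        "defeat":      {'"beat"': 1.0},
    }
    DECAY = 0.6

    def spread(sources, hops=3):
        e = dict(sources)
        frontier = dict(sources)
        for _ in range(hops):
            nxt = {}
            for node, act in frontier.items():
                for target, weight in links.get(node, {}).items():
                    gain = act * weight * DECAY
                    e[target] = e.get(target, 0.0) + gain
                    nxt[target] = nxt.get(target, 0.0) + gain
            frontier = nxt
        return e

    e = spread({"vote-totals": 1.0})
    print(e['"votes"'])  # 0.6  -- one link from the input
    print(e['"beat"'])   # 0.36 -- two links away: primed, but attenuated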
<Paragraph position="12"> Issue 3: When does choice stop? This question can be stated more specifically as &quot;when does the generator stop saying things about some topic?&quot; The basic problem is avoiding redundancy.</Paragraph>
<Paragraph position="13"> (9) She saw a peach floating in the stream, being moved by the current, and moving downstream.</Paragraph>
<Paragraph position="14"> This utterance is redundant in that the information given by the last two phrases is inferrable from the first clause. It should be noted that many researchers avoid this issue by assigning it to a pre-processor. This allows a generator to simply express all the nodes or propositions present in its input -- implicitly preserving the amount of information. FIG models inferrability with a simplified version of Norvig's marker-passing scheme /Norvig 1987/. Each time it chooses a word it &quot;marks&quot; the parts of the input which the reader can now infer. For example, after the words &quot;gather&quot; and &quot;wood&quot; are emitted it marks the 'gather-firewood node, representing the fact that that script has been conveyed. Only the unmarked input, representing the information that still needs to be said, is a source of activation. FIG terminates when it has marked all of the input.</Paragraph>
<Paragraph position="15"> Issue 4: How are patterns of lexicalization respected? A generator must prefer words which belong to the lexicalization patterns of the target language and genre. This issue has not yet been discussed in the generation literature, so I illustrate it with examples of output which violate lexicalization patterns.</Paragraph>
<Paragraph position="16"> (10) &quot;he entered the cellar running&quot; (versus &quot;he ran into the cellar&quot;) There is a general preference to conflate motion and manner into the verb /Talmy 1975/. (11) &quot;his reliance on it was excessive&quot; (versus &quot;he relied on it too much&quot;) Actions are better expressed as verbs than as nominalizations, other things being equal. In general, there is a preference to use words which are of the correct part of speech for a given semantic need.</Paragraph>
<Paragraph position="17"> (12) &quot;he has stood up&quot; (versus &quot;he is standing&quot;) States are best expressed by describing them, rather than by using the cause or the onset metonymically /Talmy 1985/. (13) &quot;let's eat at a restaurant&quot; if the context is &quot;what shall we do now?&quot; (versus &quot;let's go to a restaurant&quot;) Complex actions are best expressed by mentioning the onset. (14) &quot;an old person went to the stream and found a fruit&quot; (versus &quot;an old woman went to the stream and found a peach&quot;) There is a preference to use basic-level words and sex-specific words. No existing generator handles patterns of lexicalization. One possible approach would be to use special procedures: to &quot;carve up reality,&quot; for example to specify which information to conflate into a word; and to specify which aspects of a situation to encode, for example, which word to use for a metonymy. Within the FIG framework there are other possible solutions. There could be special nodes like 'words-conflating-motion-and-manner to give energy to appropriate words, or the relative densities of knowledge about certain concepts could felicitously cause choice of basic-level words.</Paragraph>
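Returning to Issue 3, here is a minimal sketch of the marker-passing bookkeeping, in the style of the earlier fragments. The inference table and node names are invented; FIG's scheme (after Norvig) is more general.

    # Marking conveyed input: once certain words are out, a reader can
    # infer certain nodes, so those nodes stop being sources of activation.
    input_nodes = {"old-man", "hills", "gather-firewood"}
    # Which input node each word conveys directly (invented).
    expresses = {"man": "old-man", "hills": "hills"}
    # What becomes inferrable once a set of words has been emitted.
    inferences = {frozenset(["gather", "wood"]): "gather-firewood"}

    marked = set()
    emitted = []
    for word in ["man", "hills", "gather", "wood"]:
        emitted.append(word)
        if word in expresses:
            marked.add(expresses[word])
        for trigger, node in inferences.items():
            if trigger <= set(emitted):
                marked.add(node)  # the script has now been conveyed
        # only the unmarked input remains a source of activation
        print(word, "->", sorted(input_nodes - marked))
    # generation terminates once input_nodes - marked is empty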
<Paragraph position="18"> Issue 5: How are interactions among choices handled? A generator must not, for example, violate collocations: (15) &quot;high air currents&quot; (versus &quot;strong air currents,&quot; yet &quot;high winds&quot;). The problem here is that the choice of an adjective can depend on the noun chosen.</Paragraph>
<Paragraph position="19"> The standard way to handle such things is to order choices. For example, heads are chosen first so they can constrain the choice of modifier. Usually the order of choices is fixed by the basic algorithm of the generator. For example, syntax-driven generators choose words in the order that they expand and traverse the syntax tree, and data-driven generators choose words in the order that they traverse the input /McDonald 1983/.</Paragraph>
<Paragraph position="20"> In FIG there is no need to order choices. This is because the mere possibility of using a word can affect other choices. For example, if '&quot;winds&quot; seems relevant it will have energy, and this energy will spread to '&quot;high&quot;. (Recall that the network has links between associated words.) Other things being equal, such energy will make '&quot;high&quot; more activated than words such as '&quot;strong&quot; or '&quot;fast&quot;. Thus FIG will produce &quot;high winds&quot; but &quot;strong air currents.&quot;</Paragraph> </Section>
<Section position="4" start_page="727" end_page="728" type="metho"> <SectionTitle> 4. Syntactic Issues </SectionTitle>
<Paragraph position="0"> It makes no sense to choose words without regard to syntax. This section discusses some interactions of syntax and word choice. But first, I briefly sketch the syntactic theory which underlies FIG's treatment of grammar.</Paragraph>
<Paragraph position="1"> Construction Grammar is a theory of syntax currently being developed at Berkeley. Construction Grammar &quot;aims at describing the grammar of a language directly, in terms of a collection of grammatical constructions&quot; /Fillmore 1987/. Each construction represents a pairing of a syntactic pattern with a meaning structure. Construction Grammar differs from most theories of language in accounting for the structure of complex grammatical patterns, such as lexically-headed constructions /Fillmore, Kay and O'Connor forthcoming/, rather than focusing on core syntax. It also differs in stressing the dependence of language on other aspects of cognition /Lakoff 1987/.</Paragraph>
<Paragraph position="2"> A construction has &quot;external syntax,&quot; which describes where and when it is appropriate, and &quot;internal syntax,&quot; which describes its constituency structure. Consider, for example, the Existential There Construction /Lakoff 1987/, as in &quot;once upon a time there lived an old man&quot;. Two facts about the external syntax of this construction are that it is used to introduce people or things into a scene, and that it over-rides the normal subject-predicate ordering. The internal syntax of the Existential There Construction includes three constituents, roughly the word &quot;there,&quot; a verb, and a noun, in that order.</Paragraph>
<Paragraph position="3"> Since Construction Grammar is based on declarative constructions rather than procedural rules, it is well suited to implementation with a network. In FIG constructions and their constituents are nodes of the network.</Paragraph>
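As a data-structure sketch (the field names and the abridged feature inventory are my own, not FIG's), a construction and its constituents might be encoded like this:

    # The Existential There Construction as declarative network structure:
    # ordered constituent nodes, each linked to a word or a feature node.
    constructions = {
        "ex-there": {
            "external": "introduces a person or thing into a scene",
            "constituents": ['"there"', "verb", "noun"],  # internal syntax
        }
    }
    # Feature nodes link onward to all words of that category (abridged).
    feature_links = {"verb": ['"lived"', '"went"'], "noun": ['"man"', '"peach"']}

    # Activation flows construction -> constituent -> feature -> words.
    for constituent in constructions["ex-there"]["constituents"]:
        reachable = feature_links.get(constituent, [constituent])
        print(constituent, "->", reachable)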
<Paragraph position="4"> Syntactic Issue 1: How are the correct parts of speech chosen? For example, a generator must avoid output like (16) &quot;When she got to the stream, her saw a peach which was float there&quot; (versus &quot;she saw a peach which was floating there&quot;). Syntax-driven generators typically handle this issue by setting up constraints and then finding a word that satisfies them. To use an old term, these generators do &quot;lexical insertion.&quot; Syntactic constraints can be manipulated in several ways. For example, a top-down generator accumulates constraints as it works down the tree, and these govern word choice at the leaves of the tree.</Paragraph>
<Paragraph position="5"> In FIG constructions are linked to syntactic features which describe the syntactic characteristics of constituents. This allows activation to flow from constructions to features, and thence to words linked to those features. For example, suppose that 'ex-there, the node for the Existential There Construction, is activated. Energy will spread from 'ex-there to the feature 'verb, and from there to all verbs.</Paragraph>
<Paragraph position="6"> Syntactic Issue 2: How is word order determined? Word order is not usually treated as a separate issue. This is because most generators handle it implicitly, as they follow through on syntactic choices. They do this by variously expanding trees, traversing networks, or matching templates.</Paragraph>
<Paragraph position="7"> Appelt took a different approach: his planning-based generator manipulated word order explicitly /Appelt 1985/.</Paragraph>
<Paragraph position="8"> In FIG word order is determined by the activation levels of the various constituents of constructions. The update mechanism ensures that the activation level of each constituent correctly reflects the current syntactic state. Suppose, for example, that FIG has already emitted &quot;Once upon a time, there&quot;. Next it should emit a verb, according to the Existential There Construction. This is represented by having the second constituent of 'ex-there be highly activated at this time.</Paragraph>
<Paragraph position="9"> Energy flows from the second constituent to the feature 'verb, and from there to all verbs. Thus the activation levels of constituents help determine what word gets chosen and emitted next. This suffices to produce correct word order. In effect, constructions shunt energy to words which should appear early in the output.</Paragraph>
<Paragraph position="10"> Syntactic Issue 3: How are words chosen to satisfy constituency? Constructions have constituency and words have valence, which a generator must respect. (17) &quot;The woman went to the stream. When got to, she saw, to her surprise, an enormous peach.&quot; is bad because verbs require subjects and because &quot;got to&quot; requires a destination. This issue is complicated by the existence of optional constituents. For example, consider the noun-phrase construction. Information relevant to an object can often be expressed with an adjective, so that option must be available. But if there is no appropriate information the adjective option must be passed up. The general problem of constituency can be stated as: in what way does syntax affect the decision to use a word or not?</Paragraph>
<Paragraph position="11"> Syntax-driven generators such as PENMAN /Mann 1983/ handle constituency in their basic algorithm. The syntactic structure is determined before word choice is done. A common way to handle optional constituents is by augmenting the grammar with specifications of how to test the input. The results of these tests determine whether or not to use an optional constituent. This of course requires a special mechanism to execute these tests.</Paragraph>
<Paragraph position="12"> FIG simply does not choose words for optional constituents unless they are appropriate. In FIG each word &quot;competes&quot; with every other word in the lexicon to be the most highly activated. In particular, each word is in competition with words which could come later in the utterance. This suffices. For example, suppose FIG has just emitted &quot;the&quot;, and, accordingly, 'noun-phrase's second constituent is highly activated and its third constituent is somewhat activated. From these constituents the feature 'adjective gets a lot of activation and the feature 'noun gets somewhat less activation. There are two cases: 1. If the input includes some information expressible with an adjective, then both adjective(s) and nouns will get energy from the input, but an adjective will probably be emitted, since 'adjective is activated more highly than 'noun.</Paragraph>
<Paragraph position="13"> 2. If there are no concepts which could be expressed with an adjective, then some noun will get energy both from the input and from 'noun, but any adjective will only have energy from one source, 'adjective. Thus a noun will probably be emitted next.</Paragraph>
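The two cases can be traced in a small sketch (the constituent activations and the two-word lexicon are invented):

    # After "the": the adjective slot is hot, the noun slot merely warm.
    feature_energy = {"adjective": 1.0, "noun": 0.5}
    # word -> (its syntactic feature, the concept it expresses)
    lexicon = {'"old"': ("adjective", "old"), '"man"': ("noun", "man")}

    def next_word(unconveyed_input):
        e = {}
        for word, (feature, concept) in lexicon.items():
            e[word] = feature_energy[feature]  # energy from the syntactic state
            if concept in unconveyed_input:
                e[word] += 1.0                 # energy from the input
        return max(e, key=e.get)

    print(next_word({"old", "man"}))  # case 1: '"old"' -- the hotter slot wins
    print(next_word({"man"}))         # case 2: '"man"' -- no adjective info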
<Paragraph position="14"> Syntactic Issue 4: What ensures that a word stands in the correct relation to its neighbors? A generator must not scramble words, as in (18) &quot;the green man went to the old hills&quot; where the adjectives are attached to the wrong nouns (versus &quot;the old man went to the green hills&quot;).</Paragraph>
<Paragraph position="15"> The most common solution is to use syntax-directed techniques, similar to those discussed under Syntactic Issue 3. The grammar typically specifies the location of information for dependent words. For example, the generator might always follow &quot;modified-by&quot; links to reach adjectives for a noun. A different formalism with the same effect is unification /Appelt 1983/.</Paragraph>
<Paragraph position="16"> Since the input to FIG is a structure of linked nodes, related concepts tend to be activated together. This means that FIG has, in effect, a &quot;focus of attention&quot; /Chafe 1980/. For example, if 'old-man37 is activated, then energy flows to related nodes, such as those encoding his appearance, location, and goals. Therefore at the time when 'man is highly activated (and probably &quot;man&quot; is about to be output) nearby nodes, like 'old, become highly activated.</Paragraph> </Section>
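A sketch of this focus-of-attention effect (the instance nodes and weights are invented): because properties are linked to the instance currently being expressed, the right adjective peaks at the right moment.

    # Properties are linked to instances, so whichever instance is
    # currently in focus also activates its own properties.
    links = {
        "old-man37":   {'"old"': 0.8, '"man"': 1.0},
        "green-hills": {'"green"': 0.8, '"hills"': 1.0},
    }

    def adjective_energies(instance_in_focus):
        """Energy reaching adjectives while one instance is in focus."""
        e = links[instance_in_focus]
        return {w: a for w, a in e.items() if w in ('"old"', '"green"')}

    print(adjective_energies("old-man37"))    # only '"old" is primed
    print(adjective_energies("green-hills"))  # only '"green" is primed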
<Section position="5" start_page="728" end_page="728" type="metho"> <SectionTitle> 5. Design Issues </SectionTitle>
<Paragraph position="0"> Thus, there are many issues in word choice. Their importance can be questioned -- after all, every existing generator ignores many of them, and yet generators have produced outputs which look quite good.</Paragraph>
<Paragraph position="1"> However, close analysis shows that this is only because the inputs have been tailored to determine a good sentence. In other words, most generators' inputs are English sentences in disguise. Such generators only have to do the amount of computation needed to retrieve the target sentence. For example, (to oversimplify) Goldman's BABEL /Goldman 1974/ really only had to choose among words with some common element of meaning; McDonald's MUMBLE /McDonald 1983/ really only had to choose among alternative parts of speech for expressing a node; and Mann's PENMAN /Mann 1983/ really only had to order words and choose syntactic options.</Paragraph>
<Paragraph position="2"> This section briefly discusses some issues in designing a generator that handles all the complexities of word choice.</Paragraph>
<Paragraph position="3"> Design Issue 1: In what order are the factors considered? As shown above, many factors can affect the decision to use a word. There are several ways to organize the factors.</Paragraph>
<Paragraph position="4"> Goldman's BABEL has tests organized into a discrimination network. This means it always performs tests in the same order. For example, given a conceptualization which includes INGEST, it always tests &quot;is the object a medicine&quot; before testing &quot;is the object a liquid.&quot; Another way to organize word choice is with a two-stage algorithm. For example, BABEL selects a primitive then discriminates; PAULINE gathers candidates then filters them for relevance; KING chooses associations to find a node then chooses among words for that node; and Thompson's model considers speaker's goals to produce an &quot;intention&quot; then consults syntax.</Paragraph>
<Paragraph position="5"> In FIG all factors contribute simultaneously.</Paragraph>
<Paragraph position="6"> Design Issue 2: Are all words chosen in the same way? Many generators choose different types of words differently. Commonly distinguished are open-class words and closed-class words /Pustejovsky and Nirenburg 1987/ or content words and function words /Kempen and Hoenkamp 1987/, phrase-heads and modifiers /Goldman 1974/, and words with valence and words without.</Paragraph>
<Paragraph position="7"> FIG has one uniform process for all types of word choice. Everything which affects word choice is just a source of energy. Of course it is true that different types of factors are more important for different types of choices. For example, energy from the nodes of the input is typically more important for open-class words than for closed-class words. However, this fact does not affect the structure of FIG.</Paragraph> </Section>
<Section position="6" start_page="728" end_page="730" type="metho"> <SectionTitle> 6. Design Principles </SectionTitle>
<Paragraph position="0"> FIG addresses all the above issues in word choice. It works, not because of the details of representation and energy flow, but because it embodies several design principles. This section states these principles as general maxims for generator design.</Paragraph>
<Paragraph position="1"> Design Principle 1: Have an explicit representation of the status of the generation process at each point in time.</Paragraph>
<Paragraph position="2"> FIG has a complete and explicit representation of the state, syntactic and semantic, at each moment of the generation process. This representation consists of the activation levels of many concepts, syntactic constructions, and words. This represents which factors and choices are relevant; in other words, it constitutes the &quot;working memory&quot; of the generator. This representation makes all relevant information available for each successive decision to use a word.</Paragraph>
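In the terms of the earlier sketches, the entire state is just the activation table, so the &quot;working memory&quot; can be inspected between emissions (the nodes and values shown are invented):

    # The whole generation state is one table of activation levels,
    # covering concepts, constructions, features, and words alike.
    state = {"old-man37": 1.0, "noun-phrase": 0.7, "adjective": 0.6,
             '"old"': 0.9, '"man"': 0.8, "verb": 0.1}

    def working_memory(state, k=3):
        """The k most relevant nodes at this moment."""
        return sorted(state.items(), key=lambda item: -item[1])[:k]

    print(working_memory(state))  # inspectable before every word choice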
<Paragraph position="3"> This contrasts with generators in which information is implicit, for example in the current value of a pointer or in the parameters of a function call. This also contrasts with generation based on stages. A stage model partitions the factors in choice into sets. There is no clear motivation for such a partition. Moreover, use of a stage model limits the availability of different types of information to different times.</Paragraph>
<Paragraph position="4"> Design Principle 2: Use a single, unified representation.</Paragraph>
<Paragraph position="5"> FIG is &quot;unified&quot; in two senses: all knowledge is part of one network, and information propagates freely by means of spreading activation. Nodes for compatible choices, of all sorts, are linked and therefore mutually reinforcing. This implies that activation levels tend to converge (or &quot;settle&quot; or &quot;relax&quot;) into a state which represents a consistent set of choices.</Paragraph>
<Paragraph position="6"> This contrasts with modular generators. Modularity is surprisingly pervasive. Even generators which are unified in some respects are modular in other respects. The generators with uniform processing approaches, including Appelt's planning generator /Appelt 1985/ and Kalita's connectionist generator /Kalita and Shastri 1987/, employed levels of representation. Jacobs' KING exploited a uniform representation but relied on diverse algorithms and processes /Jacobs 1985b/. In addition, most generators partition knowledge into separate knowledge bases for the dictionary, world knowledge, and grammar rules.</Paragraph>
<Paragraph position="7"> The problem with modular design is that it does not support the flow of information between modules. This makes it hard for them to handle interactions between factors of different types. For example, the distinction between strategy and tactics requires an interface protocol between the two modules. This interface usually consists of a description of the information passed between the two. This information is variously called a &quot;message,&quot; &quot;meaning,&quot; &quot;content,&quot; or &quot;realization specification.&quot; Many have pointed out, however, that such a &quot;message&quot; can not contain enough information /Appelt 1985/ /Danlos 1984/ /Hovy 1987/. In particular, even seemingly mundane choices of words can be sensitive to the speaker's goals. The underlying problem is that researchers have partitioned the problem in order to study it, which is reasonable; but they have also imposed partitions on the designs for generators, which is unjustified.</Paragraph>
<Paragraph position="8"> Of course it is impossible to prove that modular designs are inadequate. They can always be augmented with special pathways and protocols for the flow of information among modules. However, it is not obvious that patchwork design is unavoidable.</Paragraph>
<Paragraph position="9"> Design Principle 3: Do not rely on the details of the structure of the input.</Paragraph>
<Paragraph position="10"> The input to FIG is a structure of linked, activated nodes. These nodes are the ultimate source of energy that drives the entire generation process. However, there is no simplistic correspondence between input and output. This contrasts with generators which are designed around a well-elaborated notion of the input.</Paragraph>
<Paragraph position="11"> Most generators use inputs which are tailored to make generation easy, which means that they cannot handle inputs which are not &quot;suitable.&quot; This constrains the concepts of the input to correspond to words of English in some fairly direct way. It also constrains the structure of the input to reflect the structure of English. It may constrain the input in other ways, for example, requiring the input to have a distinguished &quot;top&quot; node.</Paragraph>
<Paragraph position="12"> In contrast, FIG is free of the usual constraints on its input. FIG can easily emit words which are not directly related to the input, since choices are determined by spreading activation, which can come from diverse sources and follow long paths. Also, FIG builds up the structure of the output incrementally as a side effect of emitting words. The only constraint that FIG imposes is that the input support activation flow. Thus it is flexible in that it can handle a wide variety of inputs. This contrasts with the usual practice of fixing an input format and insisting that anyone desiring to use the generator conform or write a pre-processor. The advantages of flexible generation for machine translation are obvious.</Paragraph>
<Paragraph position="13"> Design Principle 4: Treat most choices as emergent.</Paragraph>
<Paragraph position="14"> FIG does not explicitly &quot;choose&quot; concepts or syntactic structures. Such choices are unnecessary. The only explicit choices needed are the successive choices of words.</Paragraph>
<Paragraph position="15"> The appearance of syntactic choice emerges from the fact that constructions affect the form of the utterance. An analyst can, of course, look at an utterance and think &quot;this exhibits the choice of construction X.&quot; However, FIG never actually explicitly chose X (although the node for X was probably highly activated and played an important role in the flow of activation). This contrasts with generators which explicitly make syntactic decisions, such as which template to use, which edge to traverse, how to order words, or whether to include or omit an optional constituent.</Paragraph>
<Paragraph position="16"> The appearance of concept choice emerges from the fact that words are associated with concepts, and so a word choice can imply the choice of a concept.</Paragraph>
<Paragraph position="17"> Choice among words is also emergent in FIG. For example, it never chooses between &quot;a&quot; and &quot;the.&quot; The fact that &quot;a&quot; and &quot;the&quot; are in complementary distribution in English is represented with an inhibit link between the nodes '&quot;a&quot; and '&quot;the&quot;. Thus, whenever one of these is activated the other receives negative energy. When generating, therefore, the network tends to settle into a state where one, but not both, of these nodes is highly activated. And thus typically only one of these words is selected. This is how FIG &quot;chooses&quot; between &quot;a&quot; and &quot;the,&quot; without treating them as explicit alternatives.</Paragraph>
<Paragraph position="18"> The problem with explicit choices is ordering them. It is hard, if not impossible, to fix an order such that no choice is made before a choice which it depends on /Danlos 1984/.</Paragraph>
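A last sketch, of the inhibit link between '&quot;a&quot; and '&quot;the&quot; described above: with a negative weight between the two nodes, repeated updating settles the network so that only one determiner stays active. The initial activations and the weight are invented.

    # A negative link makes two nodes suppress each other; after a few
    # update steps only the initially stronger one remains active.
    INHIBIT = -0.8
    nodes = {'"a"': 0.55, '"the"': 0.45}  # slight initial preference for "a"

    def settle(nodes, steps=20):
        for _ in range(steps):
            a, the = nodes['"a"'], nodes['"the"']
            # each determiner receives negative energy from the other
            nodes['"a"']   = max(0.0, a + INHIBIT * the)
            nodes['"the"'] = max(0.0, the + INHIBIT * a)
        return nodes

    print(settle(nodes))  # '"a" survives; '"the" is driven to zero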
<Paragraph position="19"> At this point I should acknowledge how subversive this approach really is. My guiding principle has been &quot;word choice suffices.&quot; Intuitively, if every word choice is appropriate, then the whole utterance will be appropriate, by induction. Therefore it seems reasonable to study syntax and meaning in generation by focusing on the ways they affect word choice.</Paragraph>
<Paragraph position="20"> This contrasts starkly with most generation research, which seems to assume that &quot;syntax constrains the problem of generation so well that word choice should be treated as an afterthought.&quot; In particular, the principle of emergent choice allows one to dispense with some things that generators are usually supposed to do. First, FIG does not produce a parse tree for a sentence while generating. I prefer to think of constructions as existing in the generator during the production of a sentence rather than existing in the resulting utterance. In FIG many constructions are simultaneously active during production, with no mechanism other than spreading activation to unify or coordinate them.</Paragraph>
<Paragraph position="21"> Second, FIG is not guaranteed to produce only grammatical utterances.</Paragraph>
<Paragraph position="22"> I contend that grammaticality has been overemphasized. Output which is grammatically correct is not necessarily more understandable than fragmented, ungrammatical output.</Paragraph> </Section> </Paper>