<?xml version="1.0" standalone="yes"?>
<Paper uid="E85-1027">
  <Title>A Computational Theory of Prose Style for Natural Language Generation</Title>
  <Section position="4" start_page="189" end_page="189" type="metho">
    <SectionTitle>
4. An Example
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="189" end_page="189" type="sub_section">
      <SectionTitle>
4.1 Underlying representation
</SectionTitle>
      <Paragraph position="0"> At the present time we are representing the information about a tribe in a frame language known as ARLO \[Haase 1984\], which is a CommonLisp implementation of RLL. We have no stock in this representation per se, nor, for that matter, in the specific details of the frames we have built (though we are fairly pleased with both); our system has worked from other representations in the past and we expect to work with still others in the future. Rather, this choice provides us with an expeditious, non-linguistic source for the articles, which has the characteristics we expect of modern representations. Figure 2 shows the toplevel ARLO frame for the Ashanti and one of its subframes.</Paragraph>
      <Paragraph position="1">  Given this representation, it is a straightforward matter to define a fixed script that can serve as the message-level source for the paragraphs. We simply list the slots that contain the desired information. 3  2 At present &quot;preference&quot; is defined by sorting candidate point-choice pairs against the rules and selecting the topmost one; it is easy to see how less computationally intensive schemes could be worked out. Some stylistic rules should probably be allowed to &quot;veto&quot; whole classes of attachment points and others able to declare themselves always the best. Furthermore these rules naturally fall into groups by specialization and features held in common, suggesting that the &quot;sort&quot; operation could be sped up by taking advantage of that structure in the algorithm rather than simply sorting against all of the stylistic rules uniformly. We have worked out on paper how such alternatives would go, and expect to implement them later this year.</Paragraph>
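The preference step described in footnote 2 can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `rank` and `prefer`, the representation of a candidate as a (point, choice) pair of strings, and the particular ordering of the three example rules are all hypothetical and are not taken from the authors' system.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical types: a candidate pairs an attachment point with a choice.
Candidate = Tuple[str, str]
Rule = Callable[[Candidate], bool]


def rank(candidate: Candidate, rules: List[Rule]) -> int:
    """Return the index of the first (most preferred) rule the candidate satisfies."""
    for index, rule in enumerate(rules):
        if rule(candidate):
            return index
    return len(rules)  # satisfies no rule, so it sorts last


def prefer(candidates: List[Candidate], rules: List[Rule]) -> Optional[Candidate]:
    """Sort every candidate against the ordered rules and select the topmost one."""
    if not candidates:
        return None
    return sorted(candidates, key=lambda c: rank(c, rules))[0]


# Assumed rule order, for illustration only.
rules = [
    lambda c: c[1] == "compound-adjective",
    lambda c: c[1] == "reduced-relative",
    lambda c: c[1] == "new-sentence",
]
candidates = [
    ("after-sentence", "new-sentence"),
    ("np-premodifier", "compound-adjective"),
]
best = prefer(candidates, rules)
```

Sorting every candidate against every rule is the uniform scheme the footnote says could be sped up; grouping rules or letting a rule veto a class of attachment points would replace the flat `rank` loop.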
    </Section>
  </Section>
  <Section position="5" start_page="189" end_page="192" type="metho">
    <SectionTitle>
3 In ARLO slots are first-class objects with a prototype hierarchy
</SectionTitle>
    <Paragraph position="0"> of their own just like the one for units (frames). The list of slots is effectively a list of access functions whose domain is units (the tribe being described) and whose range is also units (the slot values).</Paragraph>
    <Paragraph position="1"> When this script is instantiated, the generator will receive a list of 3-tuple records: slot, unit, and value.</Paragraph>
    <Paragraph position="2">  If any of these slots is empty or &quot;not interesting&quot; for the tribe, it is simply left out. The interface between planner and realization can be this simple because the type of text we are generating is fairly programmatic and predictable.</Paragraph>
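A minimal sketch of script instantiation as described above, assuming a frame can be modeled as a plain dictionary from slot names to values. The slot names, the `Record` type, and the `instantiate` function are hypothetical; only empty slots are filtered here, since the text does not say how "not interesting" is decided.

```python
from typing import Dict, List, NamedTuple


class Record(NamedTuple):
    """One 3-tuple handed to the generator: slot, unit, and value."""
    slot: str
    unit: str
    value: str


def instantiate(script: List[str], unit: str, frame: Dict[str, str]) -> List[Record]:
    """Walk the fixed script in order, leaving out slots that are empty for this unit."""
    records = []
    for slot in script:
        value = frame.get(slot)
        if value:  # empty or missing slots are simply left out
            records.append(Record(slot, unit, value))
    return records


# Hypothetical slot names; values borrowed from the Ashanti paragraph in Section 4.2.
script = ["prototype", "location", "language", "religion"]
ashanti = {
    "prototype": "African people",
    "location": "central Ghana",
    "language": "Akan",
    "religion": "",
}
records = instantiate(script, "Ashanti", ashanti)
```

The script fixes the order of the records, which is what a direct-replacement realizer would then reproduce sentence by sentence.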
    <Paragraph position="3"> With a more complicated task comes a more sophisticated planner. The point here, however, is to examine a simple planning domain in order to isolate those decisions that are purely stylistic in nature.</Paragraph>
    <Section position="1" start_page="190" end_page="192" type="sub_section">
      <SectionTitle>
4.2 Attachment
</SectionTitle>
      <Paragraph position="0"> To illustrate what attachment adds, let us first look at what the usual alternative procedure, direct translation, 4 would do with the information plan we use for these paragraphs. It would realize the items in the script one by one, maintaining the given order, and the resulting text would look like this (assuming the system had a reasonable command of pronominalization): The Ashanti are an African people. They live in central Ghana and neighboring regions of Togo and Ivory Coast. This is in West Africa. Their population is more than 900,000.</Paragraph>
      <Paragraph position="1"> They speak the language Akan. They subsist primarily by farming cacao. This is a major cash crop.</Paragraph>
      <Paragraph position="2"> Figure 5: Paragraph II by Direct Replacement. Although true to the information in the script, this method does not reflect the complex stylistic variations and enrichments that make up the original paragraph. There must be something above the level of a single information unit to coordinate the flow of text, while not altering the intentions or goals of the planner. With this in mind, we have built a stylistic controller which has the following properties: o It allows information to be &quot;folded in&quot; to already planned text. Items in the script do not necessarily appear in the same order in the text.</Paragraph>
      <Paragraph position="3"> o The decision about when to fold things in is made on the basis of style; i.e. if the style had been different, the text would have been different as well.</Paragraph>
      <Paragraph position="4"> o The points where new material may be added to planned text are defined on structural grounds.</Paragraph>
      <Paragraph position="5"> For example, notice that in paragraph II from Figure 1 the language-field is realized as a compound adjectival phrase, modifying the prototype; viz. &quot;Akan-speaking.&quot; For the first article, however, the language-field is realized differently. The attachment-point that allows this &quot;fold-in&quot; (i.e. attach-as-adjective) is introduced by the realization class for the prototype field. The decision to select this phrase over the sentential form in Figure 5 is made by a stylistic rule. This rule (cf. Figure 6) states that the adjectival form is preferred if the language name has its own encyclopedia entry. 5 We see that this stylistic rule is not satisfied in Paragraph I, hence another avenue must be taken (namely, clausal). The other attachment points used by the stylistic rules determine whether to use a reduced relative clause, a new sentence, or perhaps an ellipsed phrase. The stylistic rule allowing this structure is given below in Figure 6.</Paragraph>
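The choice between the adjectival fold-in and the clausal form can be illustrated with a small sketch. It assumes the encyclopedia is available as a set of entry names and that realization can be approximated by string templates; the functions `indefinite` and `realize_language` are hypothetical and stand in for the attachment machinery, which in the actual system operates on surface structure rather than strings.

```python
from typing import List, Set


def indefinite(phrase: str) -> str:
    """Prefix a noun phrase with 'a' or 'an' using a simple first-letter test."""
    article = "an" if phrase[0].lower() in "aeiou" else "a"
    return article + " " + phrase


def realize_language(tribe: str, head: str, language: str, entries: Set[str]) -> List[str]:
    """Apply the stylistic rule: prefer the compound adjective when the
    language name has its own encyclopedia entry, otherwise use a clause."""
    if language in entries:
        # Precondition satisfied: fold the language into the prototype NP.
        noun_phrase = indefinite(language + "-speaking " + head)
        return ["The " + tribe + " are " + noun_phrase + "."]
    # Precondition not satisfied: realize the language field as its own sentence.
    return [
        "The " + tribe + " are " + indefinite(head) + ".",
        "They speak the language " + language + ".",
    ]


folded = realize_language("Ashanti", "people", "Akan", {"Akan"})
clausal = realize_language("Ashanti", "people", "Akan", set())
```

Both outputs carry the same two facts; only the packaging differs, which is the sense in which the rule is stylistic rather than a planning decision.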
      <Paragraph position="6">  structure is the prototype field--the essential attribute of the object. This introduces, as mentioned above, an attachment point on the NP node, allowing additional information to be added to the surface structure. The realization class associated with the language field for the Ashanti is ~e-verb, represented in Figure 7 below.</Paragraph>
      <Paragraph position="7">  encyclopedias. The rule, however, is to the point, and appears to be productive; e.g. &quot;wheat farmers&quot;, &quot;town dwellers&quot;, etc.  Because of the stylistic rules, the compound-adjectival form is preferred. The preconditions are satisfied--namely, Akan is itself an entry in the encyclopedia--and the attachment is made. Figure 8 shows the structure at the point of  Two earlier projects are quite close to our own though for complementary reasons. Derr and McKeown \[1984\] produce paragraph length texts by combining individual information units of comparable complexity to our own, into a series of compound sentences interspersed with rhetorical connectives. Their system is an improvement over that of Davey \[1978\] (which it otherwise closely resembles) because of its sensitivity to discourse-level influences such as focus.</Paragraph>
      <Paragraph position="8"> The standard technique for combining a sequence of conceptual units into a text has been &quot;direct replacement&quot; (see discussion in Mann et al. \[1982\]), in which the sequential organization of the text is identical to that of the message because the message is used directly as a template. Our use of attachment dramatically improves on this technique by relieving the message planner of any need to know how to organize a surface structure, letting it rely instead on explicitly stated stylistic criteria operating after the planning is completed.</Paragraph>
      <Paragraph position="9"> Derr and McKeown \[1984\] also improve on direct replacement's one-proposition-for-one-sentence forced style by permitting the combination of individual information units (of comparable complexity to our own) into compound sentences interspersed with rhetorical connectives. They were, however, limited to extending sentences only at their ends, while our attachment process can add units at any grammatically licit position ahead of the point of speech. Furthermore they do not yet express combination criteria as explicit, separable rules.</Paragraph>
      <Paragraph position="10"> Dick Gabriel's program Yh \[1984\] produced polished written texts through the use of critics and repeated editing. It maintained a very similar model to our own of how a text's structure can be elaborated, and produced texts of quite high fluency. We differ from Gabriel in trying to achieve fluency in a single online pass in the manner of a person talking off the top of his head; this requires us to put much more of the responsibility for fluency in the pre-linguistic text planner, which is undoubtedly subject to limitations.</Paragraph>
      <Paragraph position="11"> It is our belief that, for script-like domains, online text generation suffices. This method, in fact, provides us with an interesting diagnostic to test our theory of style: namely, that stylistic rules are meaning-preserving, and do not change the goals or intentions of the speaker. Stylistic rules are to be distinguished from those syntactic rules of grammar which affect the semantic interpretation of a syntactic expression. A non-restrictive relative, for example, is a particular stylistic construction that adds no meaning-delimiting predication to the denotation of the NP. Use of a restrictive relative, on the other hand, is not a matter of style, but of interpretation; &quot;the man who owns a donkey&quot; is not a stylistic variant of the proposition &quot;The man owns a donkey.&quot; In other words, the stylistic component has no reference to intentions, goals, focus, etc. These are the concerns of the planner, and are expressed in its choices of information units and their description (cf. Mann and Moore \[1983\] for a discussion of similar concerns).</Paragraph>
    </Section>
  </Section>
</Paper>