File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/90/w90-0116_abstr.xml

Size: 31,257 bytes

Last Modified: 2025-10-06 13:47:04

<?xml version="1.0" standalone="yes"?>
<Paper uid="W90-0116">
  <Title>The Local Organization of Text</Title>
  <Section position="1" start_page="0" end_page="126" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> In this paper, I present a model of the local organization of extended text. I show that texts with weak rhetorical structure and strong domain structure, such as descriptions of houses, digital circuits, and families, are best analyzed in terms of local domain structure, and argue that global structures that may be inferred from a domain are not always appropriate for constructing descriptions in the domain. I present a system I am implementing that uses short-range strategies to organize text, and show how part of a description is organized by these strategies. I also briefly discuss a model of incremental text generation that dovetails with the model of local organization presented here.</Paragraph>
    <Paragraph position="1"> Motivation for local organization The approach to organizing extended text described here has both psychological and computational motivation. It aims both to model how people use language and to provide a flexible architecture for a system's language use. In this section, I describe the empirical data that form the basis of this research, and characterize the local organization of the collected texts. In the next two sections, I describe a computational architecture to implement local text organization and discuss its advantages of generality and flexibility, and give an example of how this architecture works.</Paragraph>
    <Paragraph position="2"> An extended text has a structure; this structure is a description of how the components relate so that sense can be made of the whole. Two sources of this organization are rhetorical structure, which describes the way elements of the text fit together, and domain structure, which describes relations among domain objects. For this research I chose three domains with strong domain structure, and a task--description--with weak rhetorical structure. I have tape-recorded 29 people giving descriptions of house layouts, electronic circuit layouts, and family relationships. Description fragments of a house and of a family, and the questions asked to obtain the descriptions, are given in figure 1. (Because of space considerations, the fragments are somewhat abbreviated.) Many approaches to text organization 1 are based on analyses of text in terms of rhetorical structure.</Paragraph>
    <Paragraph position="3"> However, there are few segments of text with interesting rhetorical structure in my corpus. For example, an analysis of the texts using Mann and Thompson's (1987) Rhetorical Structure Theory (RST) would result primarily in the relations sequence and joint and would contain few of the relations like evidence or justify that give RST its descriptive power.</Paragraph>
    <Paragraph position="4"> Similarly, it is unclear what work a system like that of Grosz and Sidner (1986) would do in analyzing a description. Since the structure of descriptions cannot be analyzed adequately with rhetorical relations, perhaps it can be explained in terms of the domain. Houses, chips, and families are strongly structured. A family's relationships can be captured in a family tree; one might suppose that a description of the family would also be organized in this way. A house can be encoded in a number of ways; for instance, it has a component hierarchy, being composed of rooms composed of furnishings. Linde (1974) has proposed another comprehensive structure for houses: a phrase structure grammar that determines how the rooms may be visited in a traversal of a house layout.</Paragraph>
    <Paragraph position="5"> Surprisingly, these global, hierarchical domain structures are not exploited in the organization of descriptions in my corpus. While family trees and composition hierarchies can be inferred from descriptions of families and houses, this does not mean that these structures guide the process of organizing them. For instance, my family informants did not simply construct their descriptions by starting at the root of the appropriate tree and doing a depth-first or breadth-first traversal of it. Instead, to select a next family member to talk about, they would apply one of several criteria. Generally, a sibling, spouse, parent, or child would be the next choice, and this might incidentally constitute part of a tree walk. But that this choice is local is evidenced by the next choice, which may not be construable, in any [Footnote 1: What I call text organization is usually referred to as text planning.]</Paragraph>
    <Paragraph position="6">  on our righthand side would be a door, which leads to Penni's room and you walk in there...</Paragraph>
    <Paragraph position="7"> and there are two windows, in the...opposite corner from the one in which you enter one's on the lefthand wall, and one's on the wall that you would be facing then, on the righthand side...of her room, is the closet Fragment of a house description.</Paragraph>
    <Paragraph position="8"> In response to the question: &amp;quot;Could you please describe for me the layout of your house.&amp;quot; there's my mother Katharine, my father John, my sister Penni, and me it's my mother's relatives that we go to see Margaret and Bill, who are Mommy's...</Paragraph>
    <Paragraph position="9"> um...Margaret's my great-aunt so it must be Mommy's aunt Fragment of a family description.</Paragraph>
    <Paragraph position="10"> In response to the question: &amp;quot;Can you tell me how everyone who comes to Thanksgiving is related to each other?&amp;quot;  principled way, as part of the overall structure of the description that one might have postulated at the previous step. Where to begin the family description also appears to be a locally conditioned choice. Informants begin at various points, such as themselves or long-dead progenitrixes, but the majority start their description of the family by mentioning the hostess and host of the Thanksgiving dinner they are attending; we may suppose that at a different time of year the descriptions are likely to start off differently.</Paragraph>
    <Paragraph position="11"> Further evidence that people do not structure their descriptions using obvious global domain structures may be adduced from examples in which speakers explicitly deny knowledge of such structures, as in the following fragment.</Paragraph>
    <Paragraph position="12"> and also...um...</Paragraph>
    <Paragraph position="13"> Eleanor and Elizabeth come who are...cousins of...all of us...um...</Paragraph>
    <Paragraph position="14"> I don't know what generation cousins they are Here, the speaker shows by her description of two family members that she does not know her relationship to them, even though it would be clear if a family tree were being used to organize the description (the women in question are in fact first cousins twice removed of the speaker).</Paragraph>
    <Paragraph position="15"> Genealogical trees, phrase structure grammars, and component hierarchies are useful for succinctly representing information about houses, chips, and families. But there is no a priori reason to suppose that descriptions of such things are the products of such easily articulable schemas or grammars. When we examine texts of the sort that we wish to generate, we must distinguish the mechanisms that direct the process of choosing what to say next from a retrospective description of its result.</Paragraph>
    <Paragraph position="16"> The texts I have collected can be best analyzed as locally organized by a process of deciding what to say next. This decision is based principally on what has already been said and what is currently available to say. For example, if one has just mentioned a large window in the kitchen, one can mention whatever is to the left of the window, whatever is to the right of it, which way it is facing, what it looks like, or how it is similar to a window in another room of the house. If one has mentioned an aunt, one can give her name, say whether she is married, mention her sister, enumerate her children, or talk about how much money she earns.</Paragraph>
    <Paragraph position="17"> The strong domain structure of subjects like houses, chips, and families ensures that a description can be continued from any point: once a description has been started, there is always something, often many things, that can be said next. In structured domains, there is always a default choice for the next thing to say. In spatial domains like houses, spatial proximity provides this default. Everything in a house, be it a room, a wall, or a kitchen appliance, is next to something else. Spatial proximity does not constrain house descriptions,  but it ensures that a description does not come to a premature dead end.</Paragraph>
    <Paragraph position="18"> Descriptions are finite. Though there may always be more to say, there are points at which a description may stop, when the task may be considered accomplished. Linde (1974) proposed a completeness criterion for house descriptions, which is reflected in my data as well as hers. It states that a description may stop any time after all the rooms have been mentioned, but it is not complete if it stops before. A similar criterion holds in the family descriptions collected: they were given in answer to the question, &amp;quot;Can you tell me how everyone who comes to Thanksgiving is related to each other?&amp;quot; In this case, then, the criterion is mentioning everyone who attends.</Paragraph>
    <Paragraph position="19"> Knowing how to continue and knowing when to stop together ensure that a description can be generated depending solely on local organization. The strong domain structure of houses and families makes the working of these mechanisms for continuation and termination particularly clear, and thus these domains are a good site for studying this approach to organizing text. However, the local organization of text is also evident in many other uses of language. People's conversation is often locally organized (Levinson, 1983); some interactive systems are currently being designed with this approach (Frohlich &amp; Luff, 1989).</Paragraph>
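    <Paragraph> The interplay of continuation and termination described above can be illustrated with a minimal sketch. It is not the system described in this paper: the toy layout NEIGHBOURS, the functions next_item and describe, and the rule of preferring items next to the most recently mentioned one are all assumptions made for illustration. The loop keeps choosing an unmentioned spatially proximate item and stops as soon as every room has been mentioned.

```python
# Hypothetical toy layout: each room maps to its spatially proximate rooms.
NEIGHBOURS = {
    "kitchen": ["hallway"],
    "hallway": ["kitchen", "bedroom", "bathroom"],
    "bedroom": ["hallway"],
    "bathroom": ["hallway"],
}
ROOMS = set(NEIGHBOURS)


def next_item(mentioned):
    """Default continuation: something new next to a recently mentioned item."""
    for item in reversed(mentioned):
        for neighbour in NEIGHBOURS[item]:
            if neighbour not in mentioned:
                return neighbour
    return None


def describe(start):
    """Build a description from local choices only; no global plan is consulted."""
    mentioned = [start]
    # Completeness criterion: stop once every room has been mentioned.
    while set(mentioned) != ROOMS:
        candidate = next_item(mentioned)
        if candidate is None:
            break
        mentioned.append(candidate)
    return mentioned


print(describe("kitchen"))
```

 Starting from a different room gives a different order over the same rooms, which is the kind of variation a purely local process would be expected to show.</Paragraph>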
    <Paragraph position="20"> Because I am interested not only in how a program may organize text but also in how people do so, I study people speaking rather than people writing. A written text may be edited and reorganized, and this process often involves explicitly thinking about rhetorical structure. A spoken description is likely to require that the speaker organize her text locally--she cannot plan it out ahead of time. Studying spoken text reveals more of the underlying mechanisms of language, because time constraints and the inability to edit what has already been said make post-processing impossible.</Paragraph>
    <Paragraph position="21"> Computational architecture I am implementing a system that employs local organization of text as described in this paper. The implementation comprises: a semantic net knowledge base; an organizer composed of strategies and metastrategies; and a generator. Local organization is achieved using short-range strategies, each of which is responsible for organizing only a short segment of text, between a word and a clause in length.</Paragraph>
    <Paragraph position="22"> Until recently, I have employed Mumble-86 (Meteer et al., 1987) as the generator for this system. Construction is underway, however, on a simpler generator that more accurately implements the principles of generation implied by the structure of the organizer. The system is currently implemented only for descriptions of houses; the examples in this section will thus be drawn from the house domain.</Paragraph>
    <Paragraph position="23"> The knowledge base is a semantic net that encodes the objects, properties, and relations of a domain. The organizer keeps a pointer to the current node in the description. A strategy describes the current node and others related to it via local connections in the network. Strategies are selected sequentially, based on local conditions; if there is no strategy available or if more than one strategy is appropriate, metastrategies (Davis, 1980) resolve the conflict, using a technique similar to universal subgoaling in Soar (Laird, Newell, &amp; Rosenbloom, 1987). Metastrategies are responsible for control: they sequence and combine strategies. Like strategies, they can cause the production of text.</Paragraph>
    <Paragraph position="24"> This architecture has several advantages. First, it is flexible: because there is no fixed priority scheme and because strategies are small, the strategies can be combined in a variety of ways to produce texts that are constrained only by the appropriateness of each strategy as determined by the strategies' interaction with the knowledge base. Second, the architecture is extensible: new strategies can easily be added to extend the organizer to different types of text. Finally, the organizer is mainly domain-independent: while some strategies may be particular to houses, most strategies are not.</Paragraph>
    <Paragraph position="25"> The strategies are applied to the knowledge base and select the items that make up the description. The strategies find the appropriate lexical item(s) for each knowledge base item that is expressed; these lexical items and the knowledge base items themselves are determined by the domain. While the strategies are simple, complex behavior emerges from their interaction with the knowledge base; this locally organizes the extended text.</Paragraph>
    <Paragraph position="26"> Each strategy falls into one of four classes, with varying degrees of domain independence: discourse cue strategies; linguistic strategies; parameterizable domain-independent strategies; and semi-domain-independent strategies. Figure 2 gives examples of each.</Paragraph>
    <Paragraph position="27"> Of the domain-independent strategies, discourse cue strategies focus attention, in a way similar to the clue words described by Reichman (1985), and linguistic strategies mention objects and associated properties.</Paragraph>
    <Paragraph position="28"> mention-salient-object, used to say &amp;quot;there is a window,&amp;quot; may as easily express &amp;quot;there is a penguin&amp;quot; or &amp;quot;there is a policy.&amp;quot; describe-object is similarly all-purpose, and can produce &amp;quot;the window with two flanking windows&amp;quot; or &amp;quot;the man with one black shoe.&amp;quot; The parameterizable domain-independent strategies have slightly different textual realizations in different domains, but these differences can be captured by parameters. The example given in figure 2 is typical: the strategy is realized as a prepositional phrase in each domain, and only the preposition changes.</Paragraph>
    <Paragraph position="29"> The semi-domain-independent strategies accomplish tasks such as a sweep that seem particular to the domain, but are similar to tasks in other domains. A sweep begins at an object and names another bearing some spatial relationship to it, and then another object [Figure 2 content: mention-salient-object: &amp;quot;there is z&amp;quot; &amp;quot;we have z&amp;quot;; describe-object: &amp;quot;the z is y&amp;quot; &amp;quot;the y z&amp;quot; &amp;quot;the z (which) has z&amp;quot; &amp;quot;the z with z&amp;quot;; situate: &amp;quot;in the kitchen&amp;quot; &amp;quot;during the morning&amp;quot; &amp;quot;about the election&amp;quot;]</Paragraph>
    <Section position="1" start_page="122" end_page="125" type="sub_section">
      <SectionTitle>
Semi-domain-independent Strategies
</SectionTitle>
      <Paragraph position="0"> sweep Enumerate objects connected each to the next by the same relation.</Paragraph>
      <Paragraph position="1"> follow a path Traverse a natural connection between parts of the knowledge to be described.  that bears the same relationship to the just-mentioned one, until there are none left; for example, &amp;quot;to the left of the window is a stove and then a refrigerator&amp;quot; is a sweep-left. Similar constructions may be found in other domains. A description of one's day may start at some event and mention the next event and the event after that. In this case, the relationship is temporal, rather than physical.</Paragraph>
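      <Paragraph> A sweep of the kind just described can be sketched as repeatedly following a single relation from a starting object until nothing is left. The relation tables HOUSE_LINKS and DAY_LINKS, the relation names, and the function sweep are hypothetical names chosen for this illustration; the paper does not specify how the knowledge base stores such links.

```python
# Hypothetical link tables: item -> {relation: the next item along that relation}.
HOUSE_LINKS = {
    "window": {"to-the-left": "stove"},
    "stove": {"to-the-left": "refrigerator"},
    "refrigerator": {},
}
DAY_LINKS = {
    "breakfast": {"next-event": "commute"},
    "commute": {"next-event": "meeting"},
}


def sweep(start, relation, links):
    """Enumerate objects connected each to the next by the same relation."""
    swept = []
    seen = {start}
    current = links.get(start, {}).get(relation)
    # Stop when the chain runs out; the seen set guards against cycles.
    while current is not None and current not in seen:
        swept.append(current)
        seen.add(current)
        current = links.get(current, {}).get(relation)
    return swept


print(sweep("window", "to-the-left", HOUSE_LINKS))
print(sweep("breakfast", "next-event", DAY_LINKS))
```

 The same function serves the spatial and the temporal case; only the relation name and the link table change, which is one way to read the claim that such strategies are semi-domain-independent.</Paragraph>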
      <Paragraph position="2">  speaker's propensity to mention objects to the right before objects to the left.) It is here that anything considered a global goal would be encoded.</Paragraph>
      <Paragraph position="3"> * The completeness criterion.</Paragraph>
      <Paragraph position="4"> In future implementations, the context may also involve some model of the hearer. This would be part of the local context: what one knows about one's hearer and about what one's hearer knows changes, particularly under the assumption that hearer and speaker are engaged in a two-way interaction.</Paragraph>
      <Paragraph position="5"> A strategy conflict set contains whatever strategies are currently applicable. If it contains a single strategy, that one is selected. If more than one is in the set, metastrategies may resolve the conflict by selecting one or by combining some or all of them. Finally, if no strategy presents itself, the metastrategies apply a default strategy.</Paragraph>
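      <Paragraph> The three cases of conflict-set handling can be sketched as follows. This is an illustrative reading, not the implemented organizer: the function names resolve and prefer_particular, the placeholder name "default-strategy", and the numeric ranking of strategies are assumptions. The strategy names mention-object and mention-room are taken from the example later in the paper.

```python
DEFAULT_STRATEGY = "default-strategy"  # placeholder name, an assumption


def resolve(conflict_set, choose, combine=None):
    """Return the list of strategies to run next, given the applicable ones."""
    if not conflict_set:
        # No strategy presents itself: fall back to the default.
        return [DEFAULT_STRATEGY]
    if len(conflict_set) == 1:
        # A single applicable strategy is simply selected.
        return list(conflict_set)
    if combine is not None:
        # A metastrategy may combine some or all of the candidates.
        merged = combine(conflict_set)
        if merged is not None:
            return merged
    # Otherwise a metastrategy selects one candidate.
    return [choose(conflict_set)]


def prefer_particular(conflict_set):
    """Hypothetical selection rule: the more particular strategy wins."""
    rank = {"mention-object": 0, "mention-room": 1}
    return max(conflict_set, key=lambda name: rank.get(name, 0))


print(resolve(["mention-object", "mention-room"], prefer_particular))
```

 Passing the selection and combination rules in as functions keeps the control logic separate from any fixed priority scheme, in the spirit of the flexibility the architecture is meant to have.</Paragraph>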
      <Paragraph position="6"> The metastrategy find-&amp;quot;interesting&amp;quot;-link is triggered after the introduction of a new topic, which has links to many other items in the knowledge base. What is &amp;quot;interesting&amp;quot; or salient depends on: * The domain: In spatial description, objects that are large or have many features are interesting.</Paragraph>
      <Paragraph position="7"> * The structure of the domain: Objects that are more connected to other objects are more interesting. * The local context: If a window has just been mentioned, there is reason to mention other windows. * The global context: There may be an inclination to mention furnishings but not structural features of the house, or vice versa; there may be differing levels of detail required.</Paragraph>
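      <Paragraph> One way to picture how the four factors listed above might interact is an additive score. The paper does not say how the factors are weighed against one another, so the weights, the record fields (size, links, kind), and the function names salience and find_interesting_link below are all assumptions made for this sketch.

```python
def salience(candidate, last_kind, preferred_kinds):
    """Score a candidate item; higher means more interesting to mention next."""
    score = candidate["size"]                 # domain: larger objects stand out
    score += len(candidate["links"])          # structure: well-connected objects
    if candidate["kind"] == last_kind:        # local context: same sort as the last item
        score += 1
    if candidate["kind"] in preferred_kinds:  # global context: favoured kinds of item
        score += 2
    return score


def find_interesting_link(candidates, last_kind, preferred_kinds):
    """Pick the candidate with the highest score."""
    return max(candidates, key=lambda c: salience(c, last_kind, preferred_kinds))


# Hypothetical candidates linked to a newly introduced bedroom.
bed = {"name": "bed", "kind": "furniture", "size": 5, "links": ["endtable", "wall"]}
window = {"name": "window", "kind": "structure", "size": 2, "links": ["wall"]}

print(find_interesting_link([bed, window], "room", {"furniture"})["name"])
```

 With a global context that favours furniture, the large bed outscores the window; changing the preferred kinds or the last-mentioned kind shifts the choice without touching the candidates themselves.</Paragraph>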
      <Paragraph position="8">  Some metastrategies combine strategies. Such metastrategies apply when several strategies are appropriate at some point in the description and there is a felicitous way to combine them. circular-sweep is an example: it combines a number of sweep strategies (sweep-left, sweep-right, and sweep-under), and includes additional orientation strategies to orient hearers between sweeps. kitty-corner is a metastrategy that is used to describe a room in which the most salient feature is diagonally opposite the current location. The object that is &amp;quot;kitty corner&amp;quot; from it is mentioned first, and the rest of the room is described in relation to it. 2 The first fragment in figure 1 exemplifies the kitty-corner metastrategy.</Paragraph>
      <Paragraph position="9"> find-new-topic is the default metastrategy just in case there is nothing &amp;quot;interesting&amp;quot; to say. In a spatial domain, the default is to select an object to describe using spatial proximity.</Paragraph>
      <Paragraph position="10"> Example of description organization In this section, I describe the organization of a fragment of a description in my corpus. While the system is not yet fully implemented to handle all the details, this example is sufficiently complex to show the operation of the architecture in selecting appropriate strategies and metastrategies. Though the strategies used are simple, complex choices, varying with context, are made. On the following page are the text and a sketch of the area being described.</Paragraph>
      <Paragraph position="11"> In the fragment in figure 3, the speaker is describing the bedroom he shares with his wife Carol. Each line of the fragment, in most cases, is the result of a single strategy.</Paragraph>
      <Paragraph position="12"> The sketch in figure 4 of the bedroom is provided as an aid to the reader in understanding John's description. There is no corresponding representation in the system.</Paragraph>
      <Paragraph position="13"> The global context used by this speaker includes parameters that predispose him to mention rather than ignore room furnishings, as well as &amp;quot;stuff&amp;quot;--small articles that can be found in, on, or near pieces of furniture. The strategy mention-stuff is a particular form of describe-object, and as such is concerned with mentioning associated properties of the object rather than physical proximity; this is suggested by the speaker's typically using the preposition &amp;quot;with,&amp;quot; rather than an obviously spatial one.</Paragraph>
      <Paragraph position="14"> The global context also of course includes the completeness criterion which is unsatisfied throughout this stretch of text. The fragment starts when the speaker has just finished describing the kitchen and the next spatially proximate thing is the door to the bedroom.</Paragraph>
      <Paragraph position="15"> When the node to be described is a physical object, [Footnote 2: For a fuller treatment of how the system computes deictic and other spatial terms like &amp;quot;kitty corner,&amp;quot; &amp;quot;left,&amp;quot; and &amp;quot;right,&amp;quot; see (Sibun &amp; Huettner, 1989).]</Paragraph>
      <Paragraph position="16"> the mention-object strategy is always available; because this object is a door, mention-room is available to talk about the room that the door leads into. The metastrategies resolve this conflict in favor of the more particular mention-room {1}.</Paragraph>
      <Paragraph position="17"> Because the last strategy used was mention-room, mention-salient-object becomes available; if there is a particularly &amp;quot;salient&amp;quot; object in a room, descriptions can then be organized around it. As it happens, salience in this domain tends to depend primarily on size; the kingsize bed fits the bill {2}. There are two objects spatially proximate to the bed; one is selected and the strategy mention-object is used to mention the endtable {3}. The endtable is connected to several other items in the knowledge base, but because there is a context parameter to mention &amp;quot;stuff,&amp;quot; this is what happens next {4}.</Paragraph>
      <Paragraph position="18"> Now, there are two unmentioned objects spatially proximate to the endtable--the window and the &amp;quot;wall&amp;quot; (which is actually a covered-up chimney). The &amp;quot;wall&amp;quot; is mentioned because, like furnishings, it has extent in the room, and the context disposes the process toward mentioning such objects {5}. The local context keeps track of what has just been mentioned; another feature it records is the direction in which the spatial proximity links have been followed. The &amp;quot;wall&amp;quot; is next along the trajectory from the bed through the endtable.</Paragraph>
      <Paragraph position="19"> The &amp;quot;wall&amp;quot; is spatially linked to three things: the window on the endtable side of it, the window that is along the same trajectory that has been followed, and the chests, which are also along that trajectory. The window along the trajectory is the choice selected from these three for two reasons: there is a tendency to maintain trajectories; 3 and the last thing mentioned is part of the structure of the room, as is the window, but not the chests. But this selection of the window presents a problem (at least, we can infer that it presents a problem to the speaker), because this window is connected via a similarity link to the previous window, which has not been mentioned. So the speaker performs a repair consisting of backing up, mentioning the overlooked window, reiterating mention of the wall, and, finally, mentioning the selected window {6}.</Paragraph>
      <Paragraph position="20"> While the next window is a promising candidate because it is the same sort of thing as that just mentioned, the parameter for mentioning furniture overrides this, and the cedar chests come next in the description {7}, followed by their associated &amp;quot;stuff&amp;quot; {8}. The window is again available but so is the bureau, which is the preferred choice because it is furniture {9}. The bureau has spatial proximity links to two windows and the &amp;quot;small thing,&amp;quot; as well as links to its &amp;quot;stuff,&amp;quot; so the set of available strategies comprises ones that mention each of these things. A metastrategy can resolve this [Footnote 3: Ullmer-Ehrich (1982) notes a similar tendency in the dorm room descriptions that she collected.]</Paragraph>
      <Paragraph position="21">  and then there's the bedroom {1} and there's a huge kingsize bed {2} and there's an endtable next to it {3} with a lamp and a clock radio {4} and some stuff of mine like the Boston University cup and...then there's a wall-- {5} and then there's a window behind that {6} and there's a wall, and there's another window and there's some...cedar chests of Carol's {7} that have blankets and sheets in them {8} and...there's her bureau {9} in the middle of two windows on either side {10} with all of her makeup on top of it and clothes and there's...a small thing with all her clothes and there's another great big bookshelf and all her spare books and there's a small end table over on her side um...a small digital clock and more kleenex  conflict by realizing that strategies for saying that there is an object of the same sort on two different sides of the current object can be combined by saying that the bureau is between the two windows {10}. 4 The description for the rest of the room continues in a manner similar to that already discussed.</Paragraph>
    </Section>
    <Section position="2" start_page="125" end_page="126" type="sub_section">
      <SectionTitle>
Incremental generation
</SectionTitle>
      <Paragraph position="0"> The organizer described here is composed of strategies that are often responsible for sub-clausal units of text; furthermore, the strategies have already imposed an organization on the text, obviating the need for the generator to do more than enforce grammaticality. The model of local text organization I am developing is coupled with a model of incremental generation, in which the increments are often smaller than a sentence. (Garrett's investigation of speech errors (1975) constitutes early work in this area; De Smedt &amp; Kempen (1987), and Kempen &amp; Hoenkamp (1987) discuss a similar, more fully-developed incremental generation project.) A typical generator produces a sentence at a time. However, spoken text is replete with restarts, fragments, and ungrammaticalities. This suggests that not only do people organize their text incrementally, but generate it in increments as well.</Paragraph>
      <Paragraph position="1"> Usually, an incremental generator is successful in generating grammatical text. The text then / in the kitchen / there is a large window is the result of three sequentially operating strategies: introduce/shift-attention, situate, and mention-salient-object. An incremental generator will be able to produce this text in the increments specified by the strategies.</Paragraph>
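      <Paragraph> The three-increment example above can be sketched with a generator that emits one piece of text per strategy, in the order the strategies apply. This is a toy illustration rather than the generator under construction: the Python function names, the state dictionary, and the fixed wording of each increment are assumptions.

```python
def introduce_shift_attention(state):
    """Discourse cue: signal a move to a new topic."""
    return "then"


def situate(state):
    """Place what follows in a location (a prepositional phrase)."""
    return "in the " + state["location"]


def mention_salient_object(state):
    """Introduce the salient object at the current location."""
    return "there is a " + state["object"]


def generate(strategies, state):
    """Yield one text increment per strategy, as soon as that strategy has run."""
    for strategy in strategies:
        yield strategy(state)


state = {"location": "kitchen", "object": "large window"}
steps = [introduce_shift_attention, situate, mention_salient_object]
print(" / ".join(generate(steps, state)))
```

 Because each increment is produced without waiting for the rest of the sentence, nothing in this loop guarantees that the increments will fit together grammatically; that property has to come from the strategies themselves.</Paragraph>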
      <Paragraph position="2"> A system that generates in strategy-sized increments can result, in principled ways, in ungrammatical text. A common error, exemplified by and a lot of other people / who / I wasn't quite sure what they did can be explained by the operation of the strategies mention-object, add-clausal-modifier, and a set of strategies to express some additional information, which, because a dependent clause has already been started, happens to result in the ungrammatical resumptive pronoun &amp;quot;they.&amp;quot; Locally-organized, occasionally ungrammatical text may be prototypical, but it is certainly not the only sort [Footnote 4: Note that the previous window conflict was not resolved by saying that the &amp;quot;wall&amp;quot; was between the two windows. The difference can be explained by the observation that the &amp;quot;wall,&amp;quot; despite its extent into the room, is a structural object, while the bureau is furniture.]</Paragraph>
      <Paragraph position="3"> of text we wish to generate. To be comprehensive, my system will require some capability to post-process text after it is organized and before it is generated (Hovy, 1989, suggests a post-processing model). Output of my system might also be appropriate input for a text revision system (e.g., Meteer &amp; McDonald, 1986).</Paragraph>
      <Paragraph position="4"> Related research There are many other projects whose goal is to organize, or plan, extended text. The main difference between these and mine is flexibility and level of organization: most text planning systems rely on global structures to organize paragraph-sized text. These structures, which are usually schemas or plans, constrain the text to reflect particular rhetorical and domain relations, but at the expense of flexibility. My system builds structure locally with no recourse or reference to an overall structure. Through analysis of written texts, McKeown (1985) has developed a small set of schemas, built from rhetorical predicates, that would provide types of descriptions (for example, constituency) for objects in a knowledge base. Paris has extended this work to use a process trace which derives its structure in part from the domain (Paris &amp; McKeown, 1987; Paris, 1988). This alternative strategy builds and traverses a path through the knowledge base when it is determined, on the basis of a sparse model of user expertise, that a process description of an object is more appropriate than a declarative one; the two strategies may be interleaved. Dale (1989) similarly organizes text by means of a domain structure, in this case, a recipe plan.</Paragraph>
      <Paragraph position="5"> Rhetorical Structure Theory (Mann &amp; Thompson, 1987) is also drawn from an analysis of written texts; it differs from McKeown's work in that it is composed of a large number of rhetorical relations, rather than a small number of schemas. RST may thus be more flexible, but it is still assumed that the relations will be combined into a single global tree covering an extended text.</Paragraph>
      <Paragraph position="6"> While most text planning systems have worked on producing single texts in response to queries, some research has been more particularly concerned with interactive text. Much recent work in this area has been for explanation systems (e.g., Maybury, 1989), and some of this work explicitly addresses allowing a human user to ask follow-up questions (Moore &amp; Swartout, 1989).</Paragraph>
      <Paragraph position="7"> However, such systems still build and use global structures for extended texts.</Paragraph>
      <Paragraph position="8"> An argument is sometimes made that global structure is needed to capture high-level rhetorical goals in the output text (see Appelt, 1985); Gricean Maxims (Grice, 1975) are often invoked. But, as Hovy (1988) points out, &amp;quot;be polite&amp;quot; is not a decomposable goal; the objective of being polite is achieved through local decisions. Such local decisions can comfortably be integrated with a model of local organization of text.</Paragraph>
    </Section>
  </Section>
</Paper>