<?xml version="1.0" standalone="yes"?> <Paper uid="J96-3006"> <Title>Toward a Synthesis of Two Accounts of Discourse Structure</Title> <Section position="3" start_page="0" end_page="414" type="metho"> <SectionTitle> 2. Intentional Linguistic Structure in G&S </SectionTitle> <Paragraph position="0"> G&S is formulated in terms of the interdependence of three distinct structures. Of the three structures, it is the effect of intentional structure on linguistic structure that concerns us in this paper. This effect is an explicit claim about ILS.</Paragraph> <Paragraph position="1"> In G&S, the intentional structure consists of the set of the speaker's communicative intentions throughout the discourse, and the relations of dominance and satisfaction-precedence among these intentions. The speaker tries to realize each intention by saying something; i.e., each intention is the purpose behind one or more of the speaker's utterances. These intentions are thus an extension of the intentions in Grice's (1957) theory of utterance meaning. Speakers intend for the intentions behind their utterances to be recognized and for that recognition to be part of what makes their utterances effective.</Paragraph> <Paragraph position="2"> A purpose Im dominates another purpose In when satisfying In is part of satisfying Im. A purpose In satisfaction-precedes another purpose Im when In must be satisfied first. The dominance and satisfaction-precedence relations impose a structure on the set of the speaker's intentions, the intentional structure of the discourse, and this in turn determines the linguistic structure.</Paragraph> <Paragraph position="3"> The linguistic structure of a particular discourse is made up of segments, which are sets of utterances, related by embeddedness and sequential order. 
A segment DSn originates with the speaker's intention: it is exactly those utterances that the speaker produces in order to satisfy a communicative intention In in the intentional structure.</Paragraph> <Paragraph position="4"> In other words, In is the discourse segment purpose (DSP) of DSn. DSn is embedded in another segment DSm just when the purposes of the two segments are in the dominance relation, i.e., Im dominates In. The dominance relation among intentions fully determines the embeddedness relations of the discourse segments that realize them. For example, consider the discourse shown in Figure 1, adapted from Mann and Thompson (1988). The whole discourse is a segment, DS0, that attempts to realize I0, the speaker's intention for the hearer to adopt the intention of attending the ballet. 1 As part of her plan to achieve I0, the speaker generates I1, the intention for the hearer to adopt the belief that the ballet will be very entertaining. Then, as part of her plan to achieve I1, the speaker generates I2, the intention that the hearer believe that the show is made up of all new choreography. As shown on the left in Figure 1, I0 dominates I1, which in turn dominates I2. Due to these dominance relations, the discourse segment that realizes I2 is embedded in the discourse segment for I1, which is in turn embedded within the discourse segment for I0, as shown on the right in the figure. [Figure 1: For G&S, dominance in intentional structure determines embedding in linguistic structure.]</Paragraph> <Paragraph position="5"> The dominance of intentions directly determines embedding of segments.</Paragraph> <Paragraph position="6"> When one DSP In satisfaction-precedes another Im, then DSn precedes DSm in the discourse. The satisfaction-precedes relation among intentions constrains the order of segments in the discourse, but it does not fully determine it. 
In the example in Figure 1, none of the intentions satisfaction-precedes the others.</Paragraph> <Paragraph position="7"> Here we introduce a concept which is not part of the G&S theory, but which will be important to our discussion below. We coin the term core to refer to that part of the segment that expresses the segment purpose. A segment may contain individual utterances as well as embedded segments. Most likely, the core of the segment is found in these unembedded utterances. In the example, (a) is the core of DS0, (b) the core of DS1 and (c) the core of DS2. As will be discussed, a core functions to manifest the purpose of the segment, while the embedded segments serve to help achieve that purpose. The defining feature of the core is its function of expressing the purpose of the segment.</Paragraph> <Paragraph position="8"> While the core's position in the G&S linguistic structure is most likely an unembedded utterance, it is also possible that the core could be an embedded segment. This could occur when the expression of the segment purpose is more elaborate than simply stating what the hearer should do or believe. To simplify our discussion, however, we assume the core of a segment is an utterance not embedded in any subsegment.</Paragraph> <Paragraph position="9"> It should be clear that the theory-independent notion of ILS as it was characterized above is exactly the linguistic structure in G&S. ILS is something G&S makes explicit claims about. By choosing to modify the terminology from simply &quot;linguistic Computational Linguistics Volume 22, Number 3 structure&quot; to &quot;intentional linguistic structure,&quot; we mean to suggest that consideration of something other than speaker intentions--for example, semantic relations--could determine another kind of structure to the discourse. Clearly, the semantic (or informational) relations among discourse entities can in principle be the determinant of a separate linguistic structure. 
Whether or not such an informational structure is useful or is related in an interesting way to ILS is a question requiring further research. We discuss the relationship between ILS and possible approaches to informational structure briefly in Section 5.</Paragraph> <Paragraph position="10"> 3. Intentional Linguistic Structure in RST In contrast to its explicitness in G&S, ILS is only implicit in RST. To identify the implicit claims about ILS, we must first identify the components of an RST analysis that involve a judgement about the relation between intentions underlying text spans.</Paragraph> <Paragraph position="11"> The range of possible RST text structures is defined by a set of schemas, which describe the structural arrangement of spans, or text constituents. Schemas are basic structural units or patterns in the application of RST relations. There are five schema patterns, each consisting of two or more spans, a specification of each span as either nucleus or satellite, and a specification of the RST relation(s) that exist between these spans. In this paper, we focus on the most commonly occurring RST schema, which consists of two text spans (a nucleus and a satellite) and a single RST relation that holds between them. The nucleus is defined as the element that is &quot;more essential to the speaker's purpose,&quot; while the satellite is functionally dependent on the nucleus and could be replaced with a different satellite without changing the function of the schema. As we argue below, this functional distinction between nucleus and satellite is an implicit claim about ILS, and is a crucial notion in understanding the correspondence between RST and G&S.</Paragraph> <Paragraph position="12"> A schema application describes the structure of a larger span of text in terms of multiple constituent spans. Each of the constituent spans may in turn have a structure of subconstituent spans. 
Thus, the application of RST schemas in the analysis of a text is recursive, i.e., one schema application may be embedded in another. For an RST analysis to be acceptable, there must be one schema application under which the entire text is subsumed and which accounts for all minimal units, usually clauses, of the text. In addition, each minimal unit can appear in exactly one schema application, and the spans constituting each schema application must be adjacent in the text. These constraints guarantee that a correct RST analysis will form a tree structure.</Paragraph> <Paragraph position="13"> An instantiated schema specifies the RST relation(s) between its constituent spans.</Paragraph> <Paragraph position="14"> Each relation is defined in terms of a set of constraints on the nucleus, the satellite, and the nucleus-satellite combination, as well as a specification of the effect that the speaker is attempting to achieve on the hearer's beliefs or inclinations. An RST analyst must judge which schema consists of RST relation definitions whose constraints and effects best describe the nucleus and satellite spans in the schema application. Mann and Thompson claimed that, for each two consecutive spans in a coherent discourse, a single RST relation will be primary. For reasons discussed in Section 5.1, we consider only the RST presentational relations, or what Moore and Pollack (1992) call intentional relations, in identifying the ILS claims of RST.</Paragraph> <Paragraph position="15"> To illustrate how a speaker's intentions determine discourse structure in this theory, consider the RST analysis of the example discourse from Figure 1. As shown in Figure 2, at the top level, the text is broken down into two spans: (a) and (b-c). The span (b-c) forms a satellite that stands in a motivation relation to (a). This span can be further broken down into the two minimal units (b) and (c), where (c) is a satellite that stands in an evidence relation to (b). 2 [Figure 2: The RST structure assigned to the example discourse in Figure 1.]</Paragraph>
<Paragraph position="16"> While there is no direct representation of intentions in RST, the asymmetry between a nucleus and its satellite originates with the speaker's intentions. The nucleus expresses a belief or action that the hearer is intended to adopt. The satellite provides information that is intended to increase the hearer's belief in or desire to adopt the nucleus. Implicitly, this is a claim that the text is structured by the speaker's intentions and, more specifically, by the difference between the intention that the hearer adopt a belief or desire expressed in a text span and the intention that a span contribute to this adoption. In the example, the nucleus (a) expresses an action that the speaker intends the hearer to adopt. The satellite (b-c) is intended to facilitate this adoption by providing the hearer with a motivation for doing the suggested action. In the embedded span, the nucleus (b) expresses a belief that the speaker intends the hearer to adopt and the satellite (c) is intended to facilitate this adoption by providing evidence for the belief.</Paragraph> <Paragraph position="17"> The second implicit RST claim about ILS is a refinement of the first. The intentional relations specify the ways in which a speaker can affect the hearer's adoption of a nucleus by including a satellite. That is, not only is there a functional distinction between nucleus and satellite, there is also a classification of satellites according to how they help achieve the hearer's adoption of the nucleus. Translating this into a claim about ILS, text is structured by the ways in which some utterances are intended to help other utterances achieve their purpose.</Paragraph> <Paragraph position="18"> 4. 
Correspondence between Dominance and Nuclearity Now we are in a position to compare the explicit claims of G&S about ILS with the implicit ones of RST. Both theories agree that a discourse is structured into a hierarchy of non-overlapping constituents, segments in G&S and spans in RST. Each subconstituent may in turn be structured in exactly the same way as the larger constituent. Superficially, the similarity ends there because the internal structure of segments and spans is different. In G&S, the internal structure of a segment consists of any number of embedded segments plus what we are calling the core, the (usually unembedded) utterances that express the discourse segment purpose. In RST, the internal structure of a span consists of a nucleus, which we have characterized as expressing a belief or action the hearer is intended to adopt, a satellite, which is intended to facilitate that adoption, and an intentional relation between the nucleus and satellite. [Footnote 2: As discussed in Mann and Thompson (1988), a motivation relation occurs when a speaker intends the satellite to increase the hearer's desire to perform the action specified in the nucleus. An evidence relation occurs when a speaker intends the satellite to increase the hearer's belief in the nucleus.]</Paragraph> <Paragraph position="19"> If we look more closely at the correspondence between dominance and nuclearity, we find that the structure of spans and segments is nearly identical. Specifically, an embedded segment corresponds to a satellite, and the core corresponds to the nucleus.</Paragraph> <Paragraph position="20"> Or, because G&S do not have the notion of core in their theory, a more accurate characterization of the correspondence would be that the nucleus manifests a dominating intention, while a satellite manifests a dominated intention. That is, dominance in G&S corresponds closely to nuclearity in RST. 
There is a relationship, which we can crudely characterize as that of linguistic manifestation, that links the nucleus to a dominating intention and a satellite to a dominated intention. Exactly how to derive a communicative intention from an utterance, and vice versa, is one of the main research issues in computational linguistics. Here we simply assume that an utterance conveys either a belief or an action p and thereby makes manifest the speaker's intention that the hearer adopt belief in or an intention to perform p.</Paragraph> <Paragraph position="21"> The correspondence suggests a mapping between G&S linguistic structure and RST text structure. An embedded segment in G&S will be analyzed as a satellite in RST, and the segment core will be the nucleus. When there are multiple embedded segments in G&S, each subsegment will be analyzed as an RST satellite. In these cases of multiple subsegments, the RST structure will depend on whether the RST relations are the same or different. The entire segment may be a single RST span with the G&S core as nucleus and each subsegment as a satellite of that nucleus. This occurs when the multiple satellites bear the same RST relation to the nucleus. Alternatively, the G&S core and an adjacent subsegment may be analyzed as an RST nucleus and satellite, forming an RST span. This span is then the nucleus of a higher span in which the satellite is an additional G&S subsegment from the same segment. This occurs when the multiple satellites bear different relations to their nucleus.</Paragraph> <Paragraph position="22"> Because cores are a central aspect of the mapping between the two theories, and because cores are not part of the G&S proposal, it is natural to ask whether a segment necessarily has a core. Given the nature of segment purposes, a coreless segment seems intuitively unlikely. 
Recall that segment purposes, like the utterance intentions discussed by Grice, have the property that they are intended to achieve their effect in part from being recognized. The core has an important function: it manifests the purpose of the segment. Without a core, the segment purpose must be inferred from the subsegments alone. In such a case, the speaker intends that the hearer recognize a purpose, but does not supply an utterance that manifests that purpose.</Paragraph> <Paragraph position="23"> The question of whether or not coreless segments actually occur, however, is best answered by corpus analysis rather than theorizing. For our present purposes, we wish to consider the possibility of a coreless segment only because such a segment would complicate the mapping between the two theories presented above. In G&S, the definition of linguistic structure does not require a segment to contain a core. In the RST schemas considered thus far, a span always consists of a nucleus and satellite.</Paragraph> <Paragraph position="24"> A less common schema pattern, known as the joint schema, contains multiple spans with no nucleus-satellite distinction among them joined into a single span. Should a coreless segment occur in a G&S analysis, it can be mapped to a joint schema in RST.</Paragraph> <Paragraph position="25"> Building on the correspondence between dominance and nuclearity, we raise two issues in the following sections. First, how do informational relations fit into the discourse structure? Second, what synthesis of the two theories emerges when we recognize the correspondence? (a) Come home by 5.</Paragraph> <Paragraph position="26"> (b) Then we can go to the store before it closes.</Paragraph> <Paragraph position="27"> (a) Come home by 5:00.</Paragraph> <Paragraph position="28"> (b) Then we can go to the store before it closes.</Paragraph> </Section> <Section position="4" start_page="414" end_page="416" type="metho"> <SectionTitle> 5. 
Informational Structure </SectionTitle> <Paragraph position="0"> Moore and Pollack (1992) argued that RST defines two types of relations: intentional relations, which arise from the ways in which consecutive discourse elements participate in the speaker's plan to affect the hearer's mental state, and informational relations, which obtain between the content conveyed in consecutive elements of a coherent discourse. This is consistent with Mann and Thompson's (1988, 256) distinction between &quot;presentational&quot; (intentional) and &quot;subject matter&quot; (informational) relations. However, while Mann and Thompson maintain that for any two consecutive elements of a coherent discourse, one rhetorical relation will be primary (i.e., related by an informational or an intentional relation), Moore and Pollack showed that discourse interpretation and generation require that intentional and informational analyses exist simultaneously. Thus, in addition to the Intentional Linguistic Structure discussed so far, a discourse may simultaneously have an informational structure, imposed by domain relations among the objects, states, and events being discussed.</Paragraph> <Section position="1" start_page="414" end_page="415" type="sub_section"> <SectionTitle> 5.1 Can Intentional and Informational Structure Differ in RST? </SectionTitle> <Paragraph position="0"> In addition to their claim that intentional and informational analyses must co-exist, Moore and Pollack presented an example in which the intentional and informational relations can impose a different structure on the discourse. It is important to understand, however, that their example shows that the discourse structure determined by informational relations as defined in RST can be incompatible with the one determined by intentional relations. Here we argue that the problem is due to the inclusion of nuclearity in the definition of RST subject matter (informational) relations. 
As shown in Figure 3, the incompatibility arises because the nucleus and satellite of the intentional relation may be inverted in the RST informational relation. 3 [Figure caption: Either relatum may be the nucleus when an instance of a domain relation is used.]</Paragraph> <Paragraph position="1"> In Section 4, we argued that nuclearity in an RST analysis is an implicit claim about speaker intentions, corresponding to the G&S relation of dominance among intentions.</Paragraph> <Paragraph position="2"> That is, nuclearity rightly belongs in the definitions of intentional relations. In contrast, informational relations, properly construed, should not distinguish between nucleus and satellite in their definitions. As an example, consider the pair of RST relations volitional-cause and volitional-result. The volitional-cause relation is defined as one in which the nucleus presents a volitional action and the satellite presents a situation that could have caused the agent to perform the action. The effect of this relation is that the reader &quot;recognizes the situation presented in the satellite as a cause of the volitional action presented in the nucleus.&quot; The volitional-result relation is nearly identical except that the cause of the action is the nucleus and the result is the satellite. Why does RST need two relations to capture this? The reason is that the same domain relation, call it cause-effect, links a cause and effect regardless of which is the nucleus. In Figure 4, note that, while (a) causes (b), either (a) or (b) can be the nucleus of the relation. 
For a particular instance of a cause-effect in the domain, it is equally plausible for a speaker to mention the effect to facilitate the hearer's adoption of belief in the cause, as would be suggested by context I in Figure 4, or to mention the cause to facilitate the hearer's adoption of belief in the effect, as suggested by context II.</Paragraph> <Paragraph position="3"> Moreover, this is precisely what the intentional relations capture. By incorporating the nucleus-satellite distinction into the definitions of RST informational relations, these relations include an implicit analysis of intentional structure. As a consequence, strict application of the RST informational relations can result in a different structure than that imposed by the intentional relations, and this is the source of the problem noted by Moore and Pollack. Because nuclearity can only be determined by consideration of intentions, and intentional and informational analyses of a discourse must co-exist, we argue that the solution to the problem is to properly relegate information about nuclearity (intention dominance) to the intentional analysis, and remove it from definitions of informational relations. In this way, these two determinants of discourse structure cannot conflict. 
In addition, note that this is preferable to adding surplus informational relations to allow either relatum to be the nucleus (as was done in the volitional-cause and volitional-result case) because (1) this obscures the fact that relations such as volitional-cause and volitional-result appeal to the same underlying domain relation and (2) the proliferation of relations weakens the restrictive power of the framework.</Paragraph> </Section> <Section position="2" start_page="415" end_page="416" type="sub_section"> <SectionTitle> 5.2 Relationship between ILS and Informational Structure </SectionTitle> <Paragraph position="0"> Once we recognize that an informational analysis is needed simultaneously with ILS and that the informational analysis should be determined by domain relations without reference to how the relations are employed by the speaker, exactly how to determine informational structure becomes an underconstrained question. Should all domain Moser and Moore Discourse Structure relations across utterances be analyzed in the informational structure? What patterns of informational relations are employed in realizing various kinds of intentions, and what analysis provides a reliable means for identifying such patterns? Final answers to these questions require further research. Because constraints may be needed in order to make progress on these issues, we point out two approaches to constraining the definition of informational structure. In Section 6.2 we suggest that RST informational relations provide a version of one of these approaches.</Paragraph> <Paragraph position="1"> The most inclusive definition of informational structure would contain all the domain relations between the things being talked about. Included would be causal relations of various sorts, set relations, relations underlying bridging inferences (Clark and Haviland 1977), and the relation of identity between domain objects underlying coreference of noun phrases across utterances. 
By this definition, informational structure is a complex network of domain relations that is defined independently of the intentional structure. Keeping track of all domain relations in a discourse is an overwhelming task and is often infeasible. One approach to constraining informational structure is to define it as parasitic on intentional structure. The informational structure would contain an accompanying informational relation for each intentional relation. A second approach to constraining informational structure is to define it as a network of domain relations with type restrictions on the relata. The informational structure would contain only the relations among situations, events, and actions, that is, the types of entities referred to by clauses.</Paragraph> </Section> </Section> <Section position="5" start_page="416" end_page="418" type="metho"> <SectionTitle> 6. A Partial Synthesis </SectionTitle> <Paragraph position="0"> The discussion in Section 4 suggests that RST and G&S share a large amount of common ground. That is, many of the claims in the two theories, although formulated differently, are essentially equivalent. To begin this section, we state the common ground that emerges from relating dominance and nuclearity. Then we briefly review the claims of each theory that are outside this common ground. Each theory has some consistent ground, additional claims that concern issues simply not addressed by the other theory. The actual contentious ground, claims made by one theory that are incompatible with the other, is quite small.</Paragraph> <Section position="1" start_page="416" end_page="417" type="sub_section"> <SectionTitle> 6.1 Common Ground </SectionTitle> <Paragraph position="0"> Building on the correspondence between dominance and nuclearity, a partial synthesis of G&S and RST would be roughly the following: A segment/span arises because its speaker is attempting to achieve a communicative purpose. 
Such purposes have the feature that they are achieved in part by being recognized by hearers. Thus, the plan for achieving the purpose typically has two distinct parts: (1) one or more utterances that serve to make the purpose manifest by expressing a belief or action for the hearer to adopt (the core/nucleus) and (2) a set of subparts that contribute to achieving the purpose by manifesting subpurposes dominated by that purpose (the embedded segments / satellites).</Paragraph> <Paragraph position="1"> Note that this synthesis encompasses the ILS claims of both theories regarding the example discourse in Figure 1. DS0 is a segment/span designed to achieve the purpose I0. The plan for achieving I0 is to first manifest I0 by expressing the action in (a), the core/nucleus, and then to contribute to the achievement of I0 by providing the motivation in (b-c), the embedded segment/satellite. In turn, DS1 is a segment/span designed to achieve the purpose I1 by first manifesting I1 in the expression of the core/nucleus (b) and then providing evidence in the embedded segment/satellite (c).</Paragraph> <Paragraph position="2"> Finally, I2 is made manifest by (c), though no additional contribution to achieving this intention is provided.</Paragraph> </Section> <Section position="2" start_page="417" end_page="417" type="sub_section"> <SectionTitle> 6.2 Consistent Ground </SectionTitle> <Paragraph position="0"> RST and G&S each make claims about issues not addressed by the other theory. We review these claims briefly in order to establish that they are consistent.</Paragraph> <Paragraph position="1"> First, the two theories offer different but consistent perspectives on the ordering of segments/spans. In G&S, intentions may be related by satisfaction-precedence in addition to dominance. One intention satisfaction-precedes another when it must be realized before the other. 
This relation between intentions partially constrains the order of what is said and thus introduces a distinction between necessary order, originating with a satisfaction-precedence relation of the underlying intentions, and artifactual order, additional ordering that must be imposed to produce linearized text. G&S makes no claim about the relative ordering between a core and embedded segments.</Paragraph> <Paragraph position="2"> In RST, because the underlying intentions are not analyzed explicitly, the distinction between necessary and artifactual order is not available. Instead, the relative ordering of core/nucleus and embedded segment/satellite is highlighted. RST's authors claim that many relations have a typical ordering of their nucleus and satellite. The two theories address different aspects of ordering without suggesting any points of contention.</Paragraph> <Paragraph position="3"> Second, in addition to intentional and linguistic structure, G&S posits an attentional structure. This component determines which discourse entities will be most salient and thereby imposes constraints on available referents for pronouns and reduced definite NPs. This is an important issue, but one that RST simply does not make claims about. As noted earlier, the recognition of intentional structure is crucial for anaphora resolution, among other discourse-processing tasks. By synthesizing RST and G&S, work done using both approaches can be applied to accomplishing these tasks during interpretation and generation.</Paragraph> <Paragraph position="4"> Finally, while G&S recognize that informational structure is a cue to recognition of intentional structure, the theory does not provide detail. As discussed in Sections 5.1 and 5.2, the analysis of informational relations provided by RST is inadequate and incomplete. 
In either theory, more research is needed to understand how informational relations are used to achieve discourse intentions.</Paragraph> </Section> <Section position="3" start_page="417" end_page="418" type="sub_section"> <SectionTitle> 6.3 Contentious Ground </SectionTitle> <Paragraph position="0"> The claims of G&S and RST discussed so far have been, we argued, either equivalent or compatible. We now turn to a point of contention between the two theories.</Paragraph> <Paragraph position="1"> There are distinctions among the RST intentional relations that, in G&S, would be subtypes of the dominance relation among intentions. However, G&S specifies that the only relations among intentions affecting discourse structure are dominance and satisfaction-precedence. Should the various RST intentional relations be incorporated into a synthesized theory? The question may be approached from either an empirical or a practical perspective, and the two perspectives may lead to different answers. To answer the question empirically, one could code a corpus for its intentional relations and attempt to identify linguistic cues that correlate with distinctions among the relations. To answer the question practically, one would consider whether distinct intentional relations are useful for computational systems that generate and/or interpret natural language. In fact, the practical application of these intentional relations may be quite different in generation and interpretation systems. Further research is required to resolve this question.</Paragraph> </Section> </Section> </Paper>