<?xml version="1.0" standalone="yes"?>
<Paper uid="W94-0328">
  <Title>Toward a Multidimensional Framework to Guide the Automated Generation of Text Types</Title>
  <Section position="1" start_page="0" end_page="230" type="abstr">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> A central concern limiting the sophistication of text generation systems today is the ability to make appropriate choices given the bewildering number of options present during the planning and realisation processes. As illustrated in several systems [Hovy 88, Bateman &amp; Paris 89, Paris 93], the same core communication may be realised in numerous different ways, depending (among other factors) on the nature and relation of the interlocutors, the context of the communication, the media employed, etc. The combinatoric number of possibilities of all such factors is extremely large. Since most of them are not well understood at this time, automated text generation may appear to be a hopeless endeavour.</Paragraph>
    <Paragraph position="1"> Fortunately, the picture is not altogether bleak. Given that certain types of communicative situations consistently give rise to characteristic, recognisable genres or text types, one can attempt to characterise each genre or text type in terms of the set of generator decisions or rules responsible for producing those characteristics, and then create prespecified, genre-specific collections of features, formulated as decision rule criteria, for subsequent use (this point has been made before, in [Patten 88] and [Bateman &amp; Paris 89]). With this aim in mind, two major questions arise: 1. Is there a regular categorisation of genres or text types? 2. How can one most easily determine the genre-determining features for given texts? In this paper we address both questions. First we report on work developing a functionally motivated framework to provide a matrix for the description, comparison, and classification of a body of texts. This framework can act as the background for research on discourse phenomena, text planning, and realisation, and can enable groups working with different texts to relativise their results in terms of the matrix. The approach involves a systematic search for correlations between linguistic form and function in discourse, a discovery of the relation between meaning and wordings that accounts for the organization of linguistic features in each text type. This task cannot be fully performed without linking the functions of particular linguistic features to variation in the communicative situation, since, as users and receivers of language, people produce texts whose communicative function has to be interpreted in terms of the concrete situation in which they were produced. (The first author was supported by ARPA Contract MDA-904-91-C-5224. The second author's portion is based on Deliverable R1.1.1a for DANDELION (ESPRIT Basic Research Project 6665).)</Paragraph>
    <Section position="1" start_page="229" end_page="230" type="sub_section">
      <SectionTitle>
7th International Generation Workshop * Kennebunkport, Maine * June 21-24, 1994
</SectionTitle>
      <Paragraph position="0"> The knowledge of the meaning potential associated with a generic situation is called register.</Paragraph>
      <Paragraph position="1"> Register has been the subject of much research in Linguistics [Ferguson 83, Brown &amp; Fraser 79, Hymes 74], especially in Systemic-Functional Linguistics [Halliday &amp; Hasan 89, Ure 71, Gregory 88, Ghadessy 88, Caiferel 1991, etc.]. Within SFL, various perspectives have been taken: Halliday views register from the lexicogrammatical perspective, while [Martin 92] sees it operating at the semiotic level. With a phenomenon as complicated as register, it is inevitable that conflicting pictures exist; however, in this paper we do not devote too much time to any specific view, but rather take a slightly more general approach to make our points relevant to all. We view registers simply as stable configurations of features at all levels -- semiotic, grammatical, lexical, phonological -- linked together. In the first part of the paper, then, we outline several high-level register networks, somewhat more general than those usually provided, drawn from a variety of sources and organized according to communicative metafunction.</Paragraph>
      <Paragraph position="2"> In the second half of the paper, we describe a semi-automatic method to determine genre-defining features for a given text, and show how the degree of genre-specificity can be measured quantitatively. This follows on register-oriented work in computational research on language generation, in particular that of [Patten 88, Bateman &amp; Paris 89, Bateman &amp; Paris 91].</Paragraph>
      <Paragraph position="3"> Our work in some ways follows upon that of Bateman and Paris, who, using three variations of a sentence as illustration, outline an ambitious 5-step method for the definition of register and the control of a generator program: 1. text analysis; 2. classification of features according to user; 3. classification of features with respect to register type; 4. creation of register networks; and 5. specification of generator control. We take a less ambitious and somewhat different approach to some of the same issues (steps 1, 3, and 4), and develop a semi-automated feature collection technique, using as illustration 10 clauses from the instruction stage of a recipe. The contribution of this paper is twofold: 1. somewhat more high-level and comprehensive register networks, drawn from several sources and organized according to communicative metafunction (in contrast to steps 3 and 4); 2. a semi-automated abductive method for identifying grammatical features that are register-defining (in contrast to step 1).</Paragraph>
      <Paragraph position="4"> 2 The components of the communicative situation  According to Halliday, language performs three principal functions simultaneously: the ideational function (to understand the interlocutors' physical, mental, and emotional environment), the interpersonal function (to act on other people in it), and the textual function (to employ the media and situation at hand for optimal communication) [Halliday 85]. In each instantiated communication, the speaker performs a series of linguistic choices from these three metafunctions of language: in Systemic terms, he or she selects features from language-based system networks assigned to the three different functions.</Paragraph>
      <Paragraph position="5"> The communicative situation -- topic, interlocutors, context, etc. -- is closely correlated with, and helps determine, the configuration of meanings selected from these three functional components of language. Given this correlation, each particular communicative situation is partitioned into three regions corresponding to the linguistic ones: the experiential meanings of the text reflect the FIELD, the interpersonal meanings reflect the TENOR, and the textual meanings reflect the MODE of the discourse. We can say that field, tenor, and mode are the actual selections (from the ideational, the interpersonal, and the textual components of the language code respectively) taken in a particular event surrounding and including the language act.</Paragraph>
      <Paragraph position="6"> In the remainder of this section we briefly describe the three aspects of communication. More details are provided in the longer version of this paper, available from the authors.</Paragraph>
      <Paragraph position="7"> The field of discourse. According to Halliday, the field of discourse refers to &quot;what is happening, to the nature of the social action that is taking place: what is it that the participants are engaged in, in which the language figures as an essential component&quot; [Halliday &amp; Hasan 89]. The field of discourse can also be called the text's experiential domain, which includes the text's subject matter, that is, its ideational or propositional content. The network in Figure 1 illustrates these aspects.</Paragraph>
      <Paragraph position="8"> The tenor of discourse. Where field predicts the range of meaning potentials in the experiential component of the language code, the tenor of discourse predicts the selection of options in the interpersonal component. According to Halliday and Hasan, &quot;the tenor of discourse refers to who is taking part, to the nature of the participants, their statuses and roles: what kinds of role relationships obtain among the participants..., both the types of SPEECH ROLE that they are taking on in the dialogue and the whole cluster of socially significant relationships in which they are involved&quot; ([Halliday &amp; Hasan 89], p. 27). The tenor of discourse involves the selection of a number of options in the subsystems that configure the participants' speech roles. Among these speech roles we distinguish two principal types: one set of systems is concerned with the</Paragraph>
    </Section>
  </Section>
</Paper>