<?xml version="1.0" standalone="yes"?> <Paper uid="E99-1033"> <Title>Investigating NLG Architectures: taking style into consideration</Title> <Section position="2" start_page="0" end_page="237" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> We started our research with a survey of 19 applied natural language generation (NLG) systems (Paiva, 1998) and noticed that: * almost all the systems followed a pipeline model; * there was general agreement on the core NLG tasks that a system should perform (e.g., aggregation, lexicalisation); * the surveyed systems mainly differed in the order in which the NLG tasks are executed (see Cahill et al., 1999). We also noticed that the texts produced by the various systems were apparently quite different stylistically (although we did not have a formal method to measure how different they were). In order to explain how this variety of texts was obtained with the same type of architecture (i.e., pipeline), we have put forward the hypothesis that the order in which the NLG tasks are executed influences the kind of text that can be obtained, i.e. a certain order would facilitate the generation of a certain type of text whilst another would not.</Paragraph> <Paragraph position="1"> This hypothesis is in line with other researchers' results which purport to show that architectural aspects of an NLG system depend on the characteristics of the text to be generated and vice versa. 
For instance: * Robin (1994) argues for a revision-incremental architecture for the generation of structurally complex text with floating content -- i.e., content that can appear anywhere in the text and is opportunistically realised only if stylistic factors of the surrounding text allow; * Inui and colleagues (1992) conclude that, in order to avoid generating text with ambiguous and complex sentences, a revision architecture is necessary, with the revision module placed at the end of the generation process (i.e., after linguistic realisation) 1; * Reiter (ms.) reports that pipeline architectures cannot deal with constraints on the length of a text.</Paragraph> <Paragraph position="2"> It is difficult, however, to reconcile these results within a unified perspective, since each of the reported works started from a different standpoint and, generally, had different aims in mind.</Paragraph> <Paragraph position="3"> We believe that it is possible to relate the characteristics of text mentioned above (such as complexity, ambiguity and sentence length) 2 to style, 3 and that we can gain insight into NLG architectures by having a systematic way to classify texts by their stylistic properties, so that we can analyse the architectural aspects in relation to this stylistic classification.</Paragraph> <Paragraph position="4"> We therefore start from the point of view that it is reasonable to assume that certain styles of text demand a more specialised type of architecture than others (for example, a revision versus a pipeline architecture 4). Our idea is to develop a methodology for studying which architectures are appropriate for generating texts in a specific style, or in more than one style.</Paragraph> <Paragraph position="5"> 1 Some aspects related to the complexity of a sentence (e.g., sentence length) can only be measured precisely when the text has already been generated. 2 For instance, complexity and (lack of) ambiguity are factors that can be related to the 'clarity' of a text. 
In this figure, one group expresses texts that are formal, concise, and not interactive (text type 1 -- a possible example is news columns in scientific magazines). Another group (text type 2) expresses texts that are informal, highly concise, and highly interactive (e.g., short articles in TV magazines answering readers' questions). A third one (text type 3) can be considered neither formal nor informal, is not concise and has a medium value [...] and the partition of the corpus into three text types (B).</Paragraph> <Paragraph position="6"> Hovy (1988) used a similar approach but characterised style in such an informal way that its relation to architectural aspects was compromised; in particular, he could not ensure that he was not missing important relations between style and generator decisions 5.</Paragraph> <Paragraph position="7"> In this paper we present a characterisation of style and an approach for dealing with it which, we hope, will provide a means to clarify the interaction between the architectures of NLG systems and the type of texts they can, or need to, generate. The paper continues as follows: in Section 2 we present the definition of style we are working with, and in Section 3 we show how this characterisation will help us to deal with aspects of architecture. In Section 4 we discuss the expected results and, finally, we conclude by describing where we are in this process.</Paragraph> </Section> </Paper>