<?xml version="1.0" standalone="yes"?> <Paper uid="C92-3158"> <Title>Generation of Extended Bilingual Statistical Reports</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> During the past few years we have been concerned with developing models for the automatic planning and realization of report texts within technical sub-languages of English and French. Since 1987 we have been implementing Meaning-Text language models (MTMs) \[6, 7\] for the task of realizing sentences from semantic specifications that are output by a text planner. A relatively complete MTM implementation for English was tested in the domain of operating system audit summaries in the Gossip project of 1987-89 \[3\]. At COLING-90 a report was given on the fully operational FoG system for generating marine forecasts in both English and French at weather centres in Eastern Canada \[1\]. The work reported on here concerns the experimental generation of extended bilingual summaries of Canadian statistical data. Our first focus has been on labour force surveys (LFS), where an extensive corpus of published reports in each language is available for empirical study. The current LFS system has built on the experience of the two preceding systems, but goes beyond either of them 1. In contrast to FoG, but similar to Gossip, LFS uses a semantic net representation of sentences as input to the realization process. Like Gossip, LFS also makes use of theme/rheme constraints to help optimize lexical and syntactic choices during sentence realization. But in contrast to Gossip, which produced only English texts, LFS is bilingual, making use of the conceptual level of representation produced by the planner as an interlingua from which to derive the linguistic semantic representations for texts in the two languages independently. Hence the LFS interlingua is much &quot;deeper&quot; than FoG's deep-syntactic interlingua. 
This allows us to introduce certain semantic differences between English and French sentences that we observe in natural &quot;translation twin&quot; texts.</Paragraph> <Paragraph position="1"> 1The LFS system is being developed by CoGenTex Inc.</Paragraph> <Paragraph position="2"> under contract 36902-O-0749/Ol-XAF with Communications Canada, Canadian Workplace Automation Research Centre.</Paragraph> <Paragraph position="3"> The first four authors have current academic affiliations with the Université de Montréal. Polguère is now at the National University of Singapore.</Paragraph> <Paragraph position="4"> LFS is based on a much more detailed text planning process than was attempted earlier, and results in texts of much greater length and complexity. For example, sentence order within certain parts of statistical texts depends on data salience, therefore requiring locally dynamic text planning. Text planning also includes tests that allow for appropriate use of certain quantifier expressions (e.g., all, most), evaluative words such as also and only, and intra-sentential pronominalization.</Paragraph> <Paragraph position="5"> LFS also incorporates some substantial extensions in our use of the Meaning-Text framework. First, it makes more use of lexical functions (cf. \[8\]), the mechanism in MTMs that allows computation of appropriate collocations and semantically related lexemes needed in paraphrasing and in conflict resolution during generation. Second, the grammar is more extensive, covering important types of conjunction and ellipsis.</Paragraph> <Paragraph position="6"> Generation in the domain of employment statistics is not new. Roesner's Semtex system \[10\] produced German (and later, English) summaries of such data that are remarkably similar in style as well as content to our own. 
The difference lies in our use of a powerful linguistic model that promises to simplify the problem of scaling up the generator to more complex and varied texts, or of extending it to other varieties of text. Furthermore, the LFS project is using feedback from domain experts to refine the rules used in both text planning and realization.</Paragraph> </Section></Paper>