<?xml version="1.0" standalone="yes"?> <Paper uid="W97-1507"> <Title>Application-driven automatic subgrammar extraction</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Although computational linguistics has reached a point where large-coverage grammars are well developed and available in several formal traditions, the use of these research results in actual applications and in specific domains is still unsatisfactory. One reason is that large-scale grammar specifications incur a seemingly unnecessary burden of space and processing time that is often out of proportion to the simplicity of the particular task. The usual alternatives for natural language generation to date have been the handcrafted development of application- or sublanguage-specific grammars or the use of template-based generation grammars. 1 This work was partially supported by the DAAD through grant D/96/17139.</Paragraph> <Paragraph position="1"> In (Busemann, 1996) both approaches are combined, resulting in a practical small generation grammar tool. But the grammars are still handwritten or, if extracted from large grammars, must be adapted by hand. In general, both the template approach and the handwritten application grammar approach compromise the idea of a general NLP system architecture with reusable bodies of general linguistic resources.</Paragraph> <Paragraph position="2"> We argue that this customization bottleneck can be overcome by the automatic extraction of application-tuned, consistent generation subgrammars from given, proven large-scale grammars. In this paper we present such an automatic subgrammar extraction tool. The underlying procedure is valid for grammars written in typed unification formalisms; it is here carried out for systemic grammars within KPML (Bateman, 1997), a development environment for text generation. 
The input is a set of semantic specifications covering the intended application. This set can be provided either by generating a predefined test suite or automatically, by running the particular application during a training phase.</Paragraph> <Paragraph position="3"> The paper is structured as follows. First, we give an algorithm for automatic subgrammar extraction for arbitrary systemic grammars; second, we illustrate its application to generation in the domain of 'encyclopedia entries'. To conclude, we discuss several issues raised by this work, including its relevance for typed unification-based grammar descriptions and the possibilities for further improvements in generation time.</Paragraph> </Section> </Paper>