<?xml version="1.0" standalone="yes"?> <Paper uid="W96-0502"> <Title>SPLAT: A sentence-plan authoring tool</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 The Penman system: The need for authoring </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.1 The input to Penman: Sentence Plan Language </SectionTitle> <Paragraph position="0"> The Penman generation system [Penman, 1989] is one of the most comprehensive natural language generation systems in the world. It contains a very large systemic functional grammar of English, the Nigel grammar, and an extensive semantic ontology, the upper model. Input to the Penman system is defined by the Sentence Plan Language [Penman, 1991] and given in the form of SPL plans. The system processes the SPL input by querying different knowledge resources, including the Nigel grammar, the upper model, and a domain model, eventually producing a realization of the SPL plan in the form of an English sentence.</Paragraph> <Paragraph position="1"> Each SPL plan contains one or more head concepts from the upper model and a variable that enables the plan to be referred to by other SPL plans. The head concept can be modified by a number of keywords, signalling the underlying grammar to generate different sentence patterns. For example, one common type of keyword is a relation, whose value may be another SPL plan, or a reference to such a plan.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.2 Why using SPL is hard </SectionTitle> <Paragraph position="0"> One of the major difficulties in learning to use Penman has been acquiring the expertise to construct SPL plans. 
The vast number of possible keywords and values, coupled with the absence of any facility for storing and searching these items, made the task of building SPL plans very frustrating for the novice Penman user. In contrast, the experienced SPL developer would often draw on knowledge of previously constructed SPL plans in order to recycle bits of partial SPL plans, but had no convenient way to store and access this information. Various resources, the upper model, for instance, play an essential role in providing knowledge to guide the development of the SPL plan, but were virtually inaccessible to all but the Penman expert. Support for managing the construction of SPL plans and accessing the necessary resources in a systematic and user-friendly manner was almost completely lacking.</Paragraph> <Paragraph position="1"> 3 An authoring tool for SPL SPLAT is a Sentence Plan Language Authoring Tool that has been developed as part of the HealthDoc project [Jakeway, 1995, DiMarco et al., 1995] and aims to address many of the earlier difficulties in building the input SPL specifications for Penman. SPLAT allows the user to create SPL plans in a supportive on-line environment: a graphical, menu-driven interface provides guidance on the allowable structure of SPL plans and access to the various Penman resources, such as the upper model and the generator itself. As well, SPLAT provides an extensible bank of representative sentences and their SPL structures, from which the user can create new sentence plans. An important feature of SPLAT is that it provides a new view of the upper model.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.1 The modelling approach </SectionTitle> <Paragraph position="0"> SPLAT draws on some aspects of modelling theory in helping the user build SPL plans, specifically, by giving examples of SPL-plan templates and prefabricated SPL plans. 
The templates supply most of the format specific to a particular kind of SPL plan, thus reducing the need for the user to memorize SPL-plan syntax. This feature also helps reduce the possibility of syntactic errors in plan building. In addition, having a template display the allowable components of a particular type of SPL plan guides the user in exploring the Penman system and in learning how the different parts of an SPL plan interact.</Paragraph> <Paragraph position="1"> Previously constructed SPL plans provide models that may be modified or incorporated into the new plan. This extensible example set, the sentence bank, provides positive examples of how to construct SPL-plan templates. The user can retrieve the plan for a particular token in the sentence bank and modify it, or use it to aid in the construction of a new plan.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.2 Remodelling the upper model </SectionTitle> <Paragraph position="0"> The Penman upper model, a classification of various semantic concepts, has traditionally been divided into three disjoint concept hierarchies, a high-level split between the major semantic abstractions of English: processes, objects, and qualities. Processes can be thought of as verbs and other relational concepts, objects as nouns, and qualities as modifiers of processes and objects, i.e., adverbs and adjectives. However, early in the construction of SPLAT, it was noted that processes should be divided into two categories, those which modify the ideational content of a sentence and those which dictate the textual structure of a sentence. The former category describes most of the process hierarchy (i.e., verbs and most relations), whereas the latter describes the logical and rhetorical relations. 
Adding this category to SPLAT takes the upper model closer to the semantic classification described by [Matthiessen, 1991], with textual functions being separated from ideational functions.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.3 Building SPL templates </SectionTitle> <Paragraph position="0"> SPLAT provides template forms for each type of SPL plan: relations, processes, objects, and qualities. The user need only fill in appropriate values on the selected template.</Paragraph> <Paragraph position="1"> As well, for processes, the template is not a static structure, but changes for each type of process according to its roles, which are retrieved from the underlying knowledge base.</Paragraph> <Paragraph position="2"> For example, the template for a verbal process will display the roles relevant to this type of process: sayer, addressee, and saying. For each kind of template, SPLAT provides most of the roles necessary to construct this type of SPL plan.</Paragraph> <Paragraph position="3"> Each template also provides a facility to add keywords and values that are not present on the form. The template also gives the user access to a number of different tools and resources, including the actual SPL that gets constructed, and the generator's output from the constructed SPL. At any point in the development process, the user can choose to produce the partially built SPL plan or generate (through Penman) the English realization of the current structure.</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.4 The sentence bank </SectionTitle> <Paragraph position="0"> SPLAT stores the pre-built SPL plans for a set of sample sentences in a sentence bank. 
Each token of a sentence in the sentence bank is connected back to the SPL-plan template associated with it; i.e., the templates for a particular sentence and for its components are directly accessible from the sentence bank.</Paragraph> <Paragraph position="1"> Users can search the sentence bank to find examples of a particular sentence or partial sentence pattern. The corresponding SPL-plan template can then be used as the model or component of the new SPL plan being built. As users develop their own SPL plans, they can add them to the sentence bank by choosing the annotation feature on the current template.</Paragraph> </Section> <Section position="7" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.4.1 The purpose of annotations </SectionTitle> <Paragraph position="0"> Each word in the sentence bank is annotated with up to five levels of annotation: spelling, lexical item, part of speech, grammatical function, and upper-model concept. The spelling corresponds to the actual spelling of the word in the sentence and is retrieved from the generator output. The lexical item is the lexical unit used by Penman to generate the word, derived from the SPL input. The part-of-speech annotation for a particular word is derived from Brill's [1994] part-of-speech tagger. Before a sentence is entered into the sentence bank, it is passed to the tagger to determine the part of speech of each word in the sentence. The grammatical function of a word is derived from the output of the Penman generator (see footnote 1). The upper-model concept attributed to the word is retrieved from the SPL input. Not all words of a sentence will be annotated at each level, as some annotation levels might not apply to a particular word.</Paragraph> <Paragraph position="1"> Table 1 shows how SPLAT would annotate a sample sentence. Notice that the semantically more important words have full annotations, whereas the support words do not have lexical or conceptual information. 
These support words are grouped with their associated semantically meaningful words into tokens. These tokens correspond to SPL plans.</Paragraph> <Paragraph position="2"> For example, the word were is an auxiliary to the word produced from the concept CREATIVE-MATERIAL-ACTION because of tense requirements. (The cryptic syntactic part-of-speech tags are from the Penn Treebank tagset.) In the sentence bank, each token is presented as a unit that is linked to the underlying SPL-plan template, so that the template can be edited.</Paragraph> </Section> <Section position="8" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.4.2 Searching the sentence bank </SectionTitle> <Paragraph position="0"> The sentence bank contains a number of sample sentences which illustrate various types of SPL plans and constructions. The user may want to make use of existing plans, either using them to guide the construction of new plans, or modifying them to create new ones. To determine which sentence plan to use, the user searches the sentence bank with a pattern indicating the desired settings for the annotation levels.</Paragraph> <Paragraph position="1"> Footnote 1: SPLAT uses KPML [Bateman, 1995] version 0.8 or later to retrieve these values, as the KPML generator can easily return the full structure output of a sentence, whereas this information is much harder to retrieve from the standard Penman generator. Table 1. Annotation levels: spelling, lexical item, part-of-speech, grammatical, semantic. Sample sentence: people were making a new ship.</Paragraph> <Paragraph position="2"> SPLAT will retrieve all the sentences in the sentence bank which match the pattern. For example, if the sentence in Table 1 was in the sentence bank, and if the pattern specified that a word with a syntactic part-of-speech DT (determiner) was to be followed by a word whose lexical item was PERSON, then it would be retrieved. 
If, however, the pattern was the word those, followed by any number of words, followed by a word with a concept matching BELIEVE, then the sentence would not be retrieved.</Paragraph> </Section> </Section> </Paper>