<?xml version="1.0" standalone="yes"?>
<Paper uid="W00-1427">
  <Title>Robust, Applied Morphological Generation</Title>
  <Section position="2" start_page="0" end_page="201" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Most approaches to natural language generation (NLG) ignore morphological variation during word choice, postponing the computation of the actual word forms to be output to a final stage, sometimes termed 'linearisation'. The advantage of this setup is that the syntactic/lexical realisation component does not have to consider all possible word forms corresponding to each lemma (Shieber et al., 1990). In practice, it is advantageous to have morphological generation as a postprocessing component that is separate from the rest of the NLG system. One benefit is that, since there are no competing claims on the representation framework from other types of linguistic and non-linguistic knowledge, the developer of the morphological generator is free to express morphological information in a perspicuous and elegant manner. A further benefit is that localising morphological knowledge in a single component facilitates more systematic and reliable updating. From a software engineering perspective, modularisation is likely to reduce system development costs and increase system reliability. As an individual module, the morphological generator will be more easily shareable between several different NLG applications, and more easily integrated into new ones. Finally, such a generator can be used on its own in other types of applications that do not contain a standard NLG syntactic/lexical realisation component, such as text simplification (see Section 3). In this paper we describe a fast and robust generator for the inflectional morphology of English that generates a word form given a specification of a lemma, a part-of-speech (PoS) label, and an inflectional type. The morphological generator was built using data from several large corpora and machine readable dictionaries. 
It does not contain an explicit lexicon or word-list, but instead comprises a set of morphological generalisations together with a list of exceptions for specific (irregular) word forms.</Paragraph>
    <Paragraph position="1"> This organisation into generalisations and exceptions can save time and effort in system development, since the addition of new vocabulary that has regular morphology does not require any changes to the generator. In addition, the generalisation-exception architecture can be used to specify, and also override, preferences in cases where a lemma has more than one possible surface word form given a particular inflectional type and PoS label.</Paragraph>
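The generalisation-exception organisation described above can be illustrated with a minimal Python sketch. This is not the generator's actual rule set (which is written for Flex): the table entries, the three inflectional types, the spelling rules, and the names `EXCEPTIONS`, `regular_form` and `generate` are all assumptions chosen for illustration.

```python
# Illustrative sketch only: the rules and table entries are assumptions,
# not the rule set of the generator described in the paper.

VOWELS = "aeiou"

# Exception list: irregular forms keyed by (lemma, PoS label, inflectional type).
EXCEPTIONS = {
    ("child", "N", "s"): "children",
    ("go", "V", "ed"): "went",
    ("be", "V", "ing"): "being",
}


def ends_in_consonant_y(lemma):
    """True for lemmas such as 'city' or 'carry', false for 'day'."""
    return len(lemma) > 1 and lemma.endswith("y") and lemma[-2] not in VOWELS


def regular_form(lemma, inflection):
    """Apply a few regular English spelling generalisations."""
    if inflection == "s":
        if lemma.endswith(("s", "x", "z", "ch", "sh")):
            return lemma + "es"
        if ends_in_consonant_y(lemma):
            return lemma[:-1] + "ies"
        return lemma + "s"
    if inflection == "ed":
        if lemma.endswith("e"):
            return lemma + "d"
        if ends_in_consonant_y(lemma):
            return lemma[:-1] + "ied"
        return lemma + "ed"
    if inflection == "ing":
        if lemma.endswith("e") and not lemma.endswith("ee"):
            return lemma[:-1] + "ing"
        return lemma + "ing"
    raise ValueError("unknown inflectional type: %r" % inflection)


def generate(lemma, pos, inflection):
    """Consult the exception list first, then fall back to the generalisations."""
    key = (lemma, pos, inflection)
    if key in EXCEPTIONS:
        return EXCEPTIONS[key]
    return regular_form(lemma, inflection)
```

In this sketch a new regular lemma such as "blog" needs no new entry, while an irregular form is handled by adding one line to the exception table. The sketch deliberately omits many phenomena, for example consonant doubling, so it should not be read as a complete account of English inflection.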
    <Paragraph position="2"> The generator is packaged up as a Unix 'filter', making it easy to integrate into applications. It is based on efficient finite-state techniques, and is implemented using the widely available Unix Flex utility (a reimplementation of the AT&amp;T Unix Lex tool) (Levine et al., 1992). The generator is freely available to the NLG research community (see Section 5 below).</Paragraph>
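To make the 'filter' packaging concrete, the following Python sketch shows the general shape of a line-oriented filter: it reads text from one stream, rewrites each recognised request, and writes the result to another stream. The token format lemma+inflection_PoS, the pass-through behaviour for unrecognised tokens, and the names `parse_request`, `naive_generate`, `run_filter` and `main` are assumptions made for this example, and the placeholder generator does plain concatenation rather than real morphology.

```python
import sys


def parse_request(token):
    """Split an assumed 'lemma+inflection_PoS' token into its three parts."""
    body, _, pos = token.rpartition("_")
    lemma, _, inflection = body.rpartition("+")
    if not (lemma and inflection and pos):
        raise ValueError("expected lemma+inflection_PoS, got %r" % token)
    return lemma, inflection, pos


def naive_generate(lemma, inflection, pos):
    """Placeholder generator: plain concatenation, no spelling rules."""
    return lemma + inflection


def run_filter(infile, outfile, generate=naive_generate):
    """Copy infile to outfile, replacing each parseable request by a word form."""
    for line in infile:
        forms = []
        for token in line.split():
            try:
                forms.append(generate(*parse_request(token)))
            except ValueError:
                # Pass through anything that is not a generation request.
                forms.append(token)
        outfile.write(" ".join(forms) + "\n")


def main():
    """Entry point for use in a shell pipeline (standard input to standard output)."""
    run_filter(sys.stdin, sys.stdout)
```

Because `run_filter` takes arbitrary file-like objects and a pluggable `generate` function, the same code can be driven from a pipeline through `main()` or called directly from another application.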
    <Paragraph position="3"> The paper is structured as follows. Section 2 describes the morphological generator and evaluates its accuracy. Section 3 outlines how the generator is put to use in a prototype system for automatic simplification of text, and discusses a number of practical morphological and orthographic issues that we have encountered. Section 4 relates our work to that of others, and we conclude (Section 5) with directions for future work.</Paragraph>
  </Section>
</Paper>