<?xml version="1.0" standalone="yes"?>
<Paper uid="E93-1003">
  <Title>Experiments in Reusability of Grammatical Resources</Title>
  <Section position="2" start_page="0" end_page="12" type="abstr">
    <SectionTitle>
Abstract
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Substantial formal grammatical and lexical resources exist in various NLP systems and in the form of textbook specifications. In the present paper we report on experimental results obtained in manual, semi-antomatic and automatic migration of entire computational or textbook descriptions (as opposed to a more informal reuse of ideas or the design of a single &amp;quot;polytheoretic&amp;quot; representation) from a variety of formalisms into the ALEP formalism. 1 The choice of ALEP (a comparatively lean, typed feature structure formalism based on rewrite rules) was motivated by the assumption that the study would be most interesting if the target formalism is relatively mainstream without overt ideological commitments to particular grammatical theories. As regards the source formalisms we have attempted migrations of descriptions in HPSG (which uses fully-typed feature structures and has a strong 'non-derivational' flavour), ETS (an un-typed stratificational formalism which essentially uses rewrite rules for feature structures and has run-time non-monotonic devices) and LFG (which is an un-typed constraint and CF-PSG based formalism with extensions such as existential, negative and global well-formedness constraints).</Paragraph>
    <Paragraph position="1"> 1 The work reported in this paper was supported by the CEC as part of the project ET10/52.</Paragraph>
    <Paragraph position="2"> Reusability of grammatical resources is an important idea. Practically, it has obvious economic benefits in allowing grammars to be developed cheaply; for theoreticians it is important in allowing new formalisms to be tested out, quickly and in depth, by providing large-scale grammars. It is timely since substantial computational grammatical resources exist in various NLP systems, and large scale descriptions must be quickly produced if applications are to succeed. Meanwhile, in the CL community, there is a perceptible paradigm shift towards typed feature structure and constraint based systems and, if successful, migration allows such systems to be equipped with large bodies of descriptions drawn from existing resources. In principle, there are two approaches to achieving the reuse of grammatical and lexical resources. The first involves storing or developing resources in some theory neutral representation language, and is probably impossible in the current state of knowledge. In this paper, we focus on reusability through migration--the transfer of linguistic resources (grammatical and lexical descriptions) from one computational formalism into another (a target computational formalism). Migration can be completely manual (as when a linguist attempts to encode the analyses of a particular linguistic theory in some computationally interpreted formalism), semi-automatic or automatic. The starting resource can be a paper description or an implemented, runnable grammar.</Paragraph>
    <Paragraph position="3"> The literature on migration is thin, and practical experience is episodic at best. Shieber's work (e.g.</Paragraph>
    <Paragraph position="4"> \[Shieber 1988\]) is relevant, but this was concerned with relations between formalisms, rather than on migrating grammars per se. He studied the extent to which the formalisms of FUG, LFG and GPSG could be reduced to PATlt-II. Although these stud- null ies explored the expressivity of the different grammar formalisms (both in the strong mathematical and in the functional sense, i.e. not only which class of string sets can be described, but also what can be stated directly or naturally, as opposed to just being encoded somehow or other), the reduction was not intended to be the basis of migration of descriptions written in the formalisms. In this respect the work described below differs substantially from Shieber's work: our goal has to be to provide grammars in the target formalisms that can be directly used for further work by linguists, e.g. extending the coverage or restructuring the description to express new insights, etc.</Paragraph>
    <Paragraph position="5"> The idea of migration raises some general questions. null * What counts as successful migration? (e.g.</Paragraph>
    <Paragraph position="6"> what properties must the output/target description have and which of these properties are crucial for the reuse of the target description?).</Paragraph>
    <Paragraph position="7"> * How conceptually close must source and target be for migration to be successful? * How far is it possible to migrate descriptions expressed in a richer formalism (e.g. one that uses many expressive devices) into a poorer formalism? For example, which higher level expressive devices can be directly expressed in a 'lean' formalism, which ones might be compiled down into a lean formalism, and which ones are truly problematic? Are there any general hints that might be given for any particular class of higher level expressive devices? When should effort be put into finding encodings for richer devices, and when should the effort go into simply extending the target formalism? * How important is it that the source formalism have a well-defined semantics? How far can difficulties in this area be off-set if the grammars/descriptions are well-documented? * How does the existence of non-monotonic devices within a source formalism effect migratability, and is it possible to identify, for a given source grammar, uses of these mechanisms that are not truly non-monotonic in nature and could thus still be modelled inside a monotonic description? null * To what extent are macros and preprocessors a useful tool in a step-wise migration from source to target? We can provide some answers in advance of experimentation. In particular, successful migration implies that the target description must be practically usablc that is, understandable and extensible. There is one exception to this, which is where a large grammatical resource is migrated solely to test the (run-time) capabilities of a target formalism. Practically, usability implies at least I/O equivalence with the source grammar but should .ideally also imply the preservation of general properties such as modularity, compactness and user-friendliness of the specification. null This paper reports on and derives some lessons from a series of on-going experiments in which we have attempted automatic, semi-automatic and manual migration of implemented grammatical and lexical resources and of textbook specifications, written in various 'styles', to the ALEP formalism (see below). The choice of ALEP was motivated by the assumption the study would be most interesting if the target formalism is relatively mainstream. 2 As regards the 'style' and expressivity of source formalisms, we have carried out migrations from HPSG, which uses fully-typed feature structures and a variety of richly expressive devices, from ETS grammars and lexicons 3 (ETS is an untyped stratificational formalism essentially using rewrite rules for feature structures), and from an LFG grammar 4 (LFG is a standard untyped AVS formalism with some extensions, with a CFG backbone).</Paragraph>
  </Section>
class="xml-element"></Paper>