<?xml version="1.0" standalone="yes"?> <Paper uid="J83-3001"> <Title>Recovery Strategies for Parsing Extragrammatical Language</Title> <Section position="2" start_page="0" end_page="0" type="abstr"> <SectionTitle> 1. Introduction </SectionTitle> <Paragraph position="0"> Any robust natural language interface must be capable of processing input utterances that deviate from its grammatical and semantic expectations. Many researchers have made this observation and have taken initial steps towards coverage of certain classes of extragrammatical constructions. Since robust parsers must deal primarily with input that does meet their expectations, the various efforts at coping with extragrammaticality have generally been structured as extensions to existing parsing methods. Probably the most popular approach has been to extend syntactically-oriented parsing techniques employing Augmented Transition Networks (ATNs) (Kwasny and Sondheimer 1981, Weischedel and Sondheimer 1984, Weischedel and Black 1980, Woods et al. 1976). Other researchers have attempted to deal with ungrammatical input through network-based semantic grammar techniques (Hendrix 1977), through extensions to pattern matching parsing in which partial pattern matching is allowed (Hayes and Mouradian 1981), through conceptual case frame instantiation (DeJong 1979, Schank, Lebowitz, and Birnbaum 1980), and through approaches involving multiple cooperating parsing strategies (Carbonell and Hayes 1984, Carbonell et al. 1983, Hayes and Carbonell 1981).</Paragraph> <Paragraph position="2"> Given the background of existing work, this paper focuses on three major objectives: 1. 
to create a taxonomy of possible grammatical deviations covering a broad range of extragrammaticalities, including some lexical and discourse phenomena (for example, novel words and dialogue-level ellipsis) that can be handled by the same mechanisms that detect and process true grammatical errors; 2. to outline strategies for processing many of these deviations (some of these strategies have been presented in our earlier work, some are similar to strategies proposed by other researchers, and some have never been analyzed before); 3. to assess how easily these strategies can be employed in conjunction with several of the existing approaches to parsing ungrammatical input, and to examine why mismatches arise.</Paragraph> <Paragraph position="3"> The overall result should be a synthesis of different parse-recovery strategies organized by the grammatical phenomena they address (or violate), an evaluation of how well the strategies integrate with existing approaches to parsing extragrammatical input, and a set of characteristics desirable in any parsing process dealing with extragrammatical input. We hope this will aid researchers designing robust natural language interfaces in two ways: 1. by providing a tool chest of computationally effective approaches to cope with extragrammaticality; 2. by assisting in the selection of a basic parsing methodology in which to embed these recovery techniques. Copyright 1984 by the Association for Computational Linguistics. Permission to copy without fee all or part of this material is granted provided that the copies are not made for direct commercial advantage and the Journal reference and this copyright notice are included on the first page. To copy otherwise, or to republish, requires a fee and/or specific permission. 0362-613X/83/030123-24$03.00. American Journal of Computational Linguistics, Volume 9, Numbers 3-4, July-December 1983. Jaime G. Carbonell and Philip J. Hayes, Recovery Strategies for Parsing Extragrammatical Language.</Paragraph>
<Paragraph position="4"> In assessing the degree of compatibility between recovery techniques and various approaches to parsing, we avoid the issue of whether a given recovery technique can be used with a specific approach to parsing. The answer to such a question is almost always affirmative. Instead, we are concerned with how naturally the recovery strategies fit with the various parsing approaches. In particular, we consider the computational tractability of the recovery strategies and how easily they can obtain the information they need to operate in the context of different parsing approaches.</Paragraph> <Paragraph position="5"> The need for robust parsing is greatest for interactive natural language interfaces that have to cope with language produced spontaneously by their users. Such interfaces typically operate in the context of a well-defined, but restricted, domain in which strong semantic constraints are available. In contrast, text processing often operates in domains that are semantically much more open-ended. However, the need to deal with extragrammaticality is much less pronounced in text processing, since texts are normally carefully prepared and edited, eliminating most grammatical errors and suppressing many dialogue phenomena that produce fragmentary utterances. Consequently, we shall emphasize recovery techniques that exploit and depend on strong semantic constraints. In some cases, it is unclear whether the techniques we discuss will scale up properly to unrestricted text or discourse, but even where they may not, we anticipate that their use in the restricted situation will provide insights into the more general problem.</Paragraph> <Paragraph position="6"> Before proceeding with our discussion, the term extragrammaticality requires clarification. 
Extragrammaticalities include patently ungrammatical constructions, which may nevertheless be semantically comprehensible, as well as lexical difficulties (for example, misspellings), violations of semantic constraints, utterances that may be grammatically acceptable but are beyond the syntactic coverage of the system, ellipsed fragments and other dialogue phenomena, and any other difficulties that may arise in parsing individual utterances. An extragrammaticality is thus defined with respect to the capabilities of a particular system, rather than with respect to an absolute external competence model of the ideal speaker.</Paragraph> <Paragraph position="7"> Extragrammaticality may arise at various levels: lexical, sentential, and dialogue. The following sections examine each of these levels in turn, classifying the extragrammaticalities that can occur, and discussing recovery strategies. At the end of each section, we consider how well the various recovery strategies would fit with or be supported by various approaches to parsing. A final section discusses some experimental robust parsers that we have implemented. Our experience with these parsers forms the basis for many of the observations we offer throughout the paper.</Paragraph> <Paragraph position="8"> We also discuss some more recent work on integrating many of the recovery strategies considered earlier into a single robust multi-strategy parser for restricted domain natural language interpretation.</Paragraph> </Section> </Paper>