<?xml version="1.0" standalone="yes"?> <Paper uid="J87-3001"> <Title>PROCESSING DICTIONARY DEFINITIONS WITH PHRASAL PATTERN HIERARCHIES</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> INTRODUCTION </SectionTitle> <Paragraph position="0"> A major factor contributing to the lack of robustness of experimental natural language understanding systems is the small number of words in the experimental semantic dictionaries used by these systems. For example, &quot;missing vocabulary&quot; is cited as the most frequent cause of errors for the FRUMP system (DeJong 1979), a system designed to achieve a high degree of robustness. The problem does not disappear when dealing with limited discourse domains of the type encountered in database query and expert system interfaces. This is because of the large number of synonyms and specialized words that can occur, and because of the difficulty of delimiting discourse domains exactly.</Paragraph> <Paragraph position="1"> A different problem faced by designers of natural language understanding systems is how to provide for graceful failure of sentence analysis. There is thus the need to produce reasonable incomplete interpretations of sentences when complete analyses are not possible.</Paragraph> <Paragraph position="2"> *Author's present address: SRI International, Cambridge Computer Science Research Centre, Millers Yard, Mill Lane, Cambridge CB2 1RQ, England.</Paragraph> <Paragraph position="3"> This situation can occur because of gaps in the grammatical knowledge of the system or because the system is faced with extragrammatical input. 
This paper shows how a possible solution to this partial analysis problem can be applied to the vocabulary problem in the context of large machine-readable dictionaries.</Paragraph> <Paragraph position="4"> More specifically, we will see how word sense definitions from the Longman Dictionary of Contemporary English (Procter 1978; henceforth LDOCE) are processed by a phrasal analyser that applies successively more specific phrasal analysis rules. The aim of this analysis is to provide sufficient semantic information to enable a system carrying out a language processing application to cope with occurrences of unknown words.</Paragraph> <Paragraph position="5"> Both the problem of coping with new words and the problem of robust phrasal analysis can be thought of as instances of a more general natural language interpretation problem. This is the problem of coping with incomplete knowledge of language use: lexical knowledge in the first case and knowledge of phrasal structure in the second. The unavoidable incompleteness of the knowledge of language use available to a language processing system means that trying to achieve robust natural language processing involves developing effective mechanisms for dealing with this problem. The research reported in this paper is intended to be a contribution to this development effort. Copyright 1987 by the Association for Computational Linguistics. Permission to copy without fee all or part of this material is granted provided that the copies are not made for direct commercial advantage and the CL reference and this copyright notice are included on the first page. To copy otherwise, or to republish, requires a fee and/or specific permission.</Paragraph> <Paragraph position="6"> The next two sections will discuss the kind of output that may be produced from processing dictionary definitions and give examples of the results of processing LDOCE definitions produced by an implemented definition analyser. 
Some problems that were encountered are then discussed. Later sections motivate and explain the basic analysis algorithm, and then describe and illustrate details of analysis and structure building rules. Finally, some remarks are made about the performance of the current implementation and necessary further research.</Paragraph> </Section> </Paper>