<?xml version="1.0" standalone="yes"?> <Paper uid="P88-1002"> <Title>SENTENCE FRAGMENTS REGULAR STRUCTURES</Title> <Section position="3" start_page="0" end_page="7" type="intro"> <SectionTitle> 1. INTRODUCTION </SectionTitle> <Paragraph position="0"> In this paper we discuss the syntactic, semantic, and pragmatic analysis of fragmentary sentences in English. Our central claim is that these sentences, which have often been classified in the literature with truly erroneous input such as misspellings (see, for example, the work discussed in [Kwasny1980, Thompson1980, Kwasny1981, Sondheimer1983, Eastman1981, Jensen1983]), are regular structures which can be processed by adding a small number of rules to the grammar and other components of the system. The syntactic regularity of fragment structures has been demonstrated elsewhere, notably in [Marsh1983, Hirschman1983]; we will focus here upon the regularity of these structures across all levels of linguistic representation. Because the syntactic component regularizes these structures into a form almost indistinguishable from full assertions, the semantic and pragmatic components are able to interpret them with few or no extensions to existing mechanisms. This process of incremental regularization of fragment structures is possible only within a linguistically modular system. [Footnote: This work has been supported in part by DARPA under contract N00014-85-C-0012, administered by the Office of Naval Research; by National Science Foundation contract DCR-85-02205; and by Independent R&amp;D funding from System Development Corporation, now part of Unisys Corporation. Approved for public release, distribution unlimited.]
Furthermore, we claim that although fragments may occur more frequently in specialized sublanguages than in the standard grammar, they do not provide evidence that sublanguages are based on grammatical principles fundamentally different from those underlying standard languages, as claimed by [Fitzpatrick1986], for example.</Paragraph> <Paragraph position="1"> This paper is divided into five sections. The introductory section defines fragments and describes the scope of our work. In the second section, we consider certain properties of sentence fragments which motivate a modular approach.</Paragraph> <Paragraph position="2"> The third section describes our implementation of processing for fragments, to which each component of the system makes a distinct contribution. The fourth section describes the temporal analysis of fragments. Finally, the fifth section discusses the status of sublanguages characterized by these telegraphic constructions.</Paragraph> <Paragraph position="3"> We define fragments as regular structures which are distinguished from full assertions by a missing element or elements which are normally syntactically obligatory. We distinguish them from errors on the basis of their regularity and consistency of interpretation, and because they appear to be generated intentionally. We are not denying the existence of true errors, nor that processing sentences containing true errors may require sophisticated techniques and deep reasoning. Rather, we are saying that fragments are distinct from errors, and can be handled in a quite general fashion, with minimal extensions to normal processing. Because we base the definition of fragment on the absence of a syntactically obligatory element, noun phrases without articles are not considered to be fragmentary, since this omission is conditioned heavily by semantic factors such as the mass vs. count distinction.
However, we have implemented a pragmatically based treatment of noun phrases without determiners, which is briefly discussed in Section 3.</Paragraph> <Paragraph position="4"> Fragments, then, are defined here as elisions. We describe below the way in which these omissions are detected and subsequently 'filled in' by different modules of the system.</Paragraph> <Paragraph position="5"> The problem of processing fragmentary sentences has arisen in the context of a large-scale natural language processing research project conducted at UNISYS over the past five years [Palmer1986, Hirschman1986, Dowding1987, Dahl1987]. We have developed a portable, broad-coverage text-processing system, PUNDIT. Our initial applications have involved various message types, including: field engineering reports for maintenance of computers; Navy maintenance reports (Casualty Reports, or CASREPs) for starting air compressors; Navy intelligence reports (RAINFORM); trouble and failure reports (TFRs) from Navy Vessels; and recently we have examined several medical domains (radiology reports, comments fields from a DNA sequence database).</Paragraph> <Paragraph position="6"> At least half the sentences in these corpora are fragments; Table 1 below gives a summary of the fragment content of three domains, showing the percent of centers which are classified as fragments. (Centers comprise all sentence types: assertions, questions, fragments, and so forth.) The PUNDIT system is highly modular: it consists of a syntactic component, based on string grammar and restriction grammar [Sager1981, Hirschman1985]; a semantic component, based on inference-driven mapping, which decomposes predicating expressions into predicates and thematic roles [Palmer1983, Palmer1985]; and a pragmatics component which processes both referring expressions [Dahl1986], and temporal expressions [Passonneau1987, Passonneau1988].</Paragraph> </Section> </Paper>