<?xml version="1.0" standalone="yes"?>
<Paper uid="P88-1002">
  <Title>SENTENCE FRAGMENTS REGULAR STRUCTURES</Title>
  <Section position="4" start_page="7" end_page="7" type="metho">
    <SectionTitle>
1 Prolog UNDerstanding of Integrated Text
2. DIVISION OF LABOR AMONG SYNTAX, SEMANTICS, AND PRAGMATICS
</SectionTitle>
    <Paragraph position="0"> We argue here that sentence fragments provide a strong case for linguistically modular systems such as PUNDIT, because such elisions have distinct consequences at different levels of linguistic description. Our approach to fragments can be summarized by saying that syntax detects 'holes' in surface structure and creates dummy elements as placeholders for the missing elements; semantics and pragmatics interpret these placeholders at the appropriate point in sentence processing, utilizing the same mechanisms for fragments as for full assertions.</Paragraph>
    <Paragraph position="1"> Syntax regulates the holes. Fragment elisions cannot be accounted for in purely semantic/pragmatic terms. This is evidenced by the fact that there are syntactic restrictions on omissions; the acceptability of a sentence fragment hinges on grammatical factors rather than, e.g., how readily the elided material can be inferred from context. For example, the discourse Old house too small. *New one will be larger than _ was (where the elided object of than is understood to be old house) is ill-formed, whereas a comparable discourse First repairman ordered new air conditioner. Second repairman will install _ (where the elided object of install is understood to be air conditioner) is acceptable. In both cases above, the referent of the elided element is available from context, and yet only the second ellipsis sounds well-formed. Thus an appreciation of where such ellipses may occur is part of the linguistic knowledge of speakers of English and not simply a function of the contextual salience of elided elements. Since these restrictions concern structure rather than content, they would be difficult or impossible to state in a system such as a 'pure' semantic grammar which only recognized such omissions at the level of semantic/pragmatic representation.</Paragraph>
    <Paragraph position="2"> Furthermore, it matters to semantics and pragmatics HOW an argument is omitted. The syntactic component must tell semantics whether a verb argument is missing because the verb is used intransitively (as in The tiger was eating, where the patient argument is not specified) or because of a fragment ellipsis (as in Eaten by a tiger, where the patient argument is missing because the subject of a passive sentence has been elided). Only in the latter case does the missing argument of eat function as an antecedent subsequently in the discourse.</Paragraph>
    <Paragraph position="3"> Compare Eaten by a tiger. Had screamed bloody murder right before the attack (where the victim and the screamer are the same) vs. The tiger was eating. Had screamed bloody murder right before the attack (where it is difficult or impossible to get the reading in which the victim and the screamer are the same).</Paragraph>
    <Paragraph position="4"> Semantics and pragmatics fill the holes.</Paragraph>
    <Paragraph position="5"> In PUNDIT's treatment of fragments, each component contributes exactly what is appropriate to the specification of elided elements. Thus the syntax does not attempt to 'fill in' the holes that it discovers, unless that information is completely predictable given the structure at hand. Instead, it creates a dummy element. If the missing element is an elided subject, then the dummy element created by the syntactic component is assigned a referent by the pragmatics component.</Paragraph>
    <Paragraph position="6"> This referent is then assigned a thematic role by the semantics component like any other referent, and is subject to any selectional restrictions associated with the thematic role assigned to it. If the missing element is a verb, it is specified in either the syntactic or the semantic component, depending upon the fragment type.</Paragraph>
  </Section>
  <Section position="5" start_page="7" end_page="7" type="metho">
    <SectionTitle>
3. PROCESSING FRAGMENTS IN PUNDIT
</SectionTitle>
    <Paragraph position="0"> Although the initial PUNDIT system was designed to handle full, as opposed to fragmentary, sentences, one of the interesting results of our work is that it has required only very minor changes to the system to handle the basic fragment types introduced below. These included the addition of: 6 fragment BNF definitions to the grammar (a 5% increase in grammar size) and 7 context-sensitive restrictions (a 12% increase in the number of restrictions); one semantic rule for the interpretation of the dummy element inserted for missing verbs; a minor modification to the reference resolution mechanism to treat elided noun phrases like pronouns; and a small addition to the temporal processing mechanism to handle tenseless fragments. The small number of changes to the semantic and pragmatic components reflects the fact that these components are not 'aware' that they are interpreting fragmentary structures, because the regularization performed by the syntactic component renders them structurally indistinguishable from full assertions.</Paragraph>
    <Paragraph position="1"> Fragments present parsing problems because the ellipsis creates degenerate structures. For example, a sequence such as chest negative can be analyzed as a 'zero-copula' fragment meaning the chest X-ray is negative, or a noun compound like the negative of the chest. This is compounded by the lack of derivational and inflectional morphology in English, so that in many cases it may not be possible to distinguish a noun from a verb (repair parts) or a past tense from a past participle (decreased medication). Adding fragment definitions to the grammar (especially if determiner omission is also allowed) results in an explosion of ambiguity. This problem has been noted and discussed by Kwasny and Sondheimer [Kwasny1981]. Their solution to the problem is to suggest special relaxation techniques for the analysis of fragments. However, in keeping with our thesis that fragments are normal constructions, we have chosen the alternative of constraining the explosion of parses in two ways.</Paragraph>
    <Paragraph position="2"> The first is the addition of a control structure to implement a limited form of preference via 'unbacktrackable' or (xor). This binary operator tries its second argument only if its first argument does not lead to a parse. In the grammar, this is used to prefer "the most structured" alternative. That is, full assertions are preferred over fragments: if an assertion or other non-fragment parse is obtained, the parser does not try for a fragment parse.</Paragraph>
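The preference mechanism described above can be illustrated with a minimal Python sketch. This is not PUNDIT's Prolog implementation: the names xor, assertion, and fragment and the toy verb list are hypothetical, and each alternative is modeled as a function that returns a list of parses, so that committing to the first alternative simply means never calling the second.

```python
from typing import Callable, List

# A parser maps a token list to a (possibly empty) list of parse labels.
Parser = Callable[[List[str]], List[str]]

# Hypothetical toy lexicon; PUNDIT's grammar and lexicon are far richer.
VERBS = {"is", "was", "operates"}


def xor(first: Parser, second: Parser) -> Parser:
    """'Unbacktrackable' or: try `second` only if `first` yields no parse."""
    def combined(tokens: List[str]) -> List[str]:
        parses = first(tokens)
        if parses:
            # Commit to the first alternative; the second is never tried.
            return parses
        return second(tokens)
    return combined


def assertion(tokens: List[str]) -> List[str]:
    """Toy full-assertion parser: a non-empty subject followed by an overt verb."""
    for i, token in enumerate(tokens):
        if i != 0 and token in VERBS:
            return ["assertion(subject=%s, verb=%s)" % (" ".join(tokens[:i]), token)]
    return []


def fragment(tokens: List[str]) -> List[str]:
    """Toy fragment parser: accepts any sequence of two or more tokens."""
    if len(tokens) >= 2:
        return ["zerocopula(%s)" % " ".join(tokens)]
    return []


# Full assertions are preferred over fragments.
sentence = xor(assertion, fragment)

print(sentence("pump is inoperative".split()))
print(sentence("disk bad".split()))
```

In this sketch, pump is inoperative is parsed only as an assertion even though the toy fragment parser would also accept it, while disk bad falls through to the fragment alternative because no assertion parse is found.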
    <Paragraph position="3"> The second mechanism that helps to control generation of incorrect parses is selection. PUNDIT applies surface selectional constraints incrementally, as the parse is built up [Lang1988]. For example, the phrase air compressor would NOT be allowed as a zerocopula because the construction air is compressor would fail selection. 2
3.1. Fragment Types
The fragment types currently treated in PUNDIT include the following:
Zerocopula: a subject followed by a predicate, differing from a full clause only in the absence of a verb, as in Impellor blade tip erosion evident;
Tvo (tensed verb + object): a sentence missing its subject, as in Believe the coupling from diesel to sac lube oil pump to be reheated;
2 Similarly, the assertion parse for the title of this paper would fail selection (sentences don't fragment structures), permitting the zerocopula fragment parse.</Paragraph>
    <Paragraph position="4"> Nstg_frag: an isolated noun phrase (noun-string fragment), as in Loss of oil pump pressure.</Paragraph>
    <Paragraph position="5"> Objbe_frag (object-of-be fragment): an isolated complement appropriate to the main verb be, as in Unable to consistently start nr 1b gas turbine;
Predicate: an isolated complement appropriate to auxiliary be, as in Believed due to worn bushings, where the full sentence counterpart is Failure is believed (to be) due to worn bushings; 3
Obj_gap_fragment: a center (assertion, question, or other fragment structure) missing an obligatory noun phrase object, as in Field engineer will replace _.
Note that we do not address here the processing of response fragments which occur in interactive discourse, typically as responses to questions.</Paragraph>
    <Paragraph position="6"> The relative frequency of these six fragment types (expressed as a percentage of the total fragment content of each corpus) is summarized in a table. [Table not recoverable; its legend is given in note 4.] The processing of these basic fragment types can be summarized briefly as follows: a detailed surface parse tree is provided which represents the overt lexical content in its surface order. At this level, fragments bear very little resemblance to full assertions.
3 It is interesting to note that at least some of these types of fragments resemble non-fragmentary structures in other languages. Tvo fragments, for example, can be compared to zero-subject sentences in Japanese, zerocopulas resemble copular sentences in Arabic and Russian, and structures similar to predicate can be found in Cantonese (our thanks to K. Fu for the Cantonese data). This being the case, it is not surprising that analogous sentences in English can be processed without resorting to extragrammatical mechanisms.</Paragraph>
  </Section>
  <Section position="6" start_page="7" end_page="11" type="metho">
    <SectionTitle>
4 ZC = zerocopula; NF = nstg_fragment; PRED = predicate; OBJBE = objbe_frag; OBJ_GAP = obj_gap_fragment.
</SectionTitle>
    <Paragraph position="0"> But at the level of the Intermediate Syntactic Representation (ISR), which is a regularized representation of syntactic structure [Dahl1987], fragments are regularized to parallel full assertions by the use of dummy elements standing in for the missing subject or verb. The CONTENT of these dummy elements, however, is left unspecified in most cases, to be filled in by the semantic or pragmatic components of the system.</Paragraph>
    <Paragraph position="1"> Tvo. We consider first the tvo, a subjectless tensed clause such as Operates normally. This is parsed as a sequence of tensed verb and object: no subject is inferred at the level of surface structure. In the ISR, the missing subject is filled in by the dummy element elided. At the level of the ISR, then, the fragment operates normally differs from a full assertion such as It operates normally only by virtue of the element elided in place of an overt pronoun. The element elided is assigned a referent which subsequently fills a thematic role, exactly as if it were a pronoun; thus these two sentences get the same treatment from semantics and reference resolution [Dahl1986, Palmer1988].
Elided subjects in the domains we have looked at often refer to the writer of the report, so one strategy for interpreting them might be simply to assume that the filler of the elided subject is the writer of the report. This simple strategy is not sufficient in all cases. For example, in the CASREPS corpus we observe sequences such as the following, where the filler of the elided subject is provided by the previous sentence, and is clearly not the writer of the report.</Paragraph>
    <Paragraph position="2"> (1) Problem appears to be caused by one or more of two hydraulic valves. Requires disassembly and investigation.</Paragraph>
    <Paragraph position="3"> (2) Sac lube oil pressure decreases below alarm point approximately seven minutes after engagement. Believed due to worn bushings.</Paragraph>
    <Paragraph position="4"> Thus, it is necessary to be able to treat elided subjects as pronouns in order to handle these sentences.
The effect of an elided subject on subsequent focusing is the same as that of an overt pronoun. We demonstrated in section 2 that elided subjects, but not semantically implicit arguments, are expected foci (or forward-looking centers [Grosz1988]) for later sentences.</Paragraph>
    <Paragraph position="5"> The basic assumption underlying this treatment is that the pragmatic analysis for elided subjects should be as similar to that of pronouns as possible. One piece of supporting evidence for this assumption is that in many languages, such as Japanese [Gundel1980, Hinds1983, Kameyama1985], the functional equivalent of unstressed pronouns in English is a zero, or elided noun phrase. 5 If zeros in other languages can correspond to unstressed pronouns in English, then we hypothesize that zeros in a sublanguage of English can correspond functionally to pronouns in standard English. In addition, since processing of pronouns is independently motivated, it is a priori simpler to try to fit elision into the pronominal paradigm, if possible, than to create an entirely separate component for handling elision. Under this hypothesis, then, tvo fragments represent simply a realization of a grammatical strategy that is generally available to languages of the world. 6
Zerocopula. For a zerocopula (e.g., Disk bad), the surface parse tree rather than the ISR inserts a dummy verb, in order to enforce subcategorization constraints on the object. And in the ISR, this null verb is 'filled in' as the verb be. It is possible to fill in the verb at this level because no further semantic or pragmatic infor- ... the absence of tense from the former. If the null verb represents auxiliary be, then, like an overt auxiliary, it does not appear in the regularized form. Sac failing thus receives a regularization with fail as the main verb. Thus the null verb inserted in the syntax is treated in the ISR in a fashion exactly parallel to the treatment of overt occurrences of be.</Paragraph>
    <Paragraph position="6"> 5 Stressed pronouns in English correspond to overt pronouns in languages like Japanese, as discussed in [Gundel1980, Gundel1981], and [Dahl1982].</Paragraph>
    <Paragraph position="7"> 6 An interesting hypothesis, discussed by Gundel and Kameyama, is that the more topic prominent a language is, the more likely it is to have zero-NP's. Perhaps the fact that sublanguage messages are characterized by rigid, contextually supplied, topics contributes to the availability of the tvo fragment type in English.</Paragraph>
    <Paragraph position="8"> 7 In some restricted subdomains, however, other verbs may be omitted: for example, in certain radiology reports an omitted verb may be interpreted as show rather than be. Hence we find Chest Films 1/10 little change, paraphrasable as Chest films show little change.</Paragraph>
    <Paragraph position="9"> Nstg_frag. The syntactic parse tree for this fragment type contains no empty elements; it is a regular noun phrase, labeled as an nstg_frag. The ISR transforms it into a VSO sequence. This is done by treating it as the subject of an element empty_verb; in the semantic component, the subject of empty_verb is treated as the sole argument of a predicate existential(X). As a result, the nstg_frag Failure of sac and a synonymous assertion such as Failure of sac occurred are eventually mapped onto similar final representations by virtue of the temporal semantics of empty_verb and of the head of the noun phrase.</Paragraph>
    <Paragraph position="10"> Objbe_frag and predicate. These are isolated complements; the same devices described above are utilized in their processing. The surface parse tree of these fragment types contains no empty elements; as with zerocopula, the untensed verb be is inserted into the ISR; as with tvo, the dummy subject elided is also inserted in the ISR, to be filled in by reference resolution.</Paragraph>
    <Paragraph position="11"> Thus the simple adjective Inoperative will receive an ISR quite similar to that of She/he/it is inoperative.
Obj_gap_fragment. The final fragment type to be considered here is the elided noun phrase object. Such object elisions occur more widely in English in the context of instructions, as in Handle _ with care. Cookbooks are especially well-known repositories of elided objects, presumably because they are filled with instructions. Object elision also occurs in telegrammatic sublanguages generally, as in Took _ under fire with missiles from the Navy sighting messages. If these omissions occurred only in direct object position following the verb, one might argue for a lexical treatment; that is, such omissions could be treated as a lexical process of intransitivization rather than by explicitly representing gaps in the syntactic structure. However, noun phrase objects of prepositions may also be omitted, as in Fragile. Do not tamper with _. Thus we have chosen to represent such elisions with an explicit surface structure gap. This gap is permitted in most contexts where nstgo (noun phrase object) is found: as a direct object of the verb and as an object of a preposition. 8 In PUNDIT, elided objects are permitted only in a fragment type called obj_gap_fragment, which, like other fragment types, may be attempted only if an assertion parse has failed. Thus a sentence such as Pressure was decreasing rapidly will never be analyzed as containing an elided object, because there is a semantically acceptable assertion parse. In contrast, John was decreasing gradually will receive an elided object analysis, paraphrasable as John was decreasing IT gradually, because John is not an acceptable subject of intransitive decrease; only pressure or some equally measurable entity may be said to decrease. This selectional failure of the assertion parse permits the elided object analysis.
8 Note, however, that there are some restrictions on the occurrence of these elements. They seem not to occur in predicative objects, in double dative constructions, and, perhaps, in sentence adjuncts rather than arguments of the verb. (Thus compare Patient very ill. Do not operate on _ with Operating room closed on Sunday. Do not perform surgery on _.) One possibility is that these expressions can occur only where a definite pronoun would also be acceptable. In general, object gaps seem most acceptable where they represent an argument of a verb, either as direct object or as object of a preposition selected for by a verb.</Paragraph>
    <Paragraph position="12"> Our working hypothesis for determining the reference of object gaps is that they are, just like subject gaps, appropriately treated as pronouns.</Paragraph>
    <Paragraph position="13"> However, we have not as yet seen extensive data relevant to this hypothesis, and it remains subject to further testing.</Paragraph>
    <Paragraph position="14"> These, then, are the fragment types currently implemented in PUNDIT. As mentioned above, we do not consider noun phrases without determiners to be fragments, because it is not clear that the missing element is syntactically obligatory. The interpretation of these noun phrases is treated as a pragmatic problem. In the style of speech characteristic of the CASREPs, determiners are nearly always omitted. Their function must therefore be replaced by other mechanisms. One possible approach to this problem would be to have the system try to determine what the determiner would have been, had there been one, insert it, and then resume processing as if the determiner had been there all along. This approach was taken by [Marsh1981]. However, it was rejected here for two reasons. The first is that it was judged to be more error-prone than simply equipping the reference resolution component with the ability to handle noun phrases without determiners directly. 9 The second reason for not selecting this approach is that it would eliminate the distinction between noun phrases which originally had a determiner and those which did not. At some point in the development of the system it may become necessary to use this information.
9 This ability would be required in any case, should the system be extended to process languages which do not have articles.</Paragraph>
    <Paragraph position="15"> The basic approach currently taken is to assume that the noun phrase is definite, that is, it triggers a search through the discourse context for a previously mentioned referent. If the search succeeds, the noun phrase is assumed to refer to that entity. If the search fails, a new discourse entity is created.</Paragraph>
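The default treatment of determinerless noun phrases described above (search the discourse context, reuse a previously mentioned referent if one is found, otherwise create a new entity) can be sketched in Python as follows. This is only an illustration under stated assumptions: the classes Entity and DiscourseContext are hypothetical, and matching on an identical head noun is a deliberate simplification of whatever matching criteria the reference resolution component actually applies.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entity:
    ident: int   # hypothetical identifier for a discourse entity
    head: str    # head noun of the phrase that introduced the entity


@dataclass
class DiscourseContext:
    entities: List[Entity] = field(default_factory=list)

    def resolve_definite(self, head: str) -> Entity:
        """Treat a determinerless noun phrase as definite."""
        # Search previously mentioned entities, most recent first.
        for entity in reversed(self.entities):
            if entity.head == head:
                return entity
        # The search failed: create a new discourse entity.
        new_entity = Entity(ident=len(self.entities) + 1, head=head)
        self.entities.append(new_entity)
        return new_entity


context = DiscourseContext()
first_pump = context.resolve_definite("pump")
compressor = context.resolve_definite("compressor")
second_pump = context.resolve_definite("pump")

print(first_pump is second_pump)   # the second mention reuses the first entity
print(len(context.entities))       # only two entities were created
```

Searching the most recent entities first is an assumption of this sketch; the text states only that the context is searched for a previously mentioned referent.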
    <Paragraph position="16"> In summary, then, these fragment types are parsed 'as is' at the surface level; dummy elements are inserted into the ISR to bring fragments into close parallelism with full assertions.</Paragraph>
    <Paragraph position="17"> Because of the resulting structural similarity between these two sentence types, the semantic and pragmatic components can apply exactly the same interpretive processes to both fragments and assertions, using preexisting mechanisms to 'fill in' the holes detected by syntax.</Paragraph>
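As a rough illustration of the regularization summarized above, the following Python sketch maps each fragment type to a toy regularized form containing the dummy elements named in the text (elided, empty_verb, and an inserted be). The dictionary layout, the function name regularize, and its keyword parameters are assumptions made for this example and do not reproduce the actual ISR notation. The obj_gap_fragment type is omitted because its gap is already explicit in the surface structure, and the case where a null auxiliary be is dropped from the regularized form is not modeled.

```python
from typing import Dict, Optional

ELIDED = "elided"          # dummy subject, later resolved like a pronoun
EMPTY_VERB = "empty_verb"  # dummy verb for isolated noun phrases


def regularize(fragment_type: str, **parts: Optional[str]) -> Dict[str, Optional[str]]:
    """Return a toy regularized form with dummy elements inserted."""
    if fragment_type == "tvo":
        # Tensed verb + object: the missing subject becomes `elided`.
        return {"verb": parts["verb"], "subject": ELIDED, "object": parts.get("object")}
    if fragment_type == "zerocopula":
        # Subject + predicate: the null verb is filled in as `be`.
        return {"verb": "be", "subject": parts["subject"], "complement": parts["complement"]}
    if fragment_type == "nstg_frag":
        # Isolated noun phrase: it becomes the subject of `empty_verb`.
        return {"verb": EMPTY_VERB, "subject": parts["noun_phrase"]}
    if fragment_type in ("objbe_frag", "predicate"):
        # Isolated complement: both `be` and the `elided` subject are inserted.
        return {"verb": "be", "subject": ELIDED, "complement": parts["complement"]}
    raise ValueError("unsupported fragment type: %s" % fragment_type)


print(regularize("tvo", verb="operate", object=None))
print(regularize("zerocopula", subject="disk", complement="bad"))
print(regularize("nstg_frag", noun_phrase="failure of sac"))
print(regularize("objbe_frag", complement="inoperative"))
```

Once every fragment has a verb and a subject slot in this way, later components can treat it like a full assertion, which is the point the summary makes.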
  </Section>
  <Section position="7" start_page="11" end_page="12" type="metho">
    <SectionTitle>
4. TEMPORAL ANALYSIS OF FRAGMENTS
</SectionTitle>
    <Paragraph position="0"> Temporal processing of fragmentary sentences further supports the efficacy of a modular approach to the analysis of these strings. 10 In PUNDIT's current message domains, a single assumption leads to assignment of present or past tense in untensed fragments, depending on the aspectual properties of the fragment. 11 This assumption is that the messages report on actual situations which are of present relevance. Consequently, the default tense assignment is present unless this prevents assigning an actual time. 12
11 Since the tvo fragment is tensed, its input to the time component is indistinguishable from that of a full sentence.
12 PUNDIT does not currently take full advantage of modifier information that could indicate whether a situation has real time associated with it (e.g., potential sac failure), or whether a situation is past or present (e.g., sac failure yesterday; pump now operating normally).</Paragraph>
    <Paragraph position="1"> For sentences having progressive grammatical aspect or stative lexical aspect, the assignment of present tense always permits interpreting a situation as having an actual time [Passonneau1987]. Thus, a present tense reading is always assigned to an untensed progressive fragment, such as pressure decreasing; or an untensed zerocopula with a non-participial complement, such as pump inoperative.</Paragraph>
    <Paragraph position="2"> A non-progressive zerocopula fragment containing a cognitive state verb, as in failure believed due to worn bushings, is assigned a present tense reading. However, if the lexical verb has non-stative aspect, 13 e.g., tests conducted (process) or new sac received (transition event), then assignment of present tense conflicts with the assumption that the mentioned situation has occurred or is occurring. The simple present tense form of verbs in this class is given a habitual or iterative reading. That is, the corresponding full sentences in the present, tests are conducted and new sac is received, are interpreted as referring to types of situations that tend to occur, rather than to situations that have occurred. In order to permit actual temporal reference, these fragments are assigned a past tense reading.</Paragraph>
    <Paragraph position="3"> Nstg_frag represents another case where present tense may conflict with lexical aspect. If an nstg_frag refers to a non-stative situation, the situation is interpreted as having an actual past time. This can be the case if the head of the noun phrase is a nominalization, and is derived from a verb in the process or transition event aspectual class. Thus, investigation of problem would be interpreted as an actual process which took place prior to the report time, and similarly, sac failure would be interpreted as a past transition event. On the other hand, an nstg_frag which refers to a stative situation, as in inoperative pump, is assigned present tense.</Paragraph>
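The default tense rules for untensed fragments described in this section reduce to a small decision procedure, sketched below in Python. The function name assign_tense, the aspect labels, and the boolean progressive flag are hypothetical conveniences; the sketch assumes that the aspectual class of the fragment has already been determined, and it ignores tensed (tvo) fragments, which carry their own tense.

```python
ASPECTS = ("state", "process", "transition_event")


def assign_tense(aspect: str, progressive: bool = False) -> str:
    """Default tense for an untensed fragment (toy version of the rules above)."""
    if aspect not in ASPECTS:
        raise ValueError("unknown aspectual class: %s" % aspect)
    # Progressive or stative fragments can be read as actual present situations.
    if progressive or aspect == "state":
        return "present"
    # A non-progressive process or transition event would get a habitual
    # reading in the present, so past tense is assigned instead.
    return "past"


examples = {
    "pressure decreasing": assign_tense("process", progressive=True),
    "pump inoperative": assign_tense("state"),
    "failure believed due to worn bushings": assign_tense("state"),
    "tests conducted": assign_tense("process"),
    "new sac received": assign_tense("transition_event"),
    "investigation of problem": assign_tense("process"),
    "inoperative pump": assign_tense("state"),
}

for text, tense in examples.items():
    print("%s: %s" % (text, tense))
```

The examples are the ones used in the text; only the classification of each example into an aspect label is supplied by hand here.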
  </Section>
  <Section position="8" start_page="12" end_page="13" type="metho">
    <SectionTitle>
5. RELATION OF FRAGMENTS TO THE LARGER GRAMMAR
</SectionTitle>
    <Paragraph position="0"> An important finding which has emerged from the investigation of sentence fragments in a variety of sublanguage domains is that the linguistic properties of these constructions are largely domain-independent. Assuming that these sentence fragments remain constant across different sublanguages, what is their relationship to the language at large? As indicated above, we believe that fragments should not be regarded as ERRORS, a position taken also by [Lehrberger1982, Marsh1983], and others. Fragments do occur with disproportionate frequency in some domains, such as field reports of mechanical failure or newspaper headlines. However, despite this frequency variation, it appears that the parser's preferences remain constant across domains.
13 Mourelatos' class of occurrences [Mourelatos1981].</Paragraph>
    <Paragraph position="1"> Therefore, even in telegraphic domains the preference is for a full assertion parse, if one is available. As discussed above, we have enforced this preference by means of the xor ('unbacktrackable' or) connective. Thus despite the greater frequency of fragments we do not require either a grammar or a preference structure different from that of standard English in order to apply the stable system grammar to these telegraphic messages.
Others have argued against this view of the relationship between sublanguages and the language at large. For example, Fitzpatrick et al.</Paragraph>
    <Paragraph position="2"> [Fitzpatrick1986] propose that fragments are subject to a constraint quite unlike any found in English generally. Their Transitivity Constraint (TC) requires that if a verb occurs as a transitive in a sublanguage with fragmentary messages, then it may not also occur in an intransitive form, even if the verb is ambiguous in the language at large. This constraint, they argue, provides evidence that sublanguage grammars have "a life of their own", since there is no such principle governing standard languages. The TC would also cut down on ambiguities arising out of object deletion, since a verb would be permitted to occur transitively or intransitively in a given subdomain, but not both.</Paragraph>
    <Paragraph position="3"> As the authors recognize, this hypothesis runs into difficulty in the face of verbs such as resume (we find both Sac resumed normal operation and Noise has resumed), since resume occurs both transitively and intransitively in these cases.</Paragraph>
    <Paragraph position="4"> For these cases, the authors are forced to appeal to a problematic analysis of resume as syntactically transitive in both cases; they analyze The noise has resumed, for example, as deriving from a structure of the form [Someone/something] resumed the noise; that is, it is analyzed as underlyingly transitive. Other transitivity alternations which present potential counter-examples are treated as syntactic gapping processes. In fact, with these two mechanisms available, it is not clear what COULD provide a counter-example to the TC. The effect of all this insulation is to render the Transitivity Constraint vacuous. If all transitive/intransitive alternations can be treated as underlyingly transitive, then of course there will be no counter-examples to the transitivity constraint. Therefore we see no evidence that sublanguage grammars are subject to additional constraints of this nature.</Paragraph>
    <Paragraph position="5"> In summary, this supports the view that fragmentary constructions in English are regular, grammatically constrained ellipses differing minimally from the standard language, rather than ill-formed, unpredictable sublanguage exotica. Within a modular system such as PUNDIT this regularity can be captured with the limited augmentations of the grammar described above.</Paragraph>
  </Section>
</Paper>