<?xml version="1.0" standalone="yes"?> <Paper uid="P01-1038"> <Title>Generation of VP Ellipsis: A Corpus-Based Approach</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> While there is a vast theoretical and computational literature on the interpretation of elliptical forms, there has been little study of the generation of ellipsis.[1] In this paper, we focus on Verb Phrase Ellipsis (VPE), in which a verb phrase is elided, with an auxiliary verb left in its place. Here is an example: (1) In 1980, 18% of federal prosecutions concluded at trial; in 1987, only 9% did.</Paragraph> <Paragraph position="1"> Here, the verb phrase "concluded at trial" is omitted, and the auxiliary "did" appears in its place. [1] We would like to thank Marilyn Walker, three reviewers for a previous submission, and three reviewers for this submission for helpful comments.</Paragraph> <Paragraph position="2"> The basic condition on VPE is clear from the literature:[2] there must be an antecedent VP that is identical in meaning to the elided VP. Furthermore, it seems clear that the antecedent must be sufficiently close to the ellipsis site (in a sense to be made precise).</Paragraph> <Paragraph position="3"> This basic condition provides the beginning of an account of the generation of VPE. However, there is more to be said, as is shown by the following examples: (2) Ernst & Young said Eastern's plan would miss projections by $100 million. Goldman said Eastern would miss the same mark by at least $120 million.</Paragraph> <Paragraph position="4"> In this example, the italicized VP could be elided, since it has a nearby antecedent (in bold) with the same meaning.
Indeed, the antecedent in this example is closer than in the following example, in which ellipsis does occur: (3) In particular Mr Coxon says businesses are paying out a smaller percentage of their profits and cash flow in the form of dividends than they have VPE historically.</Paragraph> <Paragraph position="5"> In this paper, we identify factors which govern the decision to elide VPs. We examine a corpus of positive and negative examples; i.e., examples in which VPs were or were not elided. We find that, indeed, the distance between ellipsis site and antecedent is correlated with the decision to elide, as are the syntactic relation between antecedent and ellipsis site, and the presence or absence of adjuncts. [2] The classic study is (Sag, 1976); for more recent work, see, e.g., (Dalrymple et al., 1991; Kehler, 1993; Fiengo and May, 1994; Hardt, 1999).</Paragraph> <Paragraph position="6"> Building on these results, we use machine learning techniques to examine where in the generation architecture a trainable algorithm for VP ellipsis should be located. We show that the best performance (error rate of 7.5%) is achieved when the trainable module is located after the realizer and has access to surface-oriented features. In what follows, we first describe our corpus of negative and positive examples. Next, we describe the factors we coded for. Then we give the results of the statistical analysis of those factors, and finally we describe several algorithms for the generation of VPE which we automatically acquired from the corpus.</Paragraph> </Section> </Paper>