<?xml version="1.0" standalone="yes"?> <Paper uid="J00-4004"> <Title>Learning Methods to Combine Linguistic Indicators: Improving Aspectual Classification and Revealing Linguistic Insights</Title> <Section position="2" start_page="0" end_page="0" type="abstr"> <SectionTitle> 1. Introduction </SectionTitle> <Paragraph position="0"> Aspectual classification maps clauses (e.g., simple sentences) to a small set of categories in order to reason about time. For example, events, such as, You called your father, are distinguished from states, such as, You resemble your father. The ability to distinguish stative clauses from event clauses is a fundamental component of natural language understanding. These two high-level categories correspond to fundamental distinctions in many domains, including the distinctions between diagnosis and procedure in the medical domain, and between analysis and activity in the financial domain.</Paragraph> <Paragraph position="1"> Stativity is the first high-level distinction made when defining the aspectual class of a clause. Events are further distinguished according to completedness (sometimes called telicity), which determines whether an event reaches a culmination or completion point at which a new state is introduced. For example, I made afire is culminated, since a new state is introduced--something is made, whereas I gazed at the sunset is nonculminated.</Paragraph> <Paragraph position="2"> * Computer Science Dept., 1214 Amsterdam Ave., New York NY 10027. E-mail: evs@cs.columbia.edu t Computer Science Dept., 1214 Amsterdam Ave., New York, NY 10027. E-mail: kathy@cs.columbia.edu @ 2001 Association for Computational Linguistics Computational Linguistics Volume 26, Number 4 Aspectual classification is necessary for interpreting temporal modifiers and assessing temporal entailments (Moens and Steedman 1988; Dorr 1992; Klavans 1994) and is therefore a required component for applications that perform certain natural language interpretation, generation, summarization, information retrieval, and machine translation tasks. Each of these applications requires the ability to reason about time.</Paragraph> <Paragraph position="3"> A verb's aspectual category can be predicted by co-occurrence frequencies between the verb and linguistic phenomena such as the progressive tense and certain temporal modifiers (Klavans and Chodorow 1992). These frequency measures are called linguistic indicators. The choice of indicators is guided by linguistic insights that describe how the aspectual category of a clause is constrained by the presence of these modifiers. For example, an event can be placed in the progressive, as in, She was jogging, but many stative clauses cannot, e.g., *She was resembling her father (Dowty 1979). One advantage of linguistic indicators is that they can be measured automatically.</Paragraph> <Paragraph position="4"> However, individual linguistic indicators are predictively incomplete, and are therefore insufficient when used in isolation. As demonstrated empirically in this article, individual linguistic indicators suffer from limited classification performance due to several linguistic and pragmatic factors. For example, some indicators were not motivated by specific linguistic insights. 
However, linguistic indicators have the potential to interact and supplement one another, so it would be beneficial to combine them systematically.</Paragraph>
<Paragraph position="5"> In this article, we compare three supervised machine learning methods for combining multiple linguistic indicators for aspectual classification: decision trees, genetic programming, and logistic regression. A set of 14 indicators is combined, first for classification according to stativity, and then for classification according to completedness. This approach realizes the potential of linguistic indicators, improving classification performance over a baseline method for both tasks with minimal overfitting, as evaluated over an unrestricted set of verbs occurring in two corpora. This demonstrates the effectiveness of these linguistic indicators and thus provides a much-needed full-scale, expandable method for automatic aspectual classification.</Paragraph>
<Paragraph position="6"> The results of learning are linguistically viable in two respects. First, learning automatically produces models that are specialized for different aspectual distinctions; the same set of 14 indicators is combined in different ways according to which classification problem is targeted. Second, inspecting the models resulting from learning revealed linguistic insights that are relevant to aspectual classification.</Paragraph>
<Paragraph position="7"> We also evaluate an unsupervised method for this task. This method uses co-occurrence statistics to group verbs according to meaning. Although this method groups verbs generically and is not designed to distinguish according to aspectual class in particular, we show that the results do distinguish verbs according to stativity. The next two sections of this article describe aspectual classification and linguistic indicators. Section 4 describes the three supervised learning methods employed to combine linguistic indicators for aspectual classification. Section 5 gives results in terms of classification performance and resulting linguistic insights, comparing these results, across classification tasks, to baseline methods. Section 6 describes experiments with an unsupervised approach. Finally, Sections 7, 8, and 9 survey related work, describe future work, and present conclusions.</Paragraph>
</Section>
</Paper>
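As a rough picture of the combination step described above, the sketch below treats each verb as a 14-dimensional vector of indicator values and trains two of the three learners mentioned in the article (decision trees and logistic regression; genetic programming is omitted) using scikit-learn. The data here are random placeholders, so accuracy will hover near chance; in the actual experiments the features are corpus-derived indicator values and the labels encode stativity or completedness. This is an assumed setup, not the authors' code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Hypothetical training data: one row per verb, one column per linguistic
# indicator (14 in the article), e.g. frequency in the progressive,
# frequency with "not", frequency with duration in-PP, and so on.
rng = np.random.default_rng(0)
X = rng.random((200, 14))            # placeholder indicator values in [0, 1)
y = rng.integers(0, 2, size=200)     # placeholder labels: 1 = state, 0 = event

for name, model in [
    ("decision tree", DecisionTreeClassifier(max_depth=4)),
    ("logistic regression", LogisticRegression(max_iter=1000)),
]:
    scores = cross_val_score(model, X, y, cv=5)  # 5-fold cross-validation
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```

Evaluating both learners over the same indicator vectors mirrors the article's comparison across learning methods, with cross-validation standing in for its corpus-based evaluation protocol.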