<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-2606">
  <Title>Extended Lexical-Semantic Classification of English Verbs</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Lexical-semantic classes which aim to capture the close relationship between the syntax and semantics of verbs have attracted considerable interest in both linguistics and computational linguistics (e.g. Pinker, 1989; Jackendoff, 1990; Levin, 1993; Dorr, 1997; Dang et al., 1998; Merlo and Stevenson, 2001). Such classes can capture generalizations over a range of (cross-)linguistic properties, and can therefore be used as a valuable means of reducing redundancy in the lexicon and for filling gaps in lexical knowledge.</Paragraph>
    <Paragraph position="1"> Verb classes have proved useful in various (multilingual) natural language processing (NLP) tasks and applications, such as computational lexicography (Kipper et al., 2000), language generation (Stede, 1998), machine translation (Dorr, 1997), word sense disambiguation (Prescher et al., 2000), document classification (Klavans and Kan, 1998), and subcategorization acquisition (Korhonen, 2002). Fundamentally, such classes define the mapping from surface realization of arguments to predicate-argument structure and are therefore a critical component of any NLP system which needs to recover predicate-argument structure. In many operational contexts, lexical information must be acquired from small application- and/or domain-specific corpora. The predictive power of classes can help compensate for lack of sufficient data fully exemplifying the behaviour of relevant words, through use of back-off smoothing or similar techniques. Although several classifications are now available for English verbs (e.g. Pinker, 1989; Jackendoff, 1990; Levin, 1993), they are all restricted to certain class types and many of them have few exemplars with each class. For example, the largest and the most widely deployed classification in English, Levin's (1993) taxonomy, mainly deals with verbs taking noun and prepositional phrase complements, and does not provide large numbers of exemplars of the classes. The fact that no comprehensive classification is available limits the usefulness of the classes for practical NLP.</Paragraph>
    <Paragraph position="2"> Some experiments have been reported recently which indicate that it should be possible, in the future, to automatically supplement extant classifications with novel verb classes and member verbs from corpus data (Brew and Schulte im Walde, 2002; Merlo and Stevenson, 2001; Korhonen et al., 2003). While the automatic approach will avoid the expensive overhead of manual classification, the very development of the technology capable of large-scale automatic classification will require access to a target classification and gold standard exemplification of it more extensive than that available currently.</Paragraph>
    <Paragraph position="3"> In this paper, we address these problems by introducing a substantial extension to Levin's classification which incorporates 57 novel classes for verbs not covered (comprehensively) by Levin. These classes, many of them drawn initially from linguistic resources, were created semi-automatically by looking for diathesis alternations shared by candidate verbs. 106 new alternations not covered by Levin were identified for this work. We demonstrate the usefulness of our novel classes by using them to improve the performance of our extant subcategorization acquisition system. We show that the resulting extended classification has good coverage over the English verb lexicon. Discussion is provided on how the classification could be further refined and extended in the future, and integrated as part of Levin's extant taxonomy.</Paragraph>
    <Paragraph position="4"> We discuss Levin's classification and its extensions in section 2. Section 3 describes the process of creating the new verb classes. Section 4 reports the experimental evaluation and section 5 discusses further work. Conclusions are drawn in section 6.</Paragraph>
  </Section>
</Paper>