<?xml version="1.0" standalone="yes"?>
<Paper uid="P03-1009">
  <Title>Clustering Polysemic Subcategorization Frame Distributions Semantically</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
2 Semantic Verb Classes and Test Verbs
</SectionTitle>
    <Paragraph position="0"> Levin's taxonomy of verbs and their classes (Levin, 1993) is the largest syntactic-semantic verb classification in English and is widely employed in the evaluation of automatic classifications. It classifies 3,024 verbs (4,186 senses) into 48 broad and 192 fine-grained classes. Although extensive, it is not exhaustive: because it concentrates primarily on verbs taking NP and PP complements and does not provide a comprehensive set of senses for verbs, it is not suitable for evaluating polysemic classifications. We employed as a gold standard a substantially extended version of Levin's classification constructed by Korhonen (2003). This incorporates Levin's classes, 26 additional classes by Dorr (1997) (these classes are incorporated in the 'LCS database', http://www.umiacs.umd.edu/~bonnie/verbs-English.lcs), and 57 new classes for verb types not covered comprehensively by Levin or Dorr.</Paragraph>
    <Paragraph position="1"> We chose 110 test verbs from this gold standard: 78 polysemic and 32 monosemous. Some low-frequency verbs were included to investigate the effect of sparse data on clustering performance. To ensure that our gold standard covers all (or most) senses of these verbs, we consulted WordNet (Miller, 1990) and assigned all the WordNet senses of the verbs to gold standard classes. Two versions of the gold standard were created: monosemous and polysemic. The monosemous version lists only a single sense for each test verb, the one corresponding to its predominant (most frequent) sense in WordNet. The polysemic version provides a comprehensive list of senses for each verb. The test verbs and their classes are shown in Table 1. The classes are indicated by number codes from the classifications of Levin, Dorr (the classes starting with 0) and Korhonen (the classes starting with A). The predominant sense is indicated by bold font.</Paragraph>
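    <Paragraph position="2"> The relation between the two gold standard versions can be illustrated with a minimal sketch. It is not the procedure actually used; it assumes that each verb's classes are stored as a list whose first entry is the class of the predominant WordNet sense, and it treats a verb as polysemic when it has more than one class. The verb names, class codes, and function names below are hypothetical placeholders and are not taken from Table 1.

```python
from typing import Dict, List, Tuple

# Hypothetical toy gold standard: verb -> class codes.
# Assumption: the first code is the class of the predominant WordNet sense.
# Verbs and codes are invented placeholders, not entries from Table 1.
POLYSEMIC_GOLD: Dict[str, List[str]] = {
    "verb_a": ["9.1", "A54"],
    "verb_b": ["037"],
    "verb_c": ["51.3.2", "11.4", "010"],
}


def monosemous_gold(polysemic: Dict[str, List[str]]) -> Dict[str, str]:
    """Keep only the class of the predominant (first-listed) sense per verb."""
    return {verb: classes[0] for verb, classes in polysemic.items()}


def split_by_polysemy(polysemic: Dict[str, List[str]]) -> Tuple[List[str], List[str]]:
    """Return (polysemic verbs, monosemous verbs), each sorted by name.

    Assumption: a verb counts as polysemic if it has more than one class.
    """
    poly = sorted(v for v, c in polysemic.items() if len(c) > 1)
    mono = sorted(v for v, c in polysemic.items() if len(c) == 1)
    return poly, mono
```

Under these assumptions the monosemous version is derivable from the polysemic one, so only the polysemic mapping needs to be maintained.</Paragraph>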
  </Section>
</Paper>