<?xml version="1.0" standalone="yes"?>
<Paper uid="A97-1052">
  <Title>Corpus Data TP FP FN</Title>
  <Section position="6" start_page="361" end_page="361" type="relat">
    <SectionTitle>
4 Related Work
</SectionTitle>
    <Paragraph position="0"> Brent's (1993) approach to acquiring subcategorization is based on a philosophy of exploiting only unambiguous and determinate information in unanalysed corpora. He defines a number of lexical patterns (mostly involving closed class items, such as pronouns) which reliably cue one of five subcategorization classes. Brent does not report comprehensive results, but for one class, sentential complement verbs, he achieves 96% precision and 76% recall at classifying individual tokens of 63 distinct verbs as exemplars or non-exemplars of this class. He does not attempt to rank different classes for a given verb.</Paragraph>
    <Paragraph position="1"> Ushioda et al. (1993) utilise a PoS-tagged corpus and a finite-state NP parser to recognize and calculate the relative frequency of six subcategorization classes. They report an accuracy rate of 83% (254 errors) at classifying 1565 classifiable tokens of 33 distinct verbs in running text and suggest that incorrect noun phrase boundary detection accounts for the majority of errors. They report that for 32 verbs their system correctly predicts the most frequent class, and for 30 verbs it correctly predicts the second most frequent class, if there was one. By contrast, our system's rankings include all classes for each verb, drawn from a total of 160 classes, and are on average 81.4% correct.</Paragraph>
    <Paragraph position="2"> Manning (1993) conducts a larger experiment, also using a PoS-tagged corpus and a finite-state NP parser, attempting to recognize sixteen distinct complementation patterns. He reports that for a test sample of 200 tokens of 40 verbs in running text, the acquired subcategorization dictionary listed the appropriate entry for 163 cases, giving a token recall of 82% (as compared with 80.9% in our experiment).</Paragraph>
    <Paragraph position="3"> He also reports a comparison of acquired entries for the verbs to the entries given in the Oxford Advanced Learner's Dictionary of Current English (Hornby, 1989), on which his system achieves a precision of 90% and a recall of 43%. His system averages 3.48 subentries (maximum 10), less than half the number produced in our experiment. It is not clear what level of evidence the performance of Manning's system is based on, but the system was applied to 4.1 million words of text (cf. our 1.2 million words) and the verbs are all common, so it is likely that considerably more exemplars of each verb were available.</Paragraph>
  </Section>
</Paper>