<?xml version="1.0" standalone="yes"?>
<Paper uid="W93-0306">
  <Title>NPtool, a detector of English noun phrases</Title>
  <Section position="7" start_page="54" end_page="55" type="evalu">
    <SectionTitle>
6 Performance of NPtool
</SectionTitle>
    <Paragraph position="0"> Various kinds of metrics can be proposed for the evaluation of a noun phrase extractor; our main metrics are recall and precision, defined as follows [16]:
      * Recall: the ratio 'retrieved intended NPs' [17] / 'all intended NPs'
      * Precision: the ratio 'retrieved intended NPs' / 'all retrieved NPs'
    [14] This figure also covers errors due to previous stages of analysis.</Paragraph>
    <Paragraph position="1"> [15] I.e. a parser which does not contain the mechanism for penalising or favouring noun phrase analyses; see Section 4.3 above.</Paragraph>
    <Paragraph position="2"> [16] This definition also agrees with that used in Rausch et al. [1992].</Paragraph>
    <Paragraph position="3"> [17] An 'intended NP' is the longest non-overlapping match of the search query given in the extraction phase.
    To paraphrase, a recall of less than 100% indicates that the system missed some of the desired noun phrases, while a precision of less than 100% indicates that the system retrieved something that is not regarded as a correct result.</Paragraph>
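    As an illustration of the two ratios, the following minimal Python sketch computes recall as 'retrieved intended NPs' / 'all intended NPs' and precision as 'retrieved intended NPs' / 'all retrieved NPs', so that both stay at or below 100%. The function name recall_precision and the toy phrases are assumptions made for this example and are not part of NPtool. The sketch also simplifies by comparing phrase strings as sets, whereas a manual evaluation would compare matches at their positions in running text.

```python
def recall_precision(intended, retrieved):
    """Return (recall, precision) for two collections of noun phrases.

    intended:  the NPs that should have been extracted (the gold standard)
    retrieved: the NPs the extractor actually proposed
    """
    intended_set = set(intended)
    retrieved_set = set(retrieved)
    # 'Retrieved intended NPs': phrases present in both collections.
    hits = intended_set.intersection(retrieved_set)

    # Guard against empty collections to avoid division by zero.
    recall = len(hits) / len(intended_set) if intended_set else 1.0
    precision = len(hits) / len(retrieved_set) if retrieved_set else 1.0
    return recall, precision


# Hypothetical toy data, not taken from the paper's test texts.
gold = ["the exhaust manifold", "a cylinder head", "the inlet valve", "engine oil"]
found = ["the exhaust manifold", "a cylinder head", "the inlet valve", "engine oil", "the"]

recall, precision = recall_precision(gold, found)
print(f"recall = {recall:.1%}, precision = {precision:.1%}")
# prints: recall = 100.0%, precision = 80.0%
```

    In this toy case nothing is missed, so recall is 100%, while the single superfluous candidate lowers precision to 80%.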
    <Paragraph position="4"> The performance of the whole system has been evaluated against several texts from different domains. In all, the analysis of some 20,000 words has been manually checked.</Paragraph>
    <Paragraph position="5"> If we wish to extract relatively complex noun phrases with optional coordination, premodifiers and postmodifiers (see the search query above in Section 4.4), we reach a recall of 98.5-100%, with a precision of some 95-98%.</Paragraph>
    <Paragraph position="6"> As indicated in Section 4.4, the extraction utility annotates each proposed noun phrase as a 'sure hit' ('ok:') or as an 'uncertain hit' ('?:'). This distinction is quite useful for manual validation: approximately 95% of all superfluous noun phrase candidates are marked with the question mark.</Paragraph>
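    The use of the two markers in manual validation can be sketched as follows. This is a minimal illustration only: it assumes that each proposed noun phrase occupies one line beginning with 'ok:' or '?:', which is a guess at the layout and not a documented NPtool output format, and the function name split_by_marker and the sample lines are hypothetical.

```python
def split_by_marker(lines):
    """Partition extractor output lines into sure hits and uncertain hits.

    Assumes (hypothetically) one candidate per line, prefixed by 'ok:' or '?:'.
    """
    sure, uncertain = [], []
    for line in lines:
        text = line.strip()
        if text.startswith("ok:"):
            sure.append(text[len("ok:"):].strip())
        elif text.startswith("?:"):
            uncertain.append(text[len("?:"):].strip())
        # Lines carrying neither marker are ignored in this sketch.
    return sure, uncertain


# Hypothetical sample lines, not real NPtool output.
sample_output = [
    "ok: the exhaust manifold",
    "?: the",
    "ok: a cylinder head",
]

sure_hits, to_review = split_by_marker(sample_output)
print(to_review)
# prints: ['the']
```

    A reviewer could then concentrate on the to_review list, since most superfluous candidates are reported to carry the question mark.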
  </Section>
</Paper>