<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-0607">
  <Title>Feeding OWL: Extracting and Representing the Content of Pathology Reports</Title>
  <Section position="6" start_page="0" end_page="0" type="evalu">
    <SectionTitle>
4 Evaluation
</SectionTitle>
    <Paragraph position="0"> At the moment, we have only evaluated the modules individually, and--since the system is still under development--this evaluation only provides a snapshot of its current state. A full-scale evaluation of the whole system in its application context is planned as soon as the modules are finalised; plans for this are discussed below.</Paragraph>
    <Paragraph position="1"> The coverage of the morphology module and the accuracy of the POS-tagger have already been reported above, so we concentrate here on the chunk-parser. To evaluate this module, we have manually annotated the NPs in a randomly selected test set of 20 reports (ca. 2,800 words; we found about 500 NPs).</Paragraph>
    <Paragraph position="2"> The reports were then morphologically analysed and POS-filtered, and the results were manually checked and corrected, to ensure that the input was optimal and that only the performance of the chunker was evaluated. We then computed precision and recall based on two different matching criteria: under exact matching, where only exact congruence of chunks counts, we obtained a precision of 48% and a recall of 63%; the numbers improve when partial matches, i.e. smaller chunks within the target chunk, receive partial credit (by a factor of .25), resulting in a relaxed precision of 61% and a relaxed recall of 80%. This difference can be explained by the fact that some of the more complex NP constructions (with quite complex modifications) in our data are not yet covered by the grammar, so that only their constituent NPs are recognised.</Paragraph>
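The relaxed scoring scheme described above can be illustrated with a minimal sketch. This is a hypothetical reconstruction, not the authors' implementation: the span representation, the containment test for "smaller chunks within the target chunk", and the choice to apply partial credit to both precision and recall are our assumptions.

```python
# Sketch of exact vs. relaxed chunk scoring with partial credit.
# Chunks are (start, end) token spans; all details here are assumptions.

def chunk_scores(gold, predicted, partial_credit=0.25):
    """Return (precision, recall), giving partial_credit to predicted
    chunks that lie strictly inside a gold (target) chunk."""
    gold, predicted = set(gold), set(predicted)
    credit = 0.0
    for p in predicted:
        if p in gold:
            credit += 1.0                 # exact congruence of chunks
        elif any(g[0] <= p[0] and p[1] <= g[1] for g in gold):
            credit += partial_credit      # smaller chunk within a target chunk
    precision = credit / len(predicted) if predicted else 0.0
    recall = credit / len(gold) if gold else 0.0
    return precision, recall

gold = [(0, 3), (5, 9)]
pred = [(0, 3), (6, 8), (11, 12)]  # one exact, one partial, one spurious
p, r = chunk_scores(gold, pred)    # credit = 1.0 + 0.25 = 1.25
```

With these toy spans, relaxed precision is 1.25/3 and relaxed recall is 1.25/2, showing how partial credit lifts both figures relative to exact matching.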
    <Paragraph position="3"> Note that this evaluation takes into account only the boundaries of the chunks, not the correctness of the computed semantic representations. For a full-scale evaluation, we will manually annotate these NPs with semantic representations and use this annotation to compute precision and recall with respect to semantics as well, and ultimately with respect to sample search queries. This annotation, however, is very resource-intensive and will therefore only be done once the modules have been finalised.</Paragraph>
  </Section>
</Paper>