<?xml version="1.0" standalone="yes"?>
<Paper uid="M91-1002">
  <Title>MUC-3 EVALUATION METRICS</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
INTRODUCTION
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Purpose
</SectionTitle>
      <Paragraph position="0"> The MUC-3 evaluation metrics are measures of performance for the MUC-3 template fill task. Obtaining summary measures of performance necessitates the loss of information about many details of performance. The utility of summary measures for comparison of performance over time and across systems should outweigh this loss of detail. The template fill task is complex because of the varying nature of the fills for each slot and the interdependencies of the slots. The evaluation metrics used in MUC-3 were adapted from traditional measures in information retrieval and signal processing and were still evolving to fit the more complex data extraction task of MUC-3 when the evaluation was performed. The scoring of the template fill task and the calculation of the metrics used in MUC-3 will be described here. This description is meant to assist in the analysis of the MUC-3 results and in the further evolution of the evaluation metrics.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Metrics
</SectionTitle>
      <Paragraph position="0"> The measures of performance chosen for use in MUC-3 were recall, precision, fallout, and overgeneration.</Paragraph>
      <Paragraph position="1"> Recall, precision, and fallout were adapted based on their use in information retrieval. Overgeneration was developed as a measure for MUC-3. Recall is a measure of the completeness of the template fill. Precision is a measure of the accuracy of the fill. Fallout is a measure of the false alarm rate for the slots which can be filled from finite sets of slot fillers.</Paragraph>
      <Paragraph position="2"> Overgeneration is a measure of spurious generation.</Paragraph>
      <Paragraph position="3"> These measures will be described in greater detail below.</Paragraph>
    </Section>
  </Section>
</Paper>