<?xml version="1.0" standalone="yes"?> <Paper uid="M95-1003"> <Title>FOUR SCORERS AND SEVEN YEARS AGO: The Scoring Method for MUC-6</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> INTRODUCTION </SectionTitle> <Paragraph position="0"> The MUC-6 scoring method is based on a two-step process of mapping an item generated by a system under evaluation (the &quot;response&quot;) to the corresponding item in the human-generated answer key and then scoring the mapped items. The resulting scores are used for decision-making over the entire evaluation cycle, including refinement of the task definition based on interannotator comparisons, technology development using training data, validating answer keys, and benchmarking both system and human capabilities on the test data.</Paragraph> <Paragraph position="1"> To further understand the scoring method, we will look at the features and algorithms embodied in each of the scorers, showing their basic similarity and discussing the differences from task to task. We will show how critical the mapping algorithm is in scoring and the problems inherent in deciding what the best mapping should be. We will also discuss the result of translating the Emacs Lisp scorers into C, including the increased accuracy of the mapping as well as the increased speed and improved memory management. The positive effects on consumer capability will be shown and future enhancements will be briefly outlined.</Paragraph> </Section> </Paper>