<?xml version="1.0" standalone="yes"?>
<Paper uid="M95-1011">
  <Title>DESCRIPTION OF THE UMASS SYSTEM AS USED FOR MUC-6</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
INTRODUCTION
</SectionTitle>
    <Paragraph position="0"> Information extraction research at the University of Massachusetts is based on portable, trainable language processing components. Some components are more effective than others, some have been under development longer than others, but in all cases, we are working to eliminate manual knowledge engineering. Although UMass has participated in previous MUC evaluations, all of our information extraction software has been redesigned and rewritten since MUC-5, so we are evaluating a completely new system this year.</Paragraph>
    <Paragraph position="1"> In particular, we are working with new string recognition specialists (for Named Entities), a new part-of-speech tagger, a new sentence analyzer, a new and fully automated dictionary construction algorithm, a new discourse analyzer, and a new coreference analyzer. The most interesting components in our system are CRYSTAL (which generates a concept node dictionary) [13], WRAP-UP (which establishes relational links between entities) [14, 15, 16], and RESOLVE (the coreference analyzer) [8]. Each of these components utilizes machine learning techniques in order to customize crucial extraction capabilities based on representative training texts.</Paragraph>
    <Paragraph position="2"> Our preparations for MUC-6 began on June 19 (at the release of the Call for Participation) and ended on October 2 when we began our test runs. All of our ST-specific training began in September with the release of the ST keys.</Paragraph>
    <Paragraph position="3"> As much as we try to exploit trainable technologies, there are nevertheless places where some amount of manual coding is still needed. For example, we needed to write string manipulation functions to trim output strings in an effort to generate slot fills consistent with the MUC-6 slot fill guidelines. We also needed to create template parsers and text marking interfaces in order to map the MUC-6 training documents into data usable by our trainable components.</Paragraph>
  </Section>
</Paper>