<?xml version="1.0" standalone="yes"?>
<Paper uid="M98-1010">
  <Title>DESCRIPTORS RECALL PRECISION</Title>
  <Section position="5" start_page="0" end_page="0" type="evalu">
    <SectionTitle>
RESULTS ANALYSIS
</SectionTitle>
    <Paragraph position="0"> Overall, AATM7's scores for MUC-7 are good. There are a few errors, as well as some quirks of the MUC-7 domain, that will be discussed which significantly effected the scores for entity names and locations. The artifact scores are significantly below the NLToolset's usual performance; this is due to the newness of this entity, particularly of the space vehicle artifacts. This capability is still a work in progress, as the need arises for our real-world applications .</Paragraph>
    <Paragraph position="1">  Since the TE task spans four separate subtasks with very different characteristics, an analysis was done on each. The formal run keys were split into four sets: organization, person, artifact, and location keys. The formal run was then also split into organization, person, artifact, and location responses. Each set was then respectively scored with SAIC's version 3.3 of the MUC scoring program. The results are described below. This scoring method removes the mapping ambiguity between entities of different types and allows an accurate analysis of the performance of each individual entity type.</Paragraph>
    <Paragraph position="2">  traditionally been at less than 50%. To improve on this performance, one problem that could very easily be resolved is an incorrect interpretation of expressions like &amp;quot;(NI FRX)&amp;quot; in the formal text. &amp;quot;NI&amp;quot; is a common first name in some languages and therefore, AATM7 interpreted all thirteen of these as person names. This error accounted for 13 of the overgenerated or incorrect person names, or the equivalent of 2 points of precision.</Paragraph>
    <Paragraph position="3"> Another area for improvement is in the descriptor slot. Twenty-six of AATM7's person descriptors were marked incorrect because they contained only the head of the noun phrase and not the entire phrase, e.g. &amp;quot;commander&amp;quot; instead of &amp;quot;Columbia's commander&amp;quot; and &amp;quot;manager&amp;quot; instead of &amp;quot;project manager.&amp;quot; The descriptor rule package will be improved to better encompass the entire phrase. If these descriptors had been extracted correctly for the MUC-7 test, the descriptor recall and precision would have improved to 70 and 63, while the overall person scores would have improved to 89 recall, 79 precision, and 83.7 F-measure.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Organizations
</SectionTitle>
      <Paragraph position="0"> Organizations are complex entities to determine in text because organization names have a more complex structure than person names. A variation algorithm for one name may not work for another. For example, &amp;quot;Hughes&amp;quot; is a valid variation for &amp;quot;Hughes Aerospace, Inc.&amp;quot; but &amp;quot;Space&amp;quot; is not a valid variation for &amp;quot;Space Technology Industries&amp;quot;. An automatic system must, therefore, look at the surrounding context of variations and filter out those that are spurious.</Paragraph>
      <Paragraph position="1"> AATM7 found 780 of the 877 organizations in the formal test corpus. Of the 780 it found, points were lost here and there for mistakes in two areas. First, current performance on organization descriptors is woefully inadequate and in sharp contrast to that on person descriptors. An effort is currently underway to improve this with the help of a part-of-speech tagger. Additionally, it was discovered that the mechanism for creating and linking variations of organization names was broken during the training period. The result of this was that 64 name variations were missed. When this problem was fixed, recall and precision for ent_name improved to 76 and 77, with the overall organization recall and</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Artifacts
</SectionTitle>
      <Paragraph position="0"> AATM7's artifact performance really suffers in the area of entity names. It missed almost half of the artifact entities purely from lack of patterns with which to recognize them. This is a sign of the immaturity of the artifact packages and can be overcome by more development.</Paragraph>
      <Paragraph position="1"> Another problem, which caused the low precision, was the result of incorrectly identifying the owner of the artifact as its name. This accounted for 38 of the spurious entity names and 2% of the precision. Since this is a new package, the coreference resolution is also not up to the NLToolset's usual performance. This is an on-going research effort.</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Locations
</SectionTitle>
      <Paragraph position="0"> The NLToolset performs well at finding and disambiguating locations. Determining the country for a given location can be complicated since many named locations exist in multiple countries. A small number of minor changes have been identified to significantly boost the score to its normal level. One of the obvious problems AATM7 had was with the airports. Eleven occurrences of Kennedy Space Center were identified as locale type &amp;quot;CITY&amp;quot; instead of the correct type of &amp;quot;AIRPORT&amp;quot;. This was caused by a simple inconsistency in our location processing. Fixing this one problem, improved the airport-specific recall and precision to 57 and 67 respectively, and improved the precision overall by 1 percentage point.</Paragraph>
      <Paragraph position="1"> The location recall for MUC-7 is slightly depressed because of some challenges which this particular domain presented. AATM7 was not configured to process planet names or other extra-terrestrial bodies as locations. This accounted for sixty-three missing items, at three slots per item; thirty-one of the missing were occurrences of &amp;quot;earth&amp;quot; alone. This is reflected in the subtask scores for region and unk. By just adding these locations to the NLToolset's knowledge base, recall and precision was improved to 82 and 83 for the location object.</Paragraph>
      <Paragraph position="2"> Another quirk of the MUC-7 domain was that adjectival forms of nation names were to be extracted as location objects, if they were the only references to the nation in the text. In other words, if the text contains the phrase &amp;quot;the Italian satellite&amp;quot; but no other mention of Italy, a location object with the locale &amp;quot;Italian&amp;quot; would be extracted. This was not addressed in AATM7 and resulted in a loss of thirty-two location objects, at three slots per object. This feature could be added just for the MUC-7 test. It is unlikely that a real-world application would want this information extracted. If it is added, recall and precision for the location object rise to 86 and 84 with an overall F-measure of 85.</Paragraph>
      <Paragraph position="3">  AATM7 found all of the persons in the walkthrough document. Of the five person descriptors, it missed only two; it made a separate entity for one of the descriptors and found only part of the other. The other spurious person entity is really an organization (&amp;quot;ING Barings&amp;quot;) that was mistaken for a person, due to the fact that Ing is in the firstnames list. AATM7 did confuse another organization (&amp;quot;Bloomberg Business&amp;quot;) as a person because of the context (&amp;quot;the parent of&amp;quot;), but this was marked incorrect, instead of spurious, because it was mapped to the organization object in the keys.</Paragraph>
      <Paragraph position="4"> Organizations Of the twenty-three organization entities, AATM7 found twenty-one. It missed &amp;quot;International Technology Underwriters&amp;quot; and &amp;quot;Axa SA.&amp;quot; Two other organizations were typed incorrectly as people, as has been mentioned. Five of the nine organization descriptors were found correctly. The remaining error in the organization area is the result of the breaking of the variation linking mechanism that has been mentioned.</Paragraph>
      <Paragraph position="5"> Artifacts AATM7 correctly identified all three of the artifacts in the walkthrough article; however, because it overgenerated, precision for this object is a low 33%. This was due to the previously discussed mistake in which an organization that owned the satellite was incorrectly identified as the name. In fact, the organizations &amp;quot;Intelsat&amp;quot; and &amp;quot;United States&amp;quot; account for five of the six spurious artifacts. Two of the three descriptors were identified correctly.</Paragraph>
      <Paragraph position="6"> Locations AATM7 correctly identified sixteen of the nineteen locations, but missed &amp;quot;Arlington,&amp;quot; &amp;quot;China,&amp;quot; and the &amp;quot;Central&amp;quot; part of &amp;quot;Central America.&amp;quot; This was due to overzealous context-based filtering.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>