<?xml version="1.0" standalone="yes"?> <Paper uid="M91-1017"> <Title>UNISYS: MUC-3 TEST RESULTS AND ANALYSIS</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> UNISYS: MUC-3 TEST RESULTS AND ANALYSIS </SectionTitle> <Paragraph position="0"/> </Section> <Section position="2" start_page="0" end_page="112" type="metho"> <SectionTitle> INTRODUCTION </SectionTitle> <Paragraph position="0"> The Unisys MUC-3 system is based on a three-tiered approach to text processing in which a novel and quite powerful knowledge-based form of information retrieval plays a central role. The main components of this approach are as follows: A Keyword-Based Information Retrieval Component.</Paragraph> <Paragraph position="1"> This component predicts the occurrence of types of events in texts based on the presence of key words and phrases.</Paragraph> <Paragraph position="2"> A Knowledge-Based Information Retrieval Component.</Paragraph> <Paragraph position="3"> This component, called KBIRD in the Unisys MUC-3 system, performs the following tasks: * Based on the co-occurrence of the predictions made by the keyword-based analysis component and expressions and concepts discovered in a given text, it predicts the likely occurrence of additional event types.</Paragraph> <Paragraph position="4"> Although a natural language processing component was included in the design of the Unisys MUC-3 system as a third level of text analysis, not enough time was available during the MUC-3 development cycle both to develop a knowledge-based information retrieval component and to port the Unisys Pundit text-processing system to the MUC-3 terrorist domain. A decision was made to focus on developing the knowledge-based information retrieval component and postpone the integration of Pundit until MUC-4.
A Template Generation Component.</Paragraph> <Paragraph position="5"> An application-specific Prolog program was written to merge templates describing the same event and to select the most likely slot values for templates in cases where multiple slot values were proposed.</Paragraph> <Paragraph position="6"> The Unisys MUC-3 development effort consisted of two full-time Unisys staff members and one government employee on industrial rotation. A total of 2650 person-hours were put into the project, 800 of which were contributed by the government employee. The effort was partially supported by a DARPA grant, which covered approximately 30% of the development cost (see footnote 1). The bulk of the effort involved the development of the KBIRD system and its MUC-3 rule base. These two tasks took approximately the same amount of time, and together accounted for roughly 85% of the effort.</Paragraph> <Paragraph position="7"> Footnote 1: Work on this project was partially supported by DARPA under contract MDA-903-89-C-0041.</Paragraph> </Section> <Section position="3" start_page="112" end_page="112" type="metho"> <SectionTitle> TEST RESULTS </SectionTitle> <Paragraph position="0"> The scores reported for the Unisys MUC-3 system are shown in Figure 1. The low ACT and high MIS scores reported for the template id slot indicate that event detection was a problem (see footnote 2). Poor event detection performance explains the relatively low recall scores reported for all but the MATCHED ONLY summary measurement. The MATCHED ONLY recall score is a measure of performance in which spurious (false positive) and missing (false negative) templates are not factored in. The extremely low SPU score reported for the template id slot suggests that further training of the rule base to improve event detection will not come at the expense of lower precision scores.
Figure 2 shows the performance of the Unisys system relative to other MUC-3 systems in two scatter plots.</Paragraph> <Paragraph position="1"> Since template slot-filling algorithms are triggered by the detection of an event, poor event detection performance has a direct negative impact on slot-filling performance. The recall scores for the Unisys MUC-3 system reflect this fact. However, for five slots precision scores are also low. These low precision scores are not a consequence of poor event detection. They result instead from a combination of poorly trained inference rules used to extract the sort of information expressed in the pertinent slots, and bugs in the template generation routines that gather and merge correctly detected information into template structures.</Paragraph> </Section> <Section position="4" start_page="112" end_page="114" type="metho"> <SectionTitle> ANALYSIS </SectionTitle> <Paragraph position="0"> Contrary to what the reported low recall scores suggest, the Unisys MUC-3 system can perform well at predicting events. The keyword-based prediction of event types is very robust; the database used during this stage of processing was derived from the full 1300-message DEV corpus. Moreover, when the rules used by KBIRD are properly trained, they do a very good job of locating instances of the events predicted by keyword analysis. Unfortunately, the KBIRD locator rules used to detect instances of events were trained on a relatively small set of messages: the 200 NOSC DEV and TST1 messages.</Paragraph> <Paragraph position="1"> Consequently, even though the keyword-based analysis phase may have correctly predicted the likely occurrence of a given event type, KBIRD may not have been able to locate an instance of the predicted event type. Thus, KBIRD's locator rules had a negating influence on the performance of the keyword-based analysis phase.
Prior to the final MUC-3 test, versions of the Unisys system with fewer, more general event detection rules in place had recall scores ranging in the high 30s and low 40s for all the summary measures. A tactical mistake was made in attempting to replace this general rule base with a larger, more context-sensitive one, since there was not enough time to allow the larger rule base to be properly trained. In the evaluation, generating spurious templates tended to have much less of an impact on scores than failing to generate templates at all. In future evaluations, we will investigate the use of different locator rule sets as a settable system parameter.</Paragraph> <Paragraph position="2"> Footnote 2: The template id slot is scored differently from other slots. The values reported for this slot are a measure of event detection performance (it doesn't make sense to report system performance in generating template ids, since the order in which templates are generated is not relevant in this task) [2].</Paragraph> <Paragraph position="3"> Figure 2 (caption fragment): ... taking into consideration false negative and false positive hits (the MATCHED-ONLY score). The scatter plot on the right indicates the relative performance of the Unisys MUC-3 system when taking into consideration both false negative and false positive hits (the ALL-TEMPLATES score).</Paragraph> <Paragraph position="4"> Rule training was hindered during the MUC-3 development cycle by the need to concurrently build the component that would be using the rules. In addition to this development problem, technical difficulties in KBIRD's design began to appear once the number of rules had grown to a realistic size.
These technical problems resulted in slow message processing speeds, which further complicated the rule training process.</Paragraph> <Paragraph position="5"> The following three key problems were identified: Heavy use of forward-chaining.</Paragraph> <Paragraph position="6"> There is currently too much reliance on forward-chaining in the KBIRD system. Many KBIRD reasoning tasks could be achieved more efficiently in a backward-chaining fashion.</Paragraph> <Paragraph position="7"> Expensive TMS.</Paragraph> <Paragraph position="8"> KBIRD was built on top of a very general inferencing mechanism with an expensive truth maintenance system (TMS). KBIRD's needs for truth maintenance could be accommodated using a much simpler TMS component.</Paragraph> <Paragraph position="9"> Inability to focus search.</Paragraph> <Paragraph position="10"> In KBIRD, it is currently not possible to focus search on a specific region of text. The mechanism used to satisfy a rule looks for all chart elements (concepts, words, phrases, and so forth) that match constituent expressions in the antecedent of a rule. If the KBIRD rule specifies that an element of a certain type must be in the same sentence as some other element, it would be more efficient to limit the search space to just those chart elements that fall within the span of the sentence. However, KBIRD's algorithm currently searches through chart elements indexed to locations anywhere in the text for suitable candidates.</Paragraph> </Section> <Section position="5" start_page="114" end_page="114" type="metho"> <SectionTitle> CONCLUDING REMARKS </SectionTitle> <Paragraph position="0"> The time constraints imposed in MUC-3 made it impossible to fully develop the Unisys MUC-3 system's knowledge-based information retrieval component, KBIRD, before the evaluation deadline. Consequently, it is not possible at this time to establish the capabilities of the three-tiered approach realized in the system.
The system's scores indicate, however, that although the rules for locating instances of events were inadequately trained, its performance at identifying slot values once an instance has been found is quite good.</Paragraph> <Paragraph position="1"> Future work on the system will solve the technical problems that have been observed. This will be achieved by performing the following tasks: * The overall system flow will be restructured to allow backward-chaining to handle more of the processing load.</Paragraph> <Paragraph position="2"> * The current forward-chaining mechanism will be reimplemented so that it is specifically geared to the processing tasks envisioned for KBIRD.</Paragraph> <Paragraph position="3"> * Subject to an appropriate funding source, the KBIRD locator rules used to detect instances of predicted event types will be properly trained.</Paragraph> <Paragraph position="4"> In addition to solving the technical problems that have arisen in the system's KBIRD component, a major effort will be made to incorporate the Unisys Pundit NLP system into the MUC-3 system.</Paragraph> </Section> </Paper>