<?xml version="1.0" standalone="yes"?> <Paper uid="M92-1010"> <Title>Priors Tests INCIDENT-TYPE REL-FREQ PRESENT STAGE-OF-EXEC REL-FREQ PRESENT INSTRUMENT-ID EQUI-PROB PRESENT&FREQUENT INSTRUMENT-TYPE REL-FREQ PRESENT&FREQUENT PERP-INDIV EQUI-PROB PRESENT PERP-ORG EQUI-PROB PRESENT PERP-CAT EQUI-PROB PRESENT PERP-CONF EQUI-PROB PRESENT&FREQUENT HUM-TGT-NAME EQUI-PROB PRESENT HUM-TGT-DESCR EQUI-PROB PRESENT HUM-TGT-TYPE REL-FREQ PRESENT HUM-TGT-EFFECT REL-FREQ PRESENT PHYS-TGT-ID EQUI-PROB PRESENT&FREQUENT PHYS-TGT-TYPE REL-FREQ PRESENT</Title> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> SUMMARY OF MUC-4 PERFORMANCE </SectionTitle> <Paragraph position="0"> Table 1 shows the official template-by-template score results for the Hughes Trainable Text Skimmer used for MUC-4 (TTS-MUC4) on TST3. TTS is a largely statistical system, using a set of Bayesian classifiers with the output of a shallow parser as features. (See the System Summary section of this volume for a detailed description of The performance, on a slot-by-slot basis, is, therefore, what one might expect: the pure set fills such as INCIDENT: TYPE and INCIDENT: STAGE OF EXECUTION show much better performance than the string fills such as HUM TGT: NAME.</Paragraph> <Paragraph position="1"> Table 2 shows the summary rows of the official template-by-template results on TST4. The complete where fi are textual features. For set fill slots, the Ci are the possible values (e.g. DEATH, SOME DAMAGE, etc.). For the string fill slots, the Ci are yes or no answers to whether a particular item fills a slot (e.g. HUMAN-TGT-NAME versus HUMAN-TGT-NAME-NOT). For typical Bayesian classifiers, the tunable parameters are the prior probabilities for the Ci. In TTS-MUC4 we have two different settings, EQUI-PROB and REL-FREQ, respectively for probabilities that are equal for all classes and probabilities that reflect the relative frequency of classes in the training data.
EQUI-PROB favors recall, and REL-FREQ favors precision.</Paragraph> <Paragraph position="2"> In addition, for text applications, there is an issue as to whether one includes only those features present in the text or also those that are absent. In TTS-MUC4 we used two different settings, PRESENT and PRESENT&FREQUENT, where PRESENT&FREQUENT considers all those features which are present and also those that are absent but which occur very frequently in the texts. The threshold for whether a feature was considered frequent was set so that, for each slot, approximately 30 features were considered frequent. In the TTS-MUC4 conceptual hierarchy there are over 400 potential features.</Paragraph> <Paragraph position="3"> For each slot, the parameter settings were optimized to balance recall and precision. The optimization was done using TST1 and TST2. Table 3 gives the parameter settings for each slot. Balancing precision and recall for string fill slots is difficult in TTS-MUC4. For example, in the training corpus, TTS-MUC4 detects over 4,000 potential HUMAN-TARGET-NAMES, but fewer than 10% of these are actual string fills.</Paragraph> </Section> <Section position="3" start_page="0" end_page="105" type="metho"> <SectionTitle> TRAINING METHODOLOGY </SectionTitle> <Paragraph position="0"> To compute the conditional probabilities, the MUC-3 development (DEV) corpus and the associated templates were used. Each sentence in the DEV corpus that contained a string fill for some template was used as a training sample. TTS detects features for important domain words (e.g. explosion, report, etc.), and also for phrases that may map into string fills.
For each training sample, the presence or absence of each feature was examined to compute, for example,</Paragraph> </Section> <Section position="4" start_page="105" end_page="105" type="metho"> <SectionTitle> EQUI-PROB PRESENT PERP-CAT EQUI-PROB PRESENT PERP-CONF EQUI-PROB PRESENT&FREQUENT HUM-TGT-NAME EQUI-PROB PRESENT HUM-TGT-DESCR EQUI-PROB PRESENT HUM-TGT-TYPE REL-FREQ PRESENT HUM-TGT-EFFECT REL-FREQ PRESENT PHYS-TGT-ID EQUI-PROB PRESENT&FREQUENT PHYS-TGT-TYPE REL-FREQ PRESENT PHYS-TGT-EFFECT REL-FREQ PRESENT </SectionTitle> <Paragraph position="0"> In addition to training of the Bayesian classifiers, the DEV corpus was used, exactly as in TTS-MUC3, to derive phrase patterns for potential string fills. For example, &quot;SIX JESUITS&quot; would drive the creation of the phrase (:NUMBER-W :RELIGIOUS-ORDER-W). The type of the string fill served as the semantic feature for the phrase, which is :CIVILIAN-DESCR in this example.</Paragraph> <Paragraph position="1"> Improvement that occurred over time in TTS-MUC4 is attributable to two factors: the introduction of the Bayesian classifiers to replace the K-Nearest Neighbor technique from TTS-MUC3, and the tuning of the parameters of the Bayesian classifiers for each slot.</Paragraph> <Paragraph position="2"> All of the training for TTS-MUC4 is automated. As with TTS-MUC3, the only manual portion of the process is choosing the conceptual classes for the lexicon.</Paragraph> </Section> <Section position="5" start_page="105" end_page="105" type="metho"> <SectionTitle> ALLOCATION OF EFFORT </SectionTitle> <Paragraph position="0"> Two calendar months and approximately 2.5 person months were spent on enhancing the TTS-MUC3 system to create TTS-MUC4.</Paragraph> <Paragraph position="1"> TTS-MUC4 effort falls roughly into three categories: classifier evaluation, system training, and filter development.
Approximately 20% of our time was spent on developing and evaluating the performance of the Bayesian classifier, and tuning the parameters used in this classifier. This classifier replaced the K-Nearest Neighbor classifier previously employed in TTS-MUC3. 10% of the development effort focused on tuning other system parameters, such as the *fill-strength-threshold*, which provides a means for filtering out unlikely slot fillers. About 40% of our time was devoted to developing filters to improve the precision of the values of the template fillers, and evaluating their effects. Retraining of the system to take advantage of a modified lexicon and to accommodate the revised templates took up about 10% of the time. The remaining 20% of the effort was spent on developing code to extract information to fill the new and revised slots of the MUC-4 templates.</Paragraph> </Section> <Section position="6" start_page="105" end_page="105" type="metho"> <SectionTitle> LIMITING FACTORS </SectionTitle> <Paragraph position="0"> One limiting factor for the Hughes TTS-MUC4 system was time. The Bayesian classifier is effective for filling most slots, but the K-Nearest Neighbor classifier might provide better fills for others. However, time did not permit us to experiment enough to identify the best classifier to use for each slot. Another aspect of TTS to which we would like to have devoted more attention is dynamically weighting features retrieved from the knowledge base depending upon their relevance to the slot being processed. Our algorithm for grouping sentences into topics was responsible for many of our errors.
Improving the slot-dependent weighting portion of the system would take a considerable amount of additional time, and would require that domain knowledge be added into the processing.</Paragraph> </Section> <Section position="7" start_page="105" end_page="105" type="metho"> <SectionTitle> FUTURE WORK </SectionTitle> <Paragraph position="0"> The following enhancements are most relevant to the current MUC-oriented software: (1) filters for string fills based on linguistic knowledge, (2) reference resolution, and (3) better learning/pattern classification algorithms.</Paragraph> <Paragraph position="1"> TTS-MUC4 currently has a very limited amount of processing that is specialized for language. One of the features that we would have liked to detect in the MUC-4 corpus was the source of information in a story. Individuals who are the source of a report occurred frequently, and erroneously, as human targets. Another &quot;language specific&quot; portion we would like to add is reference resolution for string fills. TTS-MUC4 currently suffers in its precision score because it lists each referent for a filler several times.</Paragraph> <Paragraph position="2"> Additional changes would make a more usable &quot;real system&quot;, although they are not essential for the MUC task as it now stands. These include (1) the development of a user interface for corpus marking, and (2) integration with on-line data sources, such as map databases, to eliminate the burden of creating special data files for natural language processing.</Paragraph> </Section> <Section position="8" start_page="105" end_page="105" type="metho"> <SectionTitle> TRANSFERABILITY TO OTHER TASKS </SectionTitle> <Paragraph position="0"> Currently, TTS only requires a lexicon and a training corpus with templates. Therefore, extension to terrorism in another locale or to a completely different domain would be easy.
However, once features are added to improve performance, as noted in Section 6 above, handling a new domain will be more difficult.</Paragraph> </Section> <Section position="9" start_page="105" end_page="105" type="metho"> <SectionTitle> LESSONS LEARNED </SectionTitle> <Paragraph position="0"> TTS-MUC4 represents a small increase in performance beyond TTS-MUC3. TTS currently has very little processing specific to language; most of the processing is simple feature detection followed by pattern recognition algorithms. We believe that TTS-MUC4 represents a plateau in performance that will require more linguistic knowledge to increase performance. The goal for TTS, then, is to significantly increase performance without increasing development time for new applications.</Paragraph> </Section> </Paper>