<?xml version="1.0" standalone="yes"?>
<Paper uid="M92-1018">
  <Title>SRA SOLOMON: MUC-4 TEST RESULTS AND ANALYSIS</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
RESULTS
</SectionTitle>
    <Paragraph position="0"> Our TST3 and TST4 results are shown in Figures 1 and 2. The similarity of these scores, as well as their similarity to SRA-internal testing results, reflects the portability of SRA's MUC-4 system. In fact, our score on the TST4 texts was better than that on TST3, even though those texts covered a different time period than that of the training texts or TST3.</Paragraph>
    <Paragraph position="1"> Our matched-only precision and recall for both test sets were very high (TST3: 68/47, TST4: 73/49).</Paragraph>
    <Paragraph position="2"> When SOLOMON recognized a MUC event, it did a very accurate and complete job of filling the requisite templates.</Paragraph>
    <Paragraph position="3"> SOLOMON's performance was tuned so that the all-templates recall and precision were as close as possible, to maximize the F-Measure. As shown in Figure 3, our F-Measure steadily increased over time. The fact that this slope has not yet leveled off shows SOLOMON's potential for improvement.</Paragraph>
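The tuning strategy above relies on a property of the F-Measure used in MUC scoring: for a fixed sum of recall and precision, the balanced (beta = 1) F-Measure is largest when the two are equal. A minimal sketch of the metric, using the standard van Rijsbergen formula (the exact scorer implementation is not shown in the source):

```python
def f_measure(recall: float, precision: float, beta: float = 1.0) -> float:
    """Van Rijsbergen's F-Measure as used in MUC scoring.

    With beta = 1, F is the harmonic mean of recall and precision, so for
    a fixed sum it peaks when the two are equal -- the motivation for
    tuning recall and precision to be as close as possible.
    """
    if recall == 0 and precision == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Matched-only TST3 scores quoted above: precision 68, recall 47 (percent)
print(round(f_measure(47 / 100, 68 / 100), 3))  # -> 0.556
```

Note that the matched-only figures above are not the all-templates scores the F-Measure was actually computed from; they are used here only to illustrate the formula.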
  </Section>
  <Section position="4" start_page="0" end_page="137" type="metho">
    <SectionTitle>
EFFORT SPENT
</SectionTitle>
    <Paragraph position="0"> We spent a total of 9 staff months, from January 1, 1992 through May 31, 1992, on MUC-4. A task-specific breakdown of effort is shown in Figure 4. The bulk of the work went into porting SOLOMON to a new domain with new vocabulary, concepts, template-output format, and fill rules. Approximately 72% of the effort was domain-dependent. However, about 63% of the total effort was language-independent, i.e. it would be directly applicable to understanding texts about terrorism in any language. We expect that our English MUC-4 system could be ported to a new language in about 3 months, given a basic grammar, lexicon and preprocessing data similar to the ones which existed for English. We partially demonstrated this claim by showing our MUC-4 system processing English, Japanese and Spanish newspaper articles about the murder of Jesuit priests at the demonstration session of MUC-4. We spent less than 2 weeks after the final test adding MUC-specific words to the Spanish and Japanese lexicons and extending the grammars of the two languages.</Paragraph>
    <Paragraph position="6"> Data: 40% of the total effort building MUC data was spent on lexicon and KB entry acquisition. Much of this data was acquired automatically. We used the supplied geographical data to automatically build location lexicons and KBs. Using the development templates, we acquired lexical and KB entries for classes of domain terms such as human and physical targets and terrorist organizations. We automatically derived subcategorization information for the domain verbs from the development texts (cf. [1]). These automatically acquired lexicons and KBs did require some manual cleanup and correction.</Paragraph>
    <Paragraph position="7"> Certain multi-word phenomena which occur frequently in texts but are unsuitable for general parsing were handled by pattern matching during Preprocessing. For example, we created patterns for Spanish phrases, complex location phrases, relative times, and names of political, military and terrorist organizations. Modifications to SOLOMON's broad-coverage English grammar included adding more semantic restrictions, extending some phrase-structure rules, and improving general robustness.</Paragraph>
    <Paragraph position="8"> Based on our knowledge engineering effort, we built a set of commonsense reasoning rules that are described in detail in our system description. Our EXTRACT module recognizes MUC-relevant events in the output of SOLOMON and translates them into MUC-4 filled templates. We implemented all the domain-specific information as mapping rules or simple conversion functions (e.g. the numeric value &amp;quot;at least 5&amp;quot; means &amp;quot;5-&amp;quot;). This data is stored in the knowledge base and is completely language-independent. We spent 1 week porting our existing Message Zoner to deal with message headers in MUC messages. The Message Zoner could already recognize more general message structures such as paragraphs and sentences. We extended EXTRACT while maintaining the domain and language independence of the module. Features added included event merging and handling of flat MUC templates instead of the more object-oriented database records that SOLOMON is accustomed to. Our time spent on fixing bugs was distributed throughout the system, but problems in Debris Parsing and Debris Semantics received the most attention.</Paragraph>
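A conversion function of the kind mentioned above can be sketched as follows. This is a hypothetical reconstruction, not SOLOMON's actual code; the source confirms only the single mapping &quot;at least 5&quot; → &quot;5-&quot; (an open-ended MUC-4 range meaning five or more):

```python
import re

def normalize_numeric(phrase: str) -> str:
    """Hypothetical sketch of a numeric-value conversion rule.

    The source confirms only one mapping: "at least N" becomes the
    open-ended range "N-" in MUC-4 template fills. Anything else is
    passed through unchanged.
    """
    m = re.match(r"at least\s+(\d+)$", phrase.strip(), re.IGNORECASE)
    if m:
        return m.group(1) + "-"  # open-ended range: "5-" means 5 or more
    return phrase.strip()

print(normalize_numeric("at least 5"))  # -> 5-
```

Keeping such rules as declarative data in the knowledge base, rather than in code, is what makes them language-independent: the same mapping applies whether the source phrase was English, Spanish, or Japanese.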
  </Section>
  <Section position="5" start_page="137" end_page="137" type="metho">
    <SectionTitle>
SYSTEM TRAINING
</SectionTitle>
    <Paragraph position="0"> We used the TST2 texts for blind testing and the entire set of 1300 development texts as both testing and training material. The development set was crucial to both our automated data acquisition and our knowledge engineering task. We performed frequent testing to track and direct our progress. To raise recall, we focussed on data acquisition; to raise precision, we focussed on stricter definitions of &amp;quot;legal&amp;quot; MUC events. To improve overall performance, we focussed on more robust syntactic and semantic analysis and more reliable event merging.</Paragraph>
  </Section>
  <Section position="6" start_page="137" end_page="137" type="metho">
    <SectionTitle>
LIMITING FACTORS
</SectionTitle>
    <Paragraph position="0"> The two main limiting factors were the number of development texts and templates and the amount of time allotted for the MUC-4 effort. With more texts, we could have applied other, more data-intensive automated acquisition techniques and had more examples of phenomena to draw upon. With more time, we would add more domain-dependent lexical knowledge and additional pragmatic inference rules. We also need to tune our EXTRACT mapping rules more finely and improve our discourse module for both NP reference and event reference resolution. Integration of existing on-line resources such as machine-readable dictionaries, the World Factbook, or WordNet would also improve system performance. A more extensive testing and evaluation strategy at both the blackbox and glassbox levels would help direct progress, but was not feasible in the amount of time we had.</Paragraph>
    <Paragraph position="1"> WHAT WAS OR WAS NOT SUCCESSFUL There were several areas where hybrid solutions worked very well. Totally automated knowledge acquisition was quite successful when supplemented by manual checking and editing of domain-crucial information. Similarly, augmenting a pure bottom-up parser with &amp;quot;simulated top-down parsing&amp;quot; (see SRA's MUC-4 System Description) worked well. Improved Debris Semantics and significantly extended Pragmatic Inferencing were also important contributors to the system's performance.</Paragraph>
    <Paragraph position="2"> Figure 5: MUC NLP System Reusability. Part of SOLOMON's data and almost all of the processing modules are completely reusable for NLP in other domains or languages.</Paragraph>
    <Paragraph position="3"> Currently, our Spanish and Japanese data extraction project MURASAKI is using, without modification, the same processing modules and the core knowledge base as those used for MUC-4. The MURASAKI system processes Spanish- and Japanese-language newspaper and journal articles as well as TV transcripts. This project's domain is the AIDS disease. Thus, the only difference between our MUC-4 system and the MURASAKI system is that the latter uses Spanish and Japanese lexicons, patterns and grammars, and MURASAKI domain-dependent knowledge bases. SOLOMON has also been embedded in several English message understanding systems: ALEXIS (operational) and WARBUCKS.</Paragraph>
  </Section>
</Paper>