<?xml version="1.0" standalone="yes"?>
<Paper uid="M91-1006">
  <Title>BBN PLUM: MUC-3 Test Results and Analysis</Title>
  <Section position="2" start_page="0" end_page="54" type="metho">
    <SectionTitle>
KEY SYSTEM FEATURES
</SectionTitle>
    <Paragraph position="0"> Two design features stand out in our minds: fragment processing and statistical language modelling . By fragment processing we mean that the parser and grammar are designed to find analyses for a non-overlappin g sequence of fragments . When cases of permanent, predictable ambiguity arise, such as a prepositional phrase tha t can be attached in multiple ways or most conjoined phrases, the parser finishes the analysis of the current fragment , and begins the analysis of a new fragment. Therefore, the entities mentioned and some relations between them are found in every sentence, whether syntactically ill-formed, complex, novel, or straightforward . Furthermore, this parsing is done using essentially domain-independent syntactic information .</Paragraph>
    <Paragraph position="1"> The second key feature is the use of statistical algorithms to guide processing . Determining the part of speec h of highly ambiguous words is done by well-known Markov modelling techniques . To improve the recognition of Latin American names, we employed a statistically derived five-gram (five letter) model of words of Spanish origi n and a similar five-gram model of English words . This model was integrated into the part-of-speech tagger.</Paragraph>
    <Paragraph position="2">  Another usage of statistical algorithms was an statistical induction algorithm to learn case frames for verbs fro m examples.</Paragraph>
    <Paragraph position="3"> Major system components are shown in a diagram in the system handout. A more detailed description of th e system components, their individual outputs, and their knowledge bases is presented in a companion pape r [Weischedel, et al ., 1991b]. We expect the particular implementations to change and improve substantially during the next two years of research and development .</Paragraph>
    <Paragraph position="4"> After the message header has been processed, each sentence of the text is processed by linguistic components . Morphological processing includes a probabilistic algorithm for labelling both known and unknown words by part o f speech. The most likely alternatives are passed to the MIT Fast Parser (MITFP), a deterministic parser designed to quickly produce analyses of non-overlapping fragments if no complete syntactic analysis can be found [deMarcken , 1990]. 1 The semantic interpreter finds a semantic analysis for the fragments produced by MITFP . Semantic analysis is shallow in that some analysis must be produced for each fragment even thought most of the words in an article hav e no representation in the domain model . (For instance, Jacobs et al. [1991] estimates that 75% of the words in these texts are not relevant.) The semantic interpreter uses structural rules, almost all created after 25 February 1991 . Nearly all of these carry over to all new domains . Domain-dependent, lexical semantic rules contain traditional cas e frame information . The novel aspect here is that the case frames for verbs were hypothesized by a statistical induction algorithm [Weischedel, et al., 1991a]. Each hypothesized case frame was personally reviewed over a tw o day period.</Paragraph>
    <Paragraph position="5"> The discourse component performs three tasks: hypothesizing relevant events from the diverse descriptions , recognizing co-reference, and hypothesizing values for components of an event . The challenges faced by the discourse component are that syntactic relations present in the text and signifying the role of an entity in a hypothesized event are often not found by MITFP, and that reference resolution must be performed with limite d semantic understanding . Given these challenges, it is clear from the test results that the discourse component does reconstruct event structure well, in spite of missing syntactic and semantic relations .</Paragraph>
    <Paragraph position="6"> The template generator has three tasks : finding and/or merging events hypothesized by discourse processin g into a complete template structure, deciding whether to default the value of template slots not found in the even t structure (e.g, using date and location information in the header), and creating the required template forms . A critical component for future work is a fragment combining algorithm . Based on local syntactic and semanti c information, this algorithm combines fragments to provide more complete analyses of the input [Weischedel, et al ., 1991a]. Though the fragment combining algorithm implemented rules for finding conjoined phrases, 2 prepositional phrase attachment3 , appositive recognition, and correcting errors made by MITFP (e .g., combining adjacent fragments into a single noun phrase), there was not time to add all of the structural semantic rules for the resultin g fragments. Therefore, the combining algorithm was not thoroughly evaluated in MUC-3 .</Paragraph>
  </Section>
  <Section position="3" start_page="54" end_page="55" type="metho">
    <SectionTitle>
OFFICIAL RESULTS
</SectionTitle>
    <Paragraph position="0"> There are several alternative ways to run the algorithms, representing alternative degrees of conservative versu s aggressive hypothesis of templates and slot values . Two alternatives produced a noticeably different tradeof f between recall on the one hand, and precision and recall on the other .</Paragraph>
    <Paragraph position="1">  In the &amp;quot;max tradeoff' version, the one which produced the least difference between recall and precision, a template is produced for an event even if the event has no target nor a date identified the text . The output of the scoring program appears in Figure 2. The overall recall is 42% ; precision is 52%; overgeneration is 22%. A more conservative version produces a template only if a date and target can be found . Furthermore, information pertaining to the event (e.g., time, location, instrument, etc .) must be found within one paragraph of th e phrase(s) designating the event . This is particularly interesting, since at a cost of 3 points in recall, a gain of 6 point s in precision and a cut of one third in overgeneration is achieved.</Paragraph>
  </Section>
  <Section position="4" start_page="55" end_page="56" type="metho">
    <SectionTitle>
EFFORT SPENT
</SectionTitle>
    <Paragraph position="0"> We estimate that roughly seven person months went into our effort . At least half of that was creating algorithms and domain-independent software, since we did not have a complete message processing system prior to this effort .</Paragraph>
    <Paragraph position="1"> About 5% of the effort went to additions to the domain-independent lexicon . Therefore the (roughly) 4 months of person effort for our first domain should not have to be repeated for a new domain .</Paragraph>
    <Paragraph position="2"> The remaining 3 months were spent on domain-specific tasks :  A subset of the development set was used more intensively as training data . Approximately 95,000 words o f text (about 20% of the development corpus) was tagged as to part of speech and labelled as to syntactic structure ; that was part of the DARPA-funded TREEBANK project at the University of Pennsylvania . The bracketed text first provided us with a frequency-ranked list of head verbs, head nouns, and nominal compounds . For each of these we  added a pointer to the domain model element that is the most specific super-concept containing all things denoted by the verb, noun, or nominal compound . As mentioned earlier, the TREEBANK data was then used with the lexica l relation to the domain model to hypothesize case frames for verbs .</Paragraph>
    <Paragraph position="3"> Given a sample of text, we annotate each noun, verb, and proper noun in the sample with the semantic class corresponding to it in the domain model . For instance, dawn would be annotated &lt;time&gt;, explode would be &lt;explosion event&gt;, and Yunguyo would be &lt;city&gt; . We estimate that this semantic annotation proceeded at about 9 0 words/hour.</Paragraph>
    <Paragraph position="4"> From a single example parse tree in TREEBANK, one can clearly infer that bombs can explode, or mor e properly, that bomb can be the logical subject of explode, that at dawn can modify explode, etc. Naturally, good generalizations based on the instances are critical, rather than the instances themselves .</Paragraph>
    <Paragraph position="5"> Since we have a hierarchical domain model, and since the manual semantic annotation states the relationshi p between lexical items and concepts in the domain model, we used the domain model hierarchy as a given set of categories for generalization. However, the critical issue is selecting the right level of generalization given the set o f examples in the supervised training set.</Paragraph>
    <Paragraph position="6"> We extended and generalized a known statistical procedure (Katz, 1987) that selects the minimum level o f generalization such that there is sufficient data in the training set to support discrimination of cases of attachin g phrases (arguments) to their head. This is detailed in [Weischedel, Meteer, Schwartz, 1991].</Paragraph>
    <Paragraph position="7"> The automatically hypothesized verb case frames were then reviewed manually and added to the lexicon . Lastly, the first one hundred messages of the development corpus were used for detailed system debugging , while the 100 messages of TST1 were used as a test set to measure our progress at least once a week . Throughout, we only looked at the summary output from the scoring procedure, rather than adding to the lexicon based on TST 1 or debugging the system based on particular messages.</Paragraph>
    <Paragraph position="8"> Our performance on the hundred messages TST1 is shown in Figure 4 .</Paragraph>
  </Section>
class="xml-element"></Paper>