<?xml version="1.0" standalone="yes"?>
<Paper uid="H91-1066">
  <Title>LEXICO-SEMANTIC PATTERN MATCHING AS A COMPANION TO PARSING IN TEXT UNDERSTANDING</Title>
  <Section position="5" start_page="0" end_page="337" type="metho">
    <SectionTitle>
SIX PEOPLE WERE KILLED AND FIVE WOUNDED
TODAY IN A BOMB ATTACK THAT DESTROYED
A PEASANT HOME IN THE TOWN OF QUINCHIA,
ABOUT 300 KM WEST OF BOGOTA, IN THE
COFFEE-GROWING DEPARTMENT OF RISARALDA,
QUINCHIA MAYOR SAUL BOTERO HAS
</SectionTitle>
    <Paragraph position="0"> REPORTED. (41 words) Segmented text: \[SiX PEOPLE\] \[h: WERE KILLED\] AND FIVE \[A: WOUNDED\] \[TIME: TODAY\] IN \[A: A BOMB ATTACK\] THAT \[h: DESTROYED\] \[i PEASANT HOME\] \[LOCATION: IN THE TOWN OF QUINCHIA\] \[DISTANCE: *COMMA* ABOUT 300 KM WEST OF BOGOTA\] \[LOCATION: *COMMA* IN THE 1MUC-3, the third government-sponsored message understanding evaluation, is in progress. Later in this paper, we will discuss the task and performance on the task.</Paragraph>
  </Section>
  <Section position="6" start_page="337" end_page="337" type="metho">
    <SectionTitle>
COFFEE *HYPHEN* GROWING DEPARTMENT
OF RISARALDA] [SOURCE: *COMMA* QUINCHIA
MAYOR SAUL BOTERO HAS REPORTED] *PERIOD*
</SectionTitle>
    <Paragraph position="0"> The label A in some segments indicates that those segments are template activators for a single event (single events are generally the default for multiple references within a sentence, unless there is a specific contextual cue such as a shift of time or location).</Paragraph>
    <Paragraph position="1"> The other labels are names of possible roles in templates. As is typical in news stories, roles can be shared (like time or location) or can apply to a single sub-event (like the number killed and wounded).</Paragraph>
    <Paragraph position="2"> By grouping and labeling portions of text early, the program greatly reduces the amount of real parsing that must be done, eliminates many failed parses, and provides template-filllng information that helps with later processing. For example, the phrase IN THE TOWN OF QUINCHIA is at least five ways ambiguous--it could modify A PEASANT HOME, DESTROYED, A BOMB ATTACK, WOUNDED, or WERE KILLED AND FIVE \[WERE\] WOUNDED. However, all five of these possibilities have the same effect on the final templates produced, so the program can defer any decisions about how to parse these phrases until after it has determined that the killing, wounding, attacking, and destruction are all part of the same event. Since these choices combine with the ambiguity of other phrases, the parsing process would otherwise be needlessly combinatoric. In fact, parsing contributes nothing after A PEASANT HOME, so this sentence can be processed as a 16-word example with some extra modifiers.</Paragraph>
    <Paragraph position="3"> In addition to reducing the combinatorics of modifier attachment, pre-processing helps in resolving false ambiguities that are a matter of style in this sort of text. In this example, the ellipsis in FIVE \[WERE\] WOUNDED would be difficult, except that WOUNDED, llke many transitive verbs, is never used as an active verb without a direct object. The ellipsis is thus detected prior to parsing, to be resolved during parsing rather than as paxt of recovering or detecting a syntactic gap. The early bracketing of the text allows the parser to resolve these complexities and ambiguities without much extra baggage, and without having to wait for a complete verb phrase.</Paragraph>
    <Paragraph position="4"> Pre-processing not only speeds up parsing by avoiding combinatorics; it also improves the accuracy of interpretation, both by avoiding failures and by recognizing phrases and constructions that have specialized meaning or syntactic properties. The next section describes the design of a lexicon-driven pattern matcher that performs this sort of analysis prior to parsing, and the rest of the paper will present several types of examples where pre-processing serves to improve parsing.</Paragraph>
  </Section>
  <Section position="7" start_page="337" end_page="339" type="metho">
    <SectionTitle>
LEXICO-SEMANTIC PATTERN
MATCHING
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="337" end_page="339" type="sub_section">
      <SectionTitle>
The Pattern Language
</SectionTitle>
      <Paragraph position="0"> Because the pattern matcher is designed as an efficient &amp;quot;trigger&amp;quot; mechanism and an aid in parsing, the patterns are mostly simple combinations of lexical categories. The patterns largely adopt the language of regular expressions, including the following  terms and operators: * Lexical features that can be tested in a pattern: - token &amp;quot;name&amp;quot; (e.g. &amp;quot;AK-4T') - lexical category (e.g. &amp;quot;adj&amp;quot;) - root (e.g. &amp;quot;shoot&amp;quot;) - conceptual category (e.g. &amp;quot;human&amp;quot;) * Logical combination of lexical feature tests - OR, AND , and NOT  For the MUC-3 corpus, the knowledge base of patterns thus far contains about 150 rules, where each rule contains a pattern with an action (such as tagging, bracketing, deleting, adding, or otherwise enhancing the &amp;quot;tokenized&amp;quot; input to help the parser). The rules range from mundane combinations of words to intricate stylistic expressions. Below, we will go through some examples of some of these rules, and the next section will characterize their capabilities in more general terms. This is work in progress, so we will discuss both the current implementation and the directions for further work.</Paragraph>
      <Paragraph position="1"> The strategy for pre-processing, as with parsing, is to process the text in stages, starting with coarse topic analysis and filtering, then moving on to tagging, segmentation, and template activation. Among the useful side benefits of the pattern matcher is that it discards portions of text that do not activate (or support) any templates. In MUC-3, this process eliminates about 75% of the input. On the first test set, the program did not skip any texts that contained relevant templates.</Paragraph>
      <Paragraph position="2"> Because of this multi-stage design, the first stage of pattern matching contains the simplest patterns, and these include mostly expanded morphological forms, to avoid even the morphological analysis of large portions of irrelevant text. Below are three examples of these activator rules:  In addition to providing a rough screen of the input, these coarse template activation patterns &amp;quot;mark up&amp;quot; the text. Variable assignments effectively tag portions of text to help the parser. For example, the PIVOT tag tells the parser to favor a particular lexical term for the head of linguistic attachments, and the 0BJ tag tells the semantic interpreter to try to fill a conceptual object role for a constituent. Since these patterns perform only the crudest form of linguistic analysis, their purpose is not to replace parsing but to allow the parser to focus its processing and not &amp;quot;prune off&amp;quot; paths that are likely to be critical.</Paragraph>
      <Paragraph position="3"> Rule 11 above handles inputs such as The attack left 9 people dead. Rule 40 handles, for example, The dynamite charge partially destroyed the bank facilities.</Paragraph>
      <Paragraph position="4"> The macros on the right hand sides of rules, such as markactivator, generally use the results of the pattern match, including variable assignments, along with some other constants, such as murder and d-vp, to tag and segment the text. Template activation tags, like murder, allow the semantic interpreter to frill slots and apply constraints from the appropriate template during parsing. Grammatical tags, like d-vp (the double-object verb phrase, including adjectival complements) give a preferred parse, so the parser can try to favor a parse consistent with the lexlcosemantic pattern.</Paragraph>
      <Paragraph position="5"> The second set of rules, after the initial filtering and triggering, performs the cleanup of the input text, including many names, dates, punctuation, and marking of locative and temporal phrases. These rules can be somewhat more involved, as in the following examples:</Paragraph>
      <Paragraph position="7"> Rule 97 helps to distinguish appositive phrases from fists, relative clauses, and other constructions with internal punctuation. The parser handles many punctuated forms using grammar rules or meta-rules, but these can qnicldy get out of control. A simple example is He is in charge of the investigations of the deaths of Guillermo Cano, director of the newspaper El Espectador, and Jaime Pardo Leal, the president of the Patriotic Union.</Paragraph>
      <Paragraph position="8"> Rule 113 catches many locative expressions.</Paragraph>
      <Paragraph position="9"> The most complex patterns perform tagging and segmentation of grammatical constructions. While these are probably the most interesting and promising for the general control of parsing, we have only begun to encode them. The following are two examples:  injury, as in Six people were killed and five wounded, and rule 127 segments examples where the verb leave is used to express death and injury, as in left 6 people dead. These rules often overlap, as rule 128 overlaps with rule 11. The motivation for this is that rule 11 simply spots certain cases where left is used to express death (a very smaU percentage of occurrences of left), while the more powerful rule, 128, tries to segment the objects and effects. The Algorithm When the system loads the pattern-activation rules, it indexes each pattern by the lexical features (i.e. the words, lexical categories, roots and concepts) of each of its constituents, distinguishing those that require lexical analysis from the word-only rules. At rim-time, the pattern matcher performs the following four operations:  1. It examines each input token (only) once for any features that index pattern tests.</Paragraph>
      <Paragraph position="10"> 2. Each satisfied pattern test &amp;quot;triggers&amp;quot; its enveloping rule. The satisfied pattern tests are cached so subsequent occurrences of the same input token avoid the feature examination. null 3. After all input tokens have been examined, the program matches all triggered rules (those that have all of their non-optional tests satisfied) against the input. The matching uses a best-first search algorithm, where the &amp;quot;best&amp;quot; match is one that uses the most pattern constituents and the most  input tokens. This matching process is implemented as a OQOELI, 45 .... table traversal.</Paragraph>
      <Paragraph position="11"> 4. The system executes the actions of all matched rules.</Paragraph>
      <Paragraph position="12"> We now turn to how this simple form of pre-processing helps parsing and how it is likely to influence future advances in text interpretation.</Paragraph>
    </Section>
  </Section>
  <Section position="8" start_page="339" end_page="339" type="metho">
    <SectionTitle>
FEATURES OF PRE-PROCESSING
</SectionTitle>
    <Paragraph position="0"> This section gives some examples from news stories of the places where pattern matching eliminates or assists with work typically left for parsing. Pushing these tasks into this pre-processing phase with a less computation-intensive mechanism speeds up language analysis, reduces the complexity of the input texts, allows for modularity between topic analysis and data extraction, and increases the accuracy of the resulting analysis.</Paragraph>
    <Paragraph position="1"> Pattern matching performs the following tasks:  1. Name recognition and reduction: Person names may contain long and complex titles and appositives, as in the following examples:</Paragraph>
  </Section>
  <Section position="9" start_page="339" end_page="339" type="metho">
    <SectionTitle>
FORMER PERUVIAN DEFENSE MINISTER
GENERAL ENRIQUE LOPEZ ALBUJAR
MARIO SOLORZANO MARTINEZ, LEADER OF
GUATEMALA'S DEMOCRATIC SOCIALIST
</SectionTitle>
    <Paragraph position="0"> PARTY,...</Paragraph>
    <Paragraph position="1"> We recognize these constructs with the pattern matcher, using patterns that contain variables for first names and variables for titles.</Paragraph>
    <Paragraph position="2"> 2. Spatial phrase recognition and reduction: Pre-processing can easily identify and compress many locatives, using patterns that look for combinations of spatial prepositions with known locations, as in the following:</Paragraph>
  </Section>
  <Section position="10" start_page="339" end_page="339" type="metho">
    <SectionTitle>
IN THE TOWN OF QUINCHIA, ABOUT 300 KM
WEST OF BOGOTA, IN THE COFFEE-GROWING
DEPARTMENT OF RISARALDA ...
</SectionTitle>
    <Paragraph position="0"> 3. Temporal phrase recognition and reduction: The pattern matcher picks out many temporal adverbial phrases, such as: II THE PAST FEW HOURS MORE TilhB 3 MOBTIIS AGO.</Paragraph>
    <Paragraph position="1"> 4. &amp;quot;Cleanup&amp;quot; of news style text: Patterns capture and help interpret style-specific constructions, as in the following examples: null WITH LICEBSE PLATE UF-2171 5. Tagging and segmentation: Some complex constructs, especially ellipsis and conjunction, are easier to identify and parse with identification at pre-processing, for example, the construct &lt; event&gt;... &lt; verb-leave&gt;... Y &lt; heal~h-atate&gt; and...Z &lt;health_state&gt; in the following sentence: THE UPRISING, WHICH BEGAN AT 1100 (1700 GMT) ON 26 MARCH AND WHICH INCLUDES DEMANDS FOR BETTER JAIL CONDITIONS, HAS LEFT AT LEAST 12 DEAD AND SOME 20 INJURED, ACCORDING TO POLICE SPOKESMEN. null 6. Topic analysis and filtering: Patterns for topical keywords  and phrases help to perform topic analysis and filtering of stories. For example, stories containing patterns like &lt;attack&gt; ...&lt;civilian&gt; are likely to be about terrorist attacks. This type of relevance determination is useful at the paragraph and sentence level as well. Eliminating irrelevant sentences saves the language analysis programs from having to parse them, and often avoids &amp;quot;false positives&amp;quot; by eliminating background information from interpretation, as in the following paragraph:</Paragraph>
  </Section>
  <Section position="11" start_page="339" end_page="339" type="metho">
    <SectionTitle>
ALL OF THESE CHARACTERISTICS MAKE HONDURAS
A DEMOCRACY, AND EVERY SECTOR OF HONDURAN
SOCIETY SHOULD STRIVE TO STRENGTHEN THEM BY
AVOIDING VIOLENCE AT ALL COSTS, OBEYING THE LAW,
AND CONDEMNING AND ATTACKING EXTREME TERRORIST
GROUPS REGARDLESS OF THEIR AFFILIATION, THUS
ENABLING US TO CONSOLIDATE OUR DEMOCRACY AND
REACH THE LEVEL OF POLITICAL DEVELOPMENT ATTAINED
BY OTHER DEMOCRATIC COUNTRIES.
</SectionTitle>
    <Paragraph position="0"/>
    <Paragraph position="1"> These six examples illustrate some of the places where fairly well-understood techniques from Artificial Intelligence, combined with a large lexical and conceptual hierarchy, are very useful in analyzing texts for natural language data extraction. In some cases, such as topic analysis, the pattern matcher operates as a separable component from the rest of the text processing system; in others, Hke syntactic segmentation and spatial/temporal reduction, it is more closely coupled with the parser. This approach has many of the advantages of phrasal parsing, such as robust coverage of a range of grammaticM constructions, the elimination of grammatical complexity, and the easy adaptation of the system to handle sublanguage constructs. But it retains the advantages of parsing for handling agreement, attachment, and semantic interpretation of the text.</Paragraph>
    <Paragraph position="2"> The next section compares this style of processing with earlier work in phrasal parsing.</Paragraph>
  </Section>
  <Section position="12" start_page="339" end_page="340" type="metho">
    <SectionTitle>
PRE-PROCESSING AND PHRASAL
PARSING
</SectionTitle>
    <Paragraph position="0"> We have pointed out some of the problems with traditional left-to-right single pass parsing methods, including the lack of influence of global context on local interpretation, the complexity  of long sentences, stylized constructs, and garden paths. These well-known symptoms of syntactic parsing point to two general means of improving control---sublanguage analysis \[2\] and domain-driven or conceptual analysis \[3\]. Roughly speaking, this means that the system must constrain the input through either linguistic or semantic methods wherever possible. Pattern matching is an effective vehicle to enforce such constraints.</Paragraph>
    <Paragraph position="1"> In some ways, the use of pattern matching for pre-processing is reminiscent of phrasal styles of language analysis \[4, 5, 6\], which in turn derive in part from semantic grammars \[7\]. These controlled styles of parsing were especially useful for engineering applications in limited domains, where it is much easier to cover the range of meaningful expressions and their interpretations than to control left-to-right parsing and subsequent semantic interpretation. However, phrasal analysis, like syntax-first parsing, tried to treat most of language interpretation within a single-pass, single-strategy process. This confined the approach to fairly simple applications and made it difficult to port from one domain to another.</Paragraph>
    <Paragraph position="2"> In addition to this brittleness and limited scope of phrasal parsing, the phrasal approach suffered from a more fundamental problem: treating phrases or constructions as a replacement for grammatical rules seemed to miss the point of grammar entirely, leaving no place to account for most of the regularities of language. Even most of the rigid constructions and &amp;quot;idioms&amp;quot; of a language (like riddled with bullets) are grammatical. Thus the encoding of most of the knowledge about these expressions was really redundant, forcing the phrasal analyzer to apply interpretation rules and enforce constraints that easily could have been expressed in more general terms. This causes problems both in developing broad coverage and in applying automated methods of acquiring phrasal knowledge.</Paragraph>
    <Paragraph position="3"> Lexico-semantic pre-processing, by introducing domain constraints and linguistic constrncts prior to processing, controls parsing through two vehicles: (1) Triggering grammatical constrncts prior to parsing allows the parser to apply the same grammatical knowledge to many different types of input without being constantly led to garden paths or false interpretations, and (2} Using filtering and template activation to capture some domain knowledge prior to parsing allows the parser to direct attachment and pruning toward the production of relations that affect the domain result. This sort of multi-stage analysis seems to be the right style for accomplishing the directed processing of the phrasal and sublanguage approaches while allowing for the breadth and portability that current text processing applications require.</Paragraph>
  </Section>
  <Section position="13" start_page="340" end_page="341" type="metho">
    <SectionTitle>
CURRENT STATUS AND FUTURE
ENHANCEMENTS
</SectionTitle>
    <Paragraph position="0"> Our system, known as the GE NLToolset \[8\], is one of the more complete and mature text interpretation programs, having developed from a substantial research thrust into several applications outside of the research laboratory. Like other researchers in text interpretation, we have come to evaluate this sort of work in part through the system's performance on government-sponsored benchmark evaluations.</Paragraph>
    <Paragraph position="1"> The second message understanding evaluation conference, in 1989, known as MUCK-II \[9\], used a corpus of slightly over 100 naval operations reports, with a final test on 5 such messages.</Paragraph>
    <Paragraph position="2"> The current MUC-3 development corpus contains 1300 open-source foreign news stories, with a final test on 100 such stories. The total corpus for MUC-3 is about 400,000 words, compared with about 3200 for MUCK-II. The MUC-3 task also requires both broader and deeper analysis of the texts, with an unconstrained range of responses. For example, the example sentence about the bombing in Risaralda would produce the template illustrated in Figure I.</Paragraph>
    <Paragraph position="3"> The MUC-3 evaluation scores each system on its ability to match a &amp;quot;correct&amp;quot; set of over 100 filled templates on 100 news stories.</Paragraph>
    <Paragraph position="4"> The scale-up over the two years from MUCK-II to MUG-3 has strained parsing systems in throughput, coverage, and accuracy, and pre-processing has been essential to all three. Our system throughput is now an order of magnitude greater in words per minute (about 1000/minnte) than in MUCK-II, the coverage is orders of magnitude greater, and the accuracy is about the same as at this point in MUCK-II, in spite of a harsher scoring system. Improvements in the grammar, lexicon, preference module, and recovery strategies have helped in this advance. However, large improvements in parsing are hard to come by, hence the incremental contribution of pre-processing is disproportionate, given the simplicity of the algorithm and rules.</Paragraph>
    <Paragraph position="5"> Two major challenges remain in integrating the pattern matcher more effectively with the parser, and both should be accomplished, at least in part, before the end of MUC-3. We view the apparent success of simple pattern matching methods not as a replacement for real parsing, but rather as an example of how much work is involved in controlling parsing of texts. The current coupling of the parser with the pattern matcher is not sufficiently fluid to take advantage of much of the information that the pattern matcher can produce, leaving room for further integration.</Paragraph>
    <Paragraph position="6"> The first apparent challenge is to tie the linguistic patterns, where appropriate, to &amp;quot;top-down&amp;quot; domain knowledge. In many cases, the common expressions, forms, and preferences derive from conceptual relationships in the domain; for example, the leave dead expressions are part of a general class of descriptions that follow events with the effects of events. For efficiency, the pattern matcher must recognize these descriptions at the lexical level, but there is no reason why domain knowledge cannot help to collect and create such lexical patterns.</Paragraph>
    <Paragraph position="7"> The second challenge is to use the results of pattern matching more for parser preferences, using a strategy we call relation-driven control. This strategy looks for attachments of phrases to pivots which appear at the head of template activators. We have already implemented relation-driven control as a means of recovering from failed parses, but much of the task of nsing pivots and brackets to guide preferences remains.</Paragraph>
    <Paragraph position="8"> In addition to these two challenges, another task, which is more difficult than it would seem, is to combine pattern matching with other, more syntactic methods of pre-processing, such as stochastic analysis or finite-state recognition of constituents.</Paragraph>
  </Section>
class="xml-element"></Paper>