<?xml version="1.0" standalone="yes"?>
<Paper uid="A92-1026">
  <Title>Robust Processing of Real-World Natural-Language Texts</Title>
  <Section position="3" start_page="186" end_page="189" type="metho">
    <SectionTitle>
2 Syntactic Analysis
</SectionTitle>
    <Paragraph position="0"> Robust syntactic analysis requires a very large grammar and means for dealing with sentences that do not  parse, whether because they fall outside the coverage of the grammar or because they are too long for the parser. The gral-nnaar used in TACITUS is that of the I)IAI~OCIC system, deweloped in 1980-81 essentially by constructing the union of the Linguistic String Project G'ranmmr (Sager, 1981) and tile DIAGP~AM grammar (Robinson, 1982) which grew out of SRI's Speech Un&amp;:rst.anding System research in the 1970s. Since that t.imc il. has been consid~'l'ably enhanced. It consists of about 160 phrase structure rules. Associated with each rule is a &amp;quot;constructor&amp;quot; expressing the constraints on the applicability of that rule, and a &amp;quot;translator&amp;quot; for producing the logical form.</Paragraph>
    <Paragraph position="1"> The grannnar is comprehensive and includes subcategorization, sentential complements, adverbials, relative clauses, complex determiners, the most common varieties of conjnnction and comparisou, selectional constraints, some coreference resolution, and the most common sentence fra.gments. The parses are ordered according to heuristics encoded in the grammar.</Paragraph>
    <Paragraph position="2"> The parse tree is translated into a logical representation of the nleaning of the sentence, encoding predicate-argument relations and grammatical subordination relations. In addition, it regularizes to some extent the role assignments in the predicate-argument structure, and handles argnments inherited from control verbs.</Paragraph>
    <Paragraph position="3"> Our lexicon includes about 20,000 entries, including about 2000 personal names and about 2000 location, organization, or other names. This number does not include morphological variants, which are handled in a separate naorphological analyzer. (In addition, there are special procedures for handling unknown words, including unknown names, described in Hobbs et al., 1991.) The syntactic analysis component was remarkably successful in the MUC,-3 evaluation. This was due primarily to three innovations.</Paragraph>
    <Paragraph position="4">  Each of these techniques will be described in turn, with statistics on their i)erformance in the MUC-a evaluation.</Paragraph>
    <Section position="1" start_page="187" end_page="188" type="sub_section">
      <SectionTitle>
2.1 Performance of the Scheduling Parser and
the Grammar
</SectionTitle>
      <Paragraph position="0"> Tile fastest parsing algorithms for context-free grammars make use of prediction based on left context to limit the nnmber of nodes and edges the parser must insert into tim chart. However, if robustness in the face of possibly ungramlnatical input or inadequate grammatical coverage is desired, such algorithms are inappropriate.</Paragraph>
      <Paragraph position="1"> Although the heuristic of choosing tile longest possible substring beginning at the left, that can be parsed as a sentence could be tried (e.g. Grishman and Sterling, 1989), solnetimes, the best fraglnentary analysis of a sentence can only be found by parsing an intermediate or terminal substring that excludes the leftmost words.</Paragraph>
      <Paragraph position="2"> For this reason, we feel that bottom-up parsing without strong constraints based on left context, are required for robust syntactic analysis.</Paragraph>
      <Paragraph position="3"> Bottom-up parsing is favored for its robustness, and this robustness derives from the fact that a bottom-up parser will construct nodes and edges in the chart that a parser with top-down prediction would not. The obvious problem is that these additional nodes do not come without an associated cost. Moore and Dowding (1991) observed a ninefold increase ill time required to parse sentences with a straightforward C, KY parser as opposed to a shift-reduce parser. Prior to November 1990, TACITUS employed a simple, exhaustive, bottom-up parser with the result that sentences of more than 15 to 20 words were impossible to parse in reasonable time. Since the average length of a sentence in the MUC-3 texts is approximately 25 words, such techniqnes were clearly inappropriate for the application.</Paragraph>
      <Paragraph position="4"> We addressed this problem by adding an agenda mechanism to the bottom-up parser, based on Kaplan (1973), as described in Winograd (1983). The purpose of the agenda is to allow us to order nodes (complete constituents) and edges (incomplete constituents) in the chart for further processing. As nodes and edges are built, they are rated according to various criteria for how likely they are to figure in a correct parse. This allows us to schedule which constituents to work with first so that we can pursue only the most likely paths in the search space and find a parse without exhaustively trying all possibilities. The scheduling algorithm is simple: explore the ramifications of the highest scoring constituents first.</Paragraph>
      <Paragraph position="5"> In addition, there is a facility for pruning the search space. The user can set limits on the number of nodes and edges that are allowed to be stored in the chart.</Paragraph>
      <Paragraph position="6"> Nodes are indexed on their atomic grammatical category (i.e., excluding features) and the string position at which they begin. Edges are indexed on their atomic grammatical category and tim string position where they end. The algorithm for pruning is simple: Throw away all but the n highest scoring constituents for each category/string-position pair.</Paragraph>
      <Paragraph position="7"> It has often been pointed out that various standard parsing strategies correspond to various scheduling strategies in an agenda-based parser. However, in practical parsing, what is needed is a scheduling strategy that enables us to pursue only the most likely paths in the search space and to find the correct parse without exhaustively trying all possibilities. The literature has not been as ilhnninating on this issue.</Paragraph>
      <Paragraph position="8"> We designed our parser to score each node and edge on the basis of three criteria: * The length of the substring spanned by the constituent. null * Whether the constituent is a node or an edge, that is, whether the constituent is complete or not.</Paragraph>
      <Paragraph position="9"> * The scores derived from the preference heuristics that have been encoded in DIALOGIC over the years, described and systematized in Hobbs and Bear (1990).</Paragraph>
      <Paragraph position="10"> However, after considerable experimentation with var- null ious weightings, we concluded that tile length and completeness factors failed to improve the performance a.t all over a broad range of sentences. Evidence suggested that a score based on preference factor alone produces the best results. The reason a correct or nearly correct parse is found so often by this method is that these preference heuristics are so effective.</Paragraph>
      <Paragraph position="11"> In the frst 20 messages of the test set., 131 sentences were given to the scheduling parser, after statistically based relevance filtering. A parse was produced for 81 of the 131 sentences, or 62%. Of these, 4:3 (or 33%) were completely correct, and 30 more had three or fewer errors. Thus, 56% of the sentences were parsed correctly or nearly correctly.</Paragraph>
      <Paragraph position="12"> These results naturally vary depending oil the length of the sentences. There were 64 sentences of under 30 naorphemes (where by &amp;quot;morpheme&amp;quot; we mean words plus inflectional affixes). Of these, 37 (58%) had completely correct parses and 48 (75%) had three or fewer errors.</Paragraph>
      <Paragraph position="13"> By contrast, the scheduling parser attempted only 8 sentences of more than 50 morphemes, and only two of these parsed, neither of them even nearly correctly.</Paragraph>
      <Paragraph position="14"> Of the 44 sentences that would not parse, nine were due to problems in lexical entries. Eighteen were due to shortcomings in the grammar, primarily involving adverbial placement and less than fully general treatment of conjunction and comparatives. Six were due to garbled text. The causes of eleven failures to parse have not been determined. These errors are spread out evenly across sentence lengths. In addition, seven sentences of over 30 lnorphemes hit the time limit we had set, and terminal substring parsing, as described below, was invoked.</Paragraph>
      <Paragraph position="15"> A majority of the errors in parsing can be attributed to five or six causes. Two prominent causes are the tendency of the scheduling parser to lose favored close attachments of conjuncts and adjuncts near the end of long sentences, and the tendency to misanalyze the string  again contrary to the grammar's preference heuristics.</Paragraph>
      <Paragraph position="16"> We believe that most of these problems are due to the fact that the work of the scheduling parser is not distributed evenly enough across the different parts of the sentence, and we expect that this difficulty could be solved with relatively little effort.</Paragraph>
      <Paragraph position="17"> Our results in syntactic analysis are quite encouraging since they show that a high proportion of a corpus of long and very complex sentences can be parsed nearly correctly. However, the situation is even better when one considers the results for the best-fragment-sequence heuristic and for terminal substring parsing.</Paragraph>
    </Section>
    <Section position="2" start_page="188" end_page="188" type="sub_section">
      <SectionTitle>
2.2 Recovery from Failed Parses
</SectionTitle>
      <Paragraph position="0"> When a sentence does not parse, we attempt to span it with the longest, best sequence of interpretable fragments. The fragments we look for are main clauses, verh phrases, adverbial phrases, and noun phrases. They are chosen on the basis of length and their preference scores, favoring length over preference score. We do not attempt to find fragments for strings of less than five morphemes.</Paragraph>
      <Paragraph position="1"> The effect of this heuristic is that even for sentences that do not parse, we are able to extract nearly all of the propositional content.</Paragraph>
      <Paragraph position="2"> For example, the sentence The attacks today come afl.er Shining Path attacks during which least 10 buses were burned throughout Lima on 24 Oct.</Paragraph>
      <Paragraph position="3"> did not parse because of the use of &amp;quot;least&amp;quot; instead of &amp;quot;a.t. least&amp;quot;. Hence, the best. Dagment sequence was sought. This consisted of the two fragments &amp;quot;The attacks today come after Shining Path attacks&amp;quot; and &amp;quot;10 buses were burned thronghout Lima on 24 Oct.&amp;quot; The parses for both these fragments were completely correct. Thus, the only information lost was from the three words &amp;quot;during which least&amp;quot;. Frequently such information can be recaptured by the pragmatics component. In this case, the burning would be recognized as a consequence of the attack. null In tile first 20 messages of the test set, a best sequence of fragments was sought for the 44 sentences that did not parse for reasons other than timing. A sequence was found for 41 of these; the other three were too short, with problems in the middle. The average number of fragments in a sequence was two. This means that an average of only one structural relationship was lost. Moreover, the fragments covered 88% of the morphemes. That is, even in the case of failed parses, 88% of the propositional content of the sentences was made available to pragmatics. Frequently the lost propositional content is from a preposed or postposed, temporal or causal adverbial, and the actual temporal or causal relationship is replaced by simple logical conjunction of the fragments.</Paragraph>
      <Paragraph position="4"> In such cases, much useful information is still obtained fl'om the partial results.</Paragraph>
      <Paragraph position="5"> For .37% of the 41 sentences, correct syntactic analyses of the fragments were produced. For 74%, the analyses contained three or fewer errors. Correctness did not correlate with length of sentence.</Paragraph>
      <Paragraph position="6"> These numbers could probably be improved. We favored the longest fragment regardless of preference scores. Thus, frequently a high-scoring main clause was rejected because by tacking a noun onto the front of that fragment and reinterpreting the main clause bizarrely as a relative clause, we could form a low-scoring noun phrase that was one word longer. We therefore plan to experiment with combining length and preference score in a more intelligent manner.</Paragraph>
    </Section>
    <Section position="3" start_page="188" end_page="189" type="sub_section">
      <SectionTitle>
2.3 Terminal Substring Parsing
</SectionTitle>
      <Paragraph position="0"> For sentences of longer than 60 words and for faster, though less accurate, parsing of shorter sentences, we developed a technique we are calling lerminal subsiring parsing. The sentence is segmented into substrings, by breaking it at commas, conjunctions, relative pronouns, and certain instances of the word &amp;quot;that&amp;quot;. The substrings are then parsed, starting with the last one and working back. For each substring, we try either to parse the substring itself as one of several categories or to parse the entire set of substrings parsed so far as one of those categories. The best such structure is selected, and for  subsequent processing, that is the only analysis of that portion of the sentence allowed. The categories that we look for include main, subordinate, and relative clauses, infinitives, w'H) phrases, prepositional phrases, and noun p h rases.</Paragraph>
      <Paragraph position="1"> A simple exalnple is |,lie following, although we do not a.I)ply the technique to sentences or to fragments this short.</Paragraph>
      <Paragraph position="2"> (.h&gt;org(~ \]}US\]l, l.lie president, held a press conferen(:e yesterda.y.</Paragraph>
      <Paragraph position="3"> This sentellc(~ would be segmented a.t the conunas. First '&lt;hehl a. press conference yesterday&amp;quot; would be recognized as a VP. We next try to parse both &lt;&lt;the president&amp;quot; and &amp;quot;the presidellt, VP&amp;quot;. The string &amp;quot;the president, VP&amp;quot; would not be recognized as anything, but &amp;quot;the president&amp;quot; would be recognized as an NP. Finally, we try to parse both &amp;quot;George Bush&amp;quot; and &lt;&lt;George Bush, NP, VP&amp;quot;. &amp;quot;George Bush, NP, VP&amp;quot; is recognized as a sentence with an appositive on t.he subject.</Paragraph>
      <Paragraph position="4"> This algorithm is superior to a more obvious a.lgorithnl we had been considering earlier, llamely, to parse each fragment individually in a left-to-right fashion and then to a.ttempt to piece the fi'agments together. The latter algorithm would have required looking inside all but the last of tile fi'agments for possible attachment points. This problem of recombining parts is in general a difficulty that is faced by parsers thai, produce phrasal rather than sentential parses (e.g., Weischedel et al., 1991).</Paragraph>
      <Paragraph position="5"> ltowever, in terminal substring parsing, this recombining is not, necessary, since the favored analyses of subsequent seginents are already available when a given segment is being parsed.</Paragraph>
      <Paragraph position="6"> The effect of this terminal substring parsing technique is to give only short inputs to the parser, without losing the possibility of getting a single parse for the entire long sentence. Suppose, for exa.lnple, we are parsing a 60-word seni.ence that can be broken into six 10-word segments. At. each stage, we will only be parsing a string of ten to fifteen &amp;quot;words&amp;quot;, the ten words in the segment, phls the nonterminal symbols dominating the favored analyses of the subsequent segments. When parsing the sentence-initial 10-word substring, we are in effect parsing at most a &amp;quot;IS-word&amp;quot; string covering the entire sentence, consisting of the 10 words plus the nontermina.1 symbols covering the best analyses of the other five substrings. In a. sense, rather than parsing one very long sentence, we are parsing six fairly short sentences, thus avoiding the combinatorial explosion.</Paragraph>
      <Paragraph position="7"> Although this algorithm has given us satisfactory resuits in our development work, its nnmbers fl'om the MUC-3 evahiation do not look good. This is not surprising, given that tile technique is called on only when all else has already failed. In tile first 20 messages of the test set, terlninal substring parsing was applied to 14 sentences, ranging fl'om 34 to 81 morphemes in length.</Paragraph>
      <Paragraph position="8"> Only one of these parsed, and that parse was not good.</Paragraph>
      <Paragraph position="9"> However, sequences of fragments were found for the other 1:3 sentences. The average number of fragments was 2.6, and the sequences covered 80% of the morphelnes. None of the fragment sequences was without errors. However, eight of the 13 had three or fewer mistakes. The technique therefore allowed us to make use of much of the information in sentences that have hitherto been beyond the capability of virtually all parsers.</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="189" end_page="191" type="metho">
    <SectionTitle>
3 Robust Pragmatic Interpretation
</SectionTitle>
    <Paragraph position="0"> When a sentence is parsed and given a semantic interpretation, the relationship between this interpretation and the information previously expressed in the text as well as the interpreter's general knowledge must be established. Establishing this relationship comes under tile general heading of pragmatic interpretation. The particular problems that are solved during this step include * Making explicit information that is only implicit in the text. This includes, for example, explicating the relationship underlying a coinpound nominal, or explicating causal consequences of events or states mentioned explicitly ill the text.</Paragraph>
    <Paragraph position="1"> * Determining the implicit entities and relationships referred to metonymically in the text.</Paragraph>
    <Paragraph position="2"> * Resolving anaphoric references and implicit argulnents. null * Viewing the text as an instance of a. schema that makes its various parts coherent.</Paragraph>
    <Paragraph position="3"> TACITUS interprets a sentence pragmatically by proving that its logical form follows fi'om general knowledge and the preceding text, allowing a lninimal set ot assumptions to be made. In addition, it is assuined that the set of events, abstract entities, and physical objects mentioned in the text is to be consistently minimized The best set of assumptions necessary to find such a proof can be regarded as an explanation of its truth, and constitutes the implicit information required to produce the interpretation (Hobbs, Stickel, et al., 1990). Th( minimization of objects and events leads to anaphore resolution by assuming that objects that share properties are identical, when it is consistent to do so. In the MUC-3 domain, explaining a text involves viewing it as an instance of one of a number of explanator) schemas representing terrorist incidents of various type, (e.g. bombing, arson, assassination) or one of severa: event types that are similar to terrorist incidents, bui explicitly excluded by the task requirements (e.g. an exchange of fire between military groups of opposing factions). This means that assumptions that fit into inci. dent schemas are preferred to a.ssun~ptions that do not and the schema that ties together the most assumption= is the best explanation.</Paragraph>
    <Paragraph position="4"> In this text interpretation task, the domain knowledg, performs two primary functions:  1. It relates the propositions expressed in the text t&lt; the elements of the underlying explanatory schemas 2. It enables and restricts possible coreferences fo: anaphora resolution.</Paragraph>
    <Paragraph position="5">  It is clear that nmch domain knowledge may be re quired to perform these functions successfully, but it i~ not necessarily the case that more knowledge is alway better. If axioms are incrementally added to the systen to cover cases not accounted for in the existing domaiJ  theory, it is possiMe that they can interact with the existing knowledge in such a way that the reasoning process becomes computationally intractable, and the unhappy result would be failure to find an interpretation in cases in which the correct interpretation is entailed by the system's knowledge. In a. domain as broad and diffuse as the terrorist domain, it is often impossible to guarantee by inspection that a domain theory is not subject to such combinatorial problems.</Paragraph>
    <Paragraph position="6"> The goal of robustness in interpretation therefore requires one to address two problems: a system must permit a graceful degradation of performance in those cases in which knowledge is incomplete, and it must extract as much information as it can in the face of a possible combinatorial explosion.</Paragraph>
    <Paragraph position="7"> The general approach of abductive text interpretation addresses the first problem through the notion of a &amp;quot;best interpretation.&amp;quot; The best explanation, given incomplete domain knowledge, can succeed at relating some propositions contained in the text to the explanatory schemas, but may not succeed for all propositions. The combinatorial problems are addressed through a particular search strategy for abductive reasoning described as incremental refinement of minimal.informalion proofs.</Paragraph>
    <Paragraph position="8"> The abductive proof procedure as employed by TACITUS (Stickel, 1988) will always be able to find some interpretation of the text. In the worst cause--the absence of any commonseuse knowledge that would be relevant to the interpretation of a sentence--the explanation offered would be found by a.ssunaing each of the literals to be proved. Such a proof is called a &amp;quot;minimal information proof&amp;quot; because no schema recognition or explication of implicit relationships takes place. However, the more knowledge the system has, the more implicit information ca.n be recovered.</Paragraph>
    <Paragraph position="9"> Because a minimal information proof is always available for any sentence of the text that is internally consistent, it provides a starting point for incremental refinement of explanations that can be obtained at next to no cost. TACITUS explores the space of abductive proofs by finding incrementally better explanations for each of the constituent literMs. A search strategy is adopted that finds successive explanations, each of which is better than the minimal information proof. This process can be halted at any time in a state that will provide at least some intermediate results that are useful for subsequent interpretation and template filling.</Paragraph>
    <Paragraph position="10"> Consider the following example taken'fi'om the MUC-3 text corpus: A cargo train running kom Lima to Lorohia was derailed before dawn today after hitting a dynamite charge.</Paragraph>
    <Paragraph position="11"> Inspector Eulogio Flores died in the explosion.</Paragraph>
    <Paragraph position="12"> The correct interpretation of this text requires recovering certain implicit information that relies on common-sense knowledge. The compound nominal phrase &amp;quot;dynamite charge&amp;quot; nmst be interpreted as &amp;quot;charge composed of dynamite.&amp;quot; The interpretation requires knowing that dynamite is a substance, that substances can be related via compound nominal relations to objects composed of those substances, that things composed of dynamite are bombs, that hitting bombs causes them to explode, that exploding causes damage, that derailing is a type of clamage, and that planting a bomb is a terrorist act. The system's commonsense knowledge base must be rich enough to derive each of these conclusions if it is to recognize the event described as a. terrorist act., since all derailings are not the result of' bombings. This example underscores the need for fa.irly extensive world knowledge in the comprehension of text. If the knowledge is missing, the correct interpretation cannot be found.</Paragraph>
    <Paragraph position="13"> However, if there is Inissing knowledge, all is not necessarily lost. If, for example, the knowledge was missing that lilt.ring a boml~ causes it to explode, the sysrein could still hyl.mthesize the relationship between tile charge and tile (lynamite to reason that a bomb was placed. When processing the next sentence, the system may have trouble figuring out tile time and place of Flores's death if it can't associate the explosion with hitting the bomb. However, if the second sentence were &amp;quot;Tile Shining Path claimed that their guerrillas had planted the bomb,&amp;quot; the partial infornm.tion would be sufficient to allow &amp;quot;bomb&amp;quot; to be resolved to dynamite charge, thereby connecting the event described in tile first, sentence with ~che event described ill the second.</Paragraph>
    <Paragraph position="14"> It is difficult to evahmte the pragmatic interpretation component individually, since to a great extent its success depends on the adequacy of the semantic analysis it operates on. Itowew~r, in examiuing the first, 20 messages of the MUC-3 test set. in detail, we attempted to pinpoint the reason for each missing or incorrect entry in the required templates.</Paragraph>
    <Paragraph position="15"> There were 269 sucl~ mistakes, due to problems in 41 sentences. Of these, 124 are attributable to pragmatic interpretation. We have classified their causes into a number of categories, and the results are as follows.</Paragraph>
    <Paragraph position="16">  An example of a missing simple axiom is that &amp;quot;bishop&amp;quot; is a profession. An exa.nlple of a. missing complex theory is one that assigns a default causality relationship to events that are simultaneous at the granularity reported in the text. An underconstrained axiom is one that allows, for examl)le, &amp;quot;damage to the economy&amp;quot; to be taken a.s a terrorist, incident. Unconstrained identity assumptions result from the knowledge base's inability to rule out identity of two different objects with similar properties, thus leading to incorrect anaphora resolution. &amp;quot;Combinatorics&amp;quot; simply means that the theorem-prover timed out, and the nfinimal-information proof strategy was invoked to obtain a. partial interpretation.</Paragraph>
    <Paragraph position="17"> It is difficult to evaluate the precise impact of the robustness strategies outlined here. The robustness is an inherent feature of the overall al)proach, and we did not have a non-robust control to test. it against. However, the implementation of the mhlilnal information proof search  strategy virtually eliminated all of our complete t'a.ilures due to lack of computational resources, and cut the error rate attributable to this cause roughly in half.</Paragraph>
  </Section>
class="xml-element"></Paper>