<?xml version="1.0" standalone="yes"?>
<Paper uid="A88-1018">
  <Title>INTEGRATING TOP-DOWN AND BOTTOM-UP STRATEGIES IN A TEXT PROCESSING SYSTEM</Title>
  <Section position="4" start_page="0" end_page="129" type="metho">
    <SectionTitle>
THE SCISOR SYSTEM
</SectionTitle>
    <Paragraph position="0"> The SCISOR system is currently being tested with newspaper stories about corporate takeovers. The domain provides interesting subject matter as well as oome rich language. The gradual development of the stories over time motivates a natural language approach, while the restricted nature of the material allows us to encode conceptual expectations necessary for top-down processing.</Paragraph>
    <Paragraph position="1"> The following is an example of the operation of SCISOR on a simple news story: W ACOUISITION UPS BID FOE WARNACO Warnaco received another merger offer, valued at $36 a share, or $360 million. The buyout offer for ~he apparel maker was made by ~he W Acquisition Corporation of Delaware.</Paragraph>
    <Paragraph position="2"> User: Who took over Warnaco? System: W Acquisition offered $36 per share for Warnaco. User: What happened to Warnaco last Tuesday? System: Warnaco rose 2 1/2 as a result of rumors.</Paragraph>
    <Paragraph position="3">  The system has been demonstrated with a small set of input texts, and is being expanded to handle large numbers of newswire stories using a large domain knowledge base and substantial vocabulary.</Paragraph>
  </Section>
  <Section position="5" start_page="129" end_page="129" type="metho">
    <SectionTitle>
SOURCES OF INFORMATION
</SectionTitle>
    <Paragraph position="0"> Text processing in SCISOR is accomplished through the integration of sources of knowledge and types of processing.</Paragraph>
    <Paragraph position="1"> The four sources of knowledge that SCISOR uses to extract meaning from text are as follows: A. Role-filler Expectations: Constraints on what can fill a conceptual role are the primary source of information used in top-down processing.</Paragraph>
    <Paragraph position="2"> B. Event Expectations: Expectations about events that may occur in the future are created from previous stories, and used to predict values in the expected events if they occur.</Paragraph>
    <Paragraph position="3"> C. Linguistic: Grammatical, lexical and phrasal knowledge is used whenever it is available and reliable. Sub-language (domain-specific) linguistic information may also be used if available.</Paragraph>
    <Paragraph position="4"> D. World Knowledge Expectations: World knowledge expectations can disambiguate multiple in- . terpretations through domain-specific heuristics.</Paragraph>
    <Paragraph position="5"> SCISOR can operate with any combination of these information sources. When one or more sources are lacking, the information extracted from the texts may be more superficial, or less reliable. The flexibility in depth of processing provided by these multiple information sources is an interesting feature in its own right, in addition to forming the foundations for a system to &amp;quot;skim&amp;quot; efficiently when a new text contains material already processed.</Paragraph>
    <Paragraph position="6"> As an example of each source of information, consider the following segment from the text printed previously: Warnaco received another merger offer, valued at $36 a share, or $360 million.</Paragraph>
    <Paragraph position="7"> Role-filler expectations allow SCISOR to make reliable interpretations of the dollar figures in spite of incomplete lexical knowledge of the syntactic roles they play in the sentence. This is accomplished because prices of stock are constrained to be &amp;quot;small&amp;quot; numbers, whereas fillers of takeover-bid value roles are constrained to be &amp;quot;large&amp;quot; quantities. Event expectations lead to the deeper interpretation that this offer is an increase over a previous offer because one expects some kind of rebuttal to an offer to occur in the future. An increased offer is one such rebuttal. World knowledge might allow the system to predict whether the offer was an increase or a competing offer, depending on what other information was available.</Paragraph>
    <Paragraph position="8"> A unique feature of SCISOR is that partial linguistic knowledge contributes to all of these interpretations, and to the understanding of &amp;quot;received&amp;quot; in this context. This is noteworthy because general knowledge about &amp;quot;receive&amp;quot; in this case interacts with domain knowledge in understanding the role of Warnaco in the offer.</Paragraph>
    <Paragraph position="9"> A robust parser and semantic interpreter could obtain these features from the texts without the use of expectations. This would make top-down processing unnecessary. Robust in-depth analysis of texts, however, is beyond the near-term capabilities of natural language technology; thus SCISOR is designed with the understanding that there will always be gaps in the system's knowledge base that must be dealt with gracefully.</Paragraph>
    <Paragraph position="10"> Now the four sources of information used to extract information are described in more detail, followed by a discussion of how they interact in the processing of two sample texts.</Paragraph>
  </Section>
  <Section position="6" start_page="129" end_page="129" type="metho">
    <SectionTitle>
A. Role-filler Expectations
</SectionTitle>
    <Paragraph position="0"> The simplest kind of expectation-driven information that can be used is termed &amp;quot;role-filler&amp;quot; expectations. These expectations take the form of constraints on the filler of a conceptual role. This is the primary source of processing power in expectation-driven systems such as FRUMP \[De-Jong, 1979\]. The following list illustrates some examples of constraints on certain fillers of roles used in the corporate takeover domain.</Paragraph>
  </Section>
  <Section position="7" start_page="129" end_page="131" type="metho">
    <SectionTitle>
ROLE FILLER-CONSTRAINT EXAMPLE
</SectionTitle>
    <Paragraph position="0"> target company-agent ACE suil;or company-agent ACHE price-per-share small number $45 total value large number $46 million This information is encoded declaratively in the knowledge base of the system. During the processing of a text, roles may be filled with more than one hypothesis; however, as soon as a filler for a role is certain, the process of elimination is used to aid in the resolution of other bindings. Thus, if SCISOR determines that ACE is a takeover target, it will assume by default that ACHE is the suitor if the two companies appear in the same story and there is no additional information to aid in the disambiguation.</Paragraph>
    <Paragraph position="1"> B. Event Expectations Expectations that certain events will occur in the future are a second source of information available to aid in the interpretation of new events. These expectations arise from the events in previous stories. For example, when the system reads that rumors have been swirling around ACE as a takeover target, an event expectation is set up that anticipates an offer for ACE in some future story. When an offer has been made, expectations are set up that some kind of rebuttal will take place. This rebuttal may be a rejection or an acceptance of the offer. The acceptance of the offer option carries with it the event expectation that the total value of the takeover will be the amount of the offer.</Paragraph>
    <Paragraph position="2"> Event expectations are implemented as domaindependent, declarative properties of the events in the do- null main. They are derived from the script-like \[Schank and Abelson, 1977\] representations of typical event sequences. C. Linguistic Analysis The most important source of information used in text processing is a full bottom-up parser. TRUMP is a flexible language analyzer consisting of a syntactic processor and semantic interpreter \[Jacobs, 1986, Jacobs, 1987a\]. The system is designed to fill conceptual roles using linguistic, conceptual, and metaphorical relationships distributed in a knowledge hierarchy.</Paragraph>
    <Paragraph position="3"> Within SCISOR, TRUMP identifies linguistic relationships in the input, using lexical and syntactic knowledge. Knowledge structures produced by TRUMP are passed through an interface to the top-down processing components. Pieces of knowledge structures may then be tied together with the expectation-driven processing components. null In the case of a complete parse of an input sentence, the knowledge structures produced by TRUMP contain most of the structure of the final interpretation, although expectations often further refine the analysis. In the case of partial parses, more of the structure is determined by role-filler expectations. The following are two simple examples of this division of labor:  In the first example above, TRUMP succeeds in producing a complete syntactic parse, along with the corresponding semantic interpretation. The domain knowledge helps only to specify the verb sense of &amp;quot;offer&amp;quot;. In the second example, however, more of the work is done by the domain-dependent expectations. In this case, the unknown words prevent TRUMP from completing the parse, so the output from the parser is a set of linguistic relations. These relations allow the semantic interpreter to produce some superficial conceptual structures, but the final conceptual roles are filled using domain knowledge.</Paragraph>
    <Paragraph position="4"> The distinction between the general offer and the more specific corp-takeover-offer is essential for understanding texts of this type. In general, an offer may be made for something to someone, but it is only in the corporate takeover domain that the target of the takeover (the for role) is by default the same as the recipient of the offer (the to role). Since TRUMP is a domain-independent analyzer, it cannot itself fill such roles appropriately. The knowledge sources at work in SCISOR and the timing of the information exchange in the system are described in the next section.</Paragraph>
    <Paragraph position="5"> D. World Knowledge Expectations If all the above sources of information are still insufficient to determine or satisfactorily disambiguate potential relationships between items in the text, so called &amp;quot;world knowledge&amp;quot; can be called into play. This world knowledge takes the form of domain-dependent generalizations, implemented as declarative relationships between concepts. For example, in the corporate takeover domain, a piece of world knowledge that can aid in the determination of what company is taking over what company is the following:  If it is ambiguous whether: Company A is taking over Company B or Company B is taking over Company A Choose the larger company &amp;quot;co be the suitor and the smaller company to be the target This example uses the knowledge that it is almost always the case that the suitor is larger than the target company. The utilization of this generalization (that typically larger companies take over smaller companies) requires knowledge of the company sizes, assumed to be present in the knowledge base of the system. Another example is: If it is ambiguous whether: value A is a previous offer or present stock price and value B is a new offer or vice versa, Choose the larger offer for the nee offer or present stock price, and the smaller offer for the previous offer In this case, a company rarely would decrease their offer unless something unexpected happened to the target company before the takeover was completed. Similarly, an offer is almost always for more than the current value of the stock on the stock market.</Paragraph>
    <Paragraph position="6"> These. heuristics incorporate expectations that arise from potentially complex explanations. For example, the reason why a new offer is higher than an old offer rests on a complex understanding of the plan of the suitor to reach their goal of taking over the target company. The world knowledge presented here represents a compilation of this complex reasoning into simple heuristics for text understanding, albeit ad hoc.</Paragraph>
    <Paragraph position="7"> Although this type of information is shown in a rule-like form, it is implemented with special relationship links that contain information as to how to compute the truth value of the relationship. When this type of knowledge is needed to disambiguate an input, the system checks if any objects have these &amp;quot;world knowledge constraints&amp;quot;. If so, they are activated and applied to the situation under consideration.</Paragraph>
    <Paragraph position="8"> The intuition underlying the inclusion of heuristics of this sort is that there is a great deal of &amp;quot;common sense&amp;quot; information that can increase an understanding mechanism's ability to extract meaning. This type of information is a last resort for determining conceptual relations when other more principled sources of information are exhausted.</Paragraph>
  </Section>
  <Section position="8" start_page="131" end_page="132" type="metho">
    <SectionTitle>
KNOWLEDGE INTEGRATION
</SectionTitle>
    <Paragraph position="0"> Each of the four sources of information described above is utilized at different points in the processing of the input text, and with different degrees of confidence. The following algorithm describes a particular instantiation of this order for a hypothetical event sequence involving rumors about ACE, followed by an offer by ACME for ACE.</Paragraph>
    <Paragraph position="1"> In general, event expectations are set up as soon as an event that has an expectation property is detected. That is, as soon as the system sees a rumor, it sets up an expectation that there will be an offer for the company the rumor was about sometime in the future.</Paragraph>
    <Paragraph position="2"> When that event-expectation is confirmed, those expectations are realized and the information expected is added to the meaning extracted from the text being processed. Note that these realized expectations may later be retracted given additional information. Role-filler expectations then create multiple hypotheses about which items may fill what roles. These are narrowed down by any constraints already present by event expectations.</Paragraph>
    <Paragraph position="3"> Linguistic analysis, when it provides a complete final meaning representation for a portion of the text containing features of interest, always supercedes a conflicting r~lefiller expectation. For example, if a role-filler expectation hypothesized that ACE was the target in a takeover, and the parser determined that ACME was the object of the takeover, ACME alone would be included as the target.</Paragraph>
    <Paragraph position="4"> World knowledge expectations are invoked only in the case of conflicting or ambiguous interpretations. For example, if after all the processing is finished and the system does not know whether ACE is taking over ACME or vice versa, the expectation that the larger company is typically the suitor is invoked and used in the final disambiguation.</Paragraph>
    <Paragraph position="5"> Below are the sample input texts, followed by the sequence of steps that are taken by the program.</Paragraph>
    <Paragraph position="6"> ACE, an apparel maker pl~ning a leveraged buyou~, rose $2 I/2 to $3S 3/8, as a rumor spread that another buyer might appear. The company said there were no corporate developments to account for the rise, and the rumor could not be confirmed.</Paragraph>
    <Paragraph position="7"> later on ACE received another merger offer, valued at $36 a share, or $360 million. The buyout offer for the apparel maker was made by the  ACME Corporation of Delaware. ACE closed yesterday at $3S 3/8.</Paragraph>
    <Paragraph position="8"> 1. System reads first story and extracts information that there are rumors about ACE and that the stock price is currently $35 3/8, using role-filler expectations.</Paragraph>
    <Paragraph position="9"> 2. An event expectation is set up that there will be an offer-event, with ACE as the target of the takeover offer.</Paragraph>
    <Paragraph position="10"> 3. System begins reading story involving a takeover offer and ACE.</Paragraph>
    <Paragraph position="11"> 4. Target slot of offer is filled with ACE from the event expectation.</Paragraph>
    <Paragraph position="12"> 5. An event expectation is set up that there will be a rebuttal to the offer sometime in the future.</Paragraph>
    <Paragraph position="13"> 6. System encounters ACME which it knows to be a company. Suitor slot of offer is thus filled with ACME via a role-filler expectation.</Paragraph>
    <Paragraph position="14">  7. $36 a share is parsed with the phrasal lexicon.</Paragraph>
    <Paragraph position="15"> 8. $36 a share is added as a candidate for either the stock's current price on the stock market or the amount of the ACME offer, due to role-filler expectations. null  9. $360 million is parsed with the phrasal lexicon.</Paragraph>
    <Paragraph position="16"> 10. $360 million is added as candidate for the total value of the offer due to a role-filler expectation that expects total values to be large numbers.</Paragraph>
    <Paragraph position="17"> 11. Syntactic and semantic analysis determine that the offerer is ACME, and the target is ACE. This reinforces the interpretations previously hypothesized.</Paragraph>
    <Paragraph position="18"> 12. Syntactic and semantic analysis determine the loca null tion of the ACME Corporation to be Delaware.</Paragraph>
    <Paragraph position="19"> 13. $35 3/8 is encountered, which is taken to be a price-per-share amount, due to a role-filler expectation that expects prices per share to be small numbers.</Paragraph>
    <Paragraph position="20"> 14. $35 3/8 a share is added as a candidate for either the stock's current price on the stock market or the amount of the ACME offer.</Paragraph>
    <Paragraph position="21"> 15. $35 3/8 is taken to be the stock's current price and $36 is taken to be the amount of the ACME offer, due to the world knowledge expectation that expects the offer to exceed the current trading price.</Paragraph>
    <Paragraph position="22"> The contribution of the various sources of knowledges varies with the amount of knowledge they can be brought to bear on the language being analyzed. That is, given more syntactic and semantic knowledge, TRUMP could have done more work in the analyses of these stories. Given more detailed conceptual expectations, the bottom-&amp;quot; up mechanism also could have extracted more meaning.</Paragraph>
    <Paragraph position="23"> Together, the two mechanisms should combine to produce a deeper and more complete meaning representation than either one could alone.</Paragraph>
  </Section>
  <Section position="9" start_page="132" end_page="132" type="metho">
    <SectionTitle>
IMPLEMENTATION
</SectionTitle>
    <Paragraph position="0"> SCISOR consists of a variety of programs and tools, operating in conjunction with a declarative knowledge base of domain-independent linguistic, grammatical and world knowledge and domain-dependent lexicon and domain knowledge. A brief overview of the system may be found in \[Rau, 1987c\], and a more complete description in \[Rau, 1987b\]. The natural language input is processed with the TRUMP parser and semantic interpreter \[Jacobs, 1986\].</Paragraph>
    <Paragraph position="1"> Linguistic knowledge is represented using the Ace linguistic knowledge representation framework \[Jacobs and Ran, 1985\]. Answers to user's questions and event expectations are retrieved using the retrieval mechanism described in \[Rau, 1987b\]. Responses to the user will be generated with the KING \[Jacobs, 1987b\] natural language generator when that component is integrated with SCISOR; currently output is &amp;quot;canned&amp;quot;. The events in SCISOR are represented using the KODIAK knowledge representation language \[Wilensky, 1986\], augmented with some scriptal knowledge of typical events in the domain.</Paragraph>
  </Section>
  <Section position="10" start_page="132" end_page="132" type="metho">
    <SectionTitle>
SYSTEM STATUS
</SectionTitle>
    <Paragraph position="0"> All the components of SCISOR described here have been implemented, although not all have been connected together. The system can, as of this writing, process a number of stories in the domain. The processing entails the combined expectation-driven and language driven capabilities described here. For questions that the system can understand, SCISOR retrieves conceptual answers to input questions. These answers are currently output using pseudo-natural language, but we are in the process of integrating the KING generator.</Paragraph>
    <Paragraph position="1"> SCISOR is currently being connected to an automatic source of on-line information (a newswire) for extensive testing and experimentation. The goal of this effort is to prove the utility of the system for processing large bodies of text in a limited domain.</Paragraph>
    <Paragraph position="2"> Although there will undoubtedly be many lessons in extending SCISOR to handle thousands of texts, SCISOI:t's first few stories have already demonstrated some of the advantages of the approach described here:  1. Much of the knowledge used in analyzing these stories is domain-independent.</Paragraph>
    <Paragraph position="3"> 2. Where top-down strategies fail, SCISOR can still extract some information from the texts and use this information in answering questions.</Paragraph>
    <Paragraph position="4"> 3. Unknown words (lexical gaps) and grammatical lapses  are tolerated.</Paragraph>
    <Paragraph position="5"> These three characteristics simply cannot be achieved without combining top-down and bottom-up strategies. The major barrier to the practical success of text processing systems like SCISOR is the vast amount of knowledge required to perform accurate analysis of any body of text. This bottleneck has been partially overcome by the graceful integration of processing strategies in the system; the program currently operates using only hundreds of known words. However, SCISOK is designed to benefit ultimately from an extended vocabulary (i. e. thousands of word roots) and increased domain knowledge. The vocabulary and knowledge base of the system are constantly being extended using a combination of manual and automated techniques.</Paragraph>
  </Section>
  <Section position="11" start_page="132" end_page="133" type="metho">
    <SectionTitle>
EXTENSIBILITY AND PORTABILITY
</SectionTitle>
    <Paragraph position="0"> Our research has combined some of the advantages of top-down language processing methods (tolerance of unknown inputs, understanding in context) with the assets of bottom-up strategies (broader linguistic capabilities, partial results in the absence of expectations). The system described here competently answers questions about constrained texts, uses the same language analyzer for text processing and question answering, and has been applied to other domains as well as the corporate takeover stories. SCISOR is thus a state-of-the-art system, but like  other text processing systems the main chore that remains is to allow for the practical extraction of information from thousands of real texts. The following are the main issues involved in making such a system a reality and how we address them: Lexicon design: The size of the text-processing lexicon is important, but sheer vocabulary is not of much help.</Paragraph>
    <Paragraph position="1"> What is needed is a lexicon that accounts both for the basic meanings of common words and the specialized use of terms in a given context. We use a hierarchical phrasal lexicon \[Besemer and Jacobs, 1987, Dyer and Zernik, 1986\] to allow domain-specific vocabulary to take advantage of existing linguistic knowledge and ultimately to facilitate automatic language acquisition.</Paragraph>
    <Paragraph position="2"> Grammar: A disadvantage of many approaches to text processing is that it is counterintuitive to assume that most language processing is domain-specific. While specialized knowledge is essential, a portable grammar, like a core lexicon, is indispensable. Language is too complex to be reduced to a few domain-specific heuristics. Because specialized constructs may inherit from general grammatical rules, TRUMP allows specialized sublanguage grammar to interact with &amp;quot;core&amp;quot; grammar. It is still a challenge, however, to deal gracefully with constructs in a sublanguage that would ordinarily be extragrammatical.</Paragraph>
    <Paragraph position="3"> Conceptual Knowledge: The KODIAK knowledge representation, used for conceptual knowledge in SCISOR, allows for multiple inheritance as well as structured relationships among conceptual roles. This representation is useful for the retrieval of conceptual information in the system. A broader base of &amp;quot;common sense&amp;quot; knowledge in KODIAK will be used to increase the robustness of SCISOR.</Paragraph>
    <Paragraph position="4"> Our strategy has been to attack the robustness problem by starting with the underlying knowledge representation issues. There will be no way to avoid the work involved in scaling up a system, but with this strategy we hope that much of this work will be useful for text processing in general, as well as for analysis within a specialized domain.</Paragraph>
  </Section>
  <Section position="12" start_page="133" end_page="133" type="metho">
    <SectionTitle>
FUTURE DIRECTIONS
</SectionTitle>
    <Paragraph position="0"> In the immediate future, we hope to connect SCISOR to a continuous source of on-line information to begin collecting large amounts of conceptually analyzed material, and extensively testing the system.</Paragraph>
    <Paragraph position="1"> We also plan to dramatically increase the size of the lexicon through the addition of an on-line source of dictionary and thesaurus information. The system grammar also will increase in coverage over time, aswe extend and improve the capabilities of the bottom-up TRUMP parser.</Paragraph>
    <Paragraph position="2"> Another interesting extension is the full implementation of a parser skimming mode. This mode of operation, triggered when the system recognizes input events that are identical to events it has already read about, will cause the parser to perform very superficial processing of the text.</Paragraph>
    <Paragraph position="3"> This superficial or skimming processing will continue until the parser reaches a point in the text where the story is no longer reporting on events the system has already read about.</Paragraph>
  </Section>
class="xml-element"></Paper>