File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/relat/88/a88-1018_relat.xml
Size: 3,657 bytes
Last Modified: 2025-10-06 14:16:03
<?xml version="1.0" standalone="yes"?> <Paper uid="A88-1018"> <Title>INTEGRATING TOP-DOWN AND BOTTOM-UP STRATEGIES IN A TEXT PROCESSING SYSTEM</Title> <Section position="13" start_page="133" end_page="134" type="relat"> <SectionTitle> RELATED RESEARCH </SectionTitle> <Paragraph position="0"> The bulk of the research on natural language text processing adheres to one of the two approaches integrated in SCISOR. The practical issue for text processing systems is that it is still far from feasible to design a program that processes extended, unconstrained text. Within the &quot;bottom-up&quot; framework, one of the most successful strategies, in light of this issue, is to define a specialized domain &quot;sublanguage&quot; [Kittredge, 1982] that allows robust processing so long as the texts use prescribed vocabulary and linguistic structure. The &quot;top-down&quot; approach similarly relies heavily on the constraints of the textual domain, but in this approach the understanding process is bound by constraints on the knowledge to be derived rather than by restrictions on the linguistic structures.</Paragraph> <Paragraph position="1"> The bottom-up, or language-driven, strategy has the advantage of covering a broad class of linguistic phenomena and processing even the more intricate details of a text. Many systems [Grishman and Kittredge, 1986] have depended on this strategy for processing messages in constrained domains. Other language-driven programs [Hobbs, 1986] do not explicitly define a sublanguage but rely on a robust syntax and semantics to understand the constrained texts.
These systems build upon existing grammars, which may make the semantic interpretation of the texts difficult.</Paragraph> <Paragraph position="2"> The top-down, or expectation-driven, approach offers the benefit of being able to &quot;skim&quot; texts for particular pieces of information, passing gracefully over unknown words or constructs and ignoring some of the complexities of the language. A typical, although early, effort at skimming news stories was implemented in FRUMP [DeJong, 1979], which accurately extracted certain conceptual information from texts in preselected topic areas. FRUMP proved that the expectation-driven strategy was useful for scanning texts in constrained domains. Systems that use this strategy include the banking telex readers TESS [Young and Hayes, 1985] and ATRANS [Lytinen and Gershman, 1986]. All of these programs can be easily &quot;fooled&quot; by unusual texts, and can obtain only the expected information.</Paragraph> <Paragraph position="3"> The difficulty of building a flexible understanding system inhibits the integration of the two strategies, although some of the projects mentioned above have research efforts directed at integration. Dyer's BORIS system [Dyer, 1983], a program designed for in-depth analysis of narratives rather than expository text scanning, integrates multiple knowledge sources and, like SCISOR, does some dynamic combination of top-down and bottom-up strategies. The linguistic knowledge used by BORIS is quite different from that of TRUMP, however. It lacks explicit syntactic structures and thus, like the sublanguage approach, relies more heavily on domain-specific linguistic knowledge. Lytinen's MOPTRANS [Lytinen, 1986] integrates syntax and semantics in understanding, but the syntactic coverage of the system is in no way comparable to that of the bottom-up programs.
SCISOR is, to our knowledge, the first text processing system to integrate full language-driven processing with conceptual expectations.</Paragraph> </Section> </Paper>