File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/97/w97-0712_intro.xml
Size: 4,981 bytes
Last Modified: 2025-10-06 14:06:21
<?xml version="1.0" standalone="yes"?> <Paper uid="W97-0712"> <Title>T. A. van Dijk and W. Kintsch. Cognitive Psychology and Discourse: Recalling and Summarizing Stories. In W. U. Dressler, editor, Current Trends</Title> <Section position="3" start_page="0" end_page="74" type="intro"> <SectionTitle> COSY-MATS </SectionTitle> <Paragraph position="0"> The goal of the research reported here has been to develop a flexible, easily portable and scalable, but also efficient and robust, NLP system that automatically generates summaries of real-world unrestricted texts. To this effect, an architecture was designed for a hybrid COnnectionist-SYmbolic MAchine for Text Summarisation (henceforth, COSY-MATS) (Aretoulaki, 1996). A major concern in designing COSY-MATS has been to identify content selection features that are generic and application-independent (Section 2). The features should be applicable to any text, irrespective of domain or text type, so that COSY-MATS is readily portable to different operating environments with a minimum amount of customisation. The isolation of such features would provide a permanent infrastructure for both content selection and analysis. The front-end text analysis modules can be developed so that they are geared towards the summarisation task, rather than towards text understanding in general, which is computationally intensive. Thus, these modules need only perform an analysis that is sufficient for the evaluation of the selected content selection features. The establishment of universal importance determination criteria means that the permanent set of analysis, interpretation, content selection and generation processors can be extended with application-specific modules during the porting of COSY-MATS. This is also what renders COSY-MATS a type of summarisation shell. Significantly, the computations of the supplementary modules will already be accommodated in the standard flow of processing of the system by virtue of these features (Section 3). Admittedly, the identification of
content selection features of general applicability is a very difficult task. This is demonstrated by the limitations of the two main trends in current summarisation research (cf. (Aretoulaki, 1996)). There are Information Extraction (IE) environments, which perform a superficial and partial analysis of the input text based on the progression of keywords and application-specific phrasal patterns therein, e.g. (BT, 1994; Jacobs and Rau, 1990; Luhn, 1958; MUC-5, 1993; Paice, 1981; Paice, 1990; Salton et al., 1994). The problem with IE systems is that, although they can be used very efficiently on any type of text, they are domain-dependent and likely to produce inaccurate output. This is due to their excessive reliance on specialised content words. There are also systems which are based on Natural Language Understanding (NLU) methods involving deeper processing. Apart from syntactic and lexico-semantic analysis, the hierarchical rhetorical organisation of the source text can also be taken into account, as can certain aspects of the context of the discourse, e.g. (Ganghano et al., 1993; Lehnert, 1981; Mitkov et al., 1994). Such more sophisticated types of system, however, are prohibitively slow as a result of the extensive processing involved. They are also very fragile, because the high-level knowledge employed is usually hand-coded and hence arbitrary and incomplete. Even when this knowledge has been acquired automatically, e.g. (Maybury, 1993; Soderland and Lehnert, 1994), it is application-dependent. Consequently, despite their occasional</Paragraph> <Paragraph position="2"> Neugebauer, 1995; Ono et al., 1994; Rau et al., 1993; Sharp, 1991)), NLU approaches are, on the whole, specialised in a particular text type. For the design of COSY-MATS, a holistic and unifying approach has been adopted that involves both extralinguistic, NLU-type analysis and selective statistics-based linguistic processing reminiscent of IE, in co-ordination. Similarly to NLU, analysis in COSY-MATS is sufficiently deep for the semantic,
rhetorical and contextual aspects of the input text to be considered in content selection. In contrast to NLU systems, however, the computation of these diverse aspects of the text is efficient. This is because objective cues on the surface of the text are also exploited in COSY-MATS, echoing the IE approach. Unlike in IE, however, these cues are function words and generic content words which point towards the high-level functions of the respective textual units in the context of the discourse, while at the same time being domain-independent. Thus, apart from identifying universal content selection criteria that should render COSY-MATS portable and scalable, the research reported here has also attempted to establish mappings between the concrete and the more abstract criteria in the devised feature scheme, so that the system is also intelligent and practical, i.e. so that the evaluation of these abstract criteria is fully automated (Section 2).</Paragraph> </Section> </Paper>