<?xml version="1.0" standalone="yes"?> <Paper uid="P84-1083"> <Title>TEXTUAL EXPERTISE IN WORD EXPERTS: AN APPROACH TO TEXT PARSING BASED ON TOPIC/COMMENT MONITORING *</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> ABSTRACT </SectionTitle> <Paragraph position="0"> This paper presents prototype versions of two word experts for text analysis, which demonstrate that word experts are a feasible tool for parsing texts at the level of text cohesion as well as text coherence. The analysis is based on two major knowledge sources: context information is modelled in terms of a frame knowledge base, while the co-text keeps a record of the linear sequencing of text analysis. The result of text parsing is a text graph reflecting the thematic organization of topics in a text.</Paragraph> <Paragraph position="1"> 1. Word Experts as a Text Parsing Device. This paper outlines an operational representation of the notions of text cohesion and text coherence based on a collection of word experts as central procedural components of a distributed lexical grammar.</Paragraph> <Paragraph position="2"> By text cohesion, we refer to the micro level of textuality as provided, e.g., by reference, substitution, ellipsis, conjunction, and lexical cohesion (cf. HALLIDAY/HASAN 1976), whereas text coherence relates to the macro level of textuality as induced, e.g., by patterns of semantic recurrence of topics (thematic progression) of a text (cf.</Paragraph> <Paragraph position="3"> DANES 1974). On a deeper level of propositional analysis of texts, further types of semantic development of a text can be examined, e.g.,</Paragraph> <Paragraph position="4"> coherence relations, such as contrast, generalization, explanation (cf. HOBBS 1979, HOBBS 1982, DIJK 1980a), basic modes of topic development, such as expansion, shift, or splitting (cf. 
GRIMES 1978), and operations on different levels of textual macro-structures (DIJK 1980a) or schematized superstructures (DIJK 1980b).</Paragraph> <Paragraph position="5"> The identification of cohesive parts of a text is needed to determine the continuous development and increment of information with regard to single thematic foci, i.e., topics of the text. Since texts contain topic elaborations, shifts, breaks, etc., the extent of topics has to be delimited exactly and different topics have to be related properly. The identification of coherent parts of a text serves this purpose, in that the determination of the coherence relations mentioned above</Paragraph> <Paragraph position="6"> contributes to the delimitation of topics and their organization in terms of text grammatical well-formedness considerations. Text graphs are used as the resulting structure of text parsing and serve to represent corresponding relations holding between different topics. * Work reported in this paper is supported by BMFT/GID under grant no. PT 200.08.</Paragraph> <Paragraph position="7"> The word experts outlined below are part of a genuine text-based parsing formalism incorporating a linguistic level in terms of a distributed text grammar and a computational level in terms of a corresponding text parser (HAHN/REIMER 1983; for an account of the original conception of word expert parsing, cf. SMALL/RIEGER 1982). This paper is intended to provide an empirical assessment of word experts for the purpose of text parsing. 
We thus arrive at a predominantly functional description of this parsing device, largely neglecting its procedural aspects.</Paragraph> <Paragraph position="8"> The word expert parser is currently being implemented as a major system component of TOPIC, a knowledge-based text analysis system which is intended to provide text summarization (abstracting) facilities on variable layers of informational specificity for German language texts (each approx.</Paragraph> <Paragraph position="9"> 2000-4000 words) dealing with information technology. Word expert construction and modification are supported by a word expert editor using a special word expert representation language, fragments of which are introduced in this paper (for a more detailed account, cf. HAHN/REIMER 1983, HAHN 1984). Word experts are executed by interpretation of their representation language description.</Paragraph> <Paragraph position="10"> TOPIC's word expert system and its editor are written in the C programming language and run under UNIX.</Paragraph> </Section></Paper>