<?xml version="1.0" standalone="yes"?> <Paper uid="W98-0307"> <Title>Meta-discourse markers and problem-structuring in scientific articles</Title> <Section position="4" start_page="43" end_page="44" type="intro"> <SectionTitle> 2 Discourse structure and argumentation in scientific articles </SectionTitle> <Paragraph position="0"> Discourse linguistic theory suggests that texts serving a common purpose among a community of users eventually take on a predictable structure of presentation (Kintsch &amp; van Dijk 1978), and scientific articles certainly serve a well-defined communicative purpose: they &quot;present, retell and refer to the results of specific research&quot; (Salager-Meyer 1992). Particularly in the life and experimental sciences, a rigid building plan for research articles has evolved over the years, in which rhetorical divisions tend to be very clearly marked in section headers. Prototypical rhetorical divisions include Introduction, Purpose, Experimental Design, Results, Discussion, and Conclusions. One of the reasons for this rigidly defined structure seems to be that the scientific community in these fields has more or less agreed on how to do research: methodologies and evaluation methods are long-lived research entities that do not change often.</Paragraph> <Paragraph position="1"> One of the corpora we are using is a good example of such texts. It consists of 129 articles in cardiology, taken from the American Heart Journal, which have a fixed structure with respect to rhetorical divisions and section headers. The other corpus, in contrast, consisting of 123 (mostly conference) articles in computational linguistics (CL), displays a heterogeneous mixture of methodologies and traditions of presentation, as one would expect in an interdisciplinary field. 
Most of the articles cover more than one discipline, but as a rough estimate, about 45% of the articles in the collection are predominantly technical in style, describing implementations (i.e. engineering solutions); about 25% report on research in theoretical linguistics, with an argumentative tenor; the remaining 30% are empirical (psycholinguistic or psychological experiments or corpus studies). Even though most of the articles have an introduction and conclusions (sometimes occurring under headers with different names), and almost all of them cite previous work, the presentation of the problem and of the methodology/solution is idiosyncratic and depends on individual writing style. Very few of the headers in the computational linguistics articles correspond to prototypical rhetorical divisions; the rest contain content-specific terminology (cf. Figure 1, which compares relative frequencies of headers for the two corpora). Because the type of research reported in the computational linguistics corpus differs so much, the description of document structure we were looking for had to be flexible enough to generalize over differences in presentation, yet formal enough for the extraction of the information units which are useful for automatic abstracts. We base our model of argumentation in scientific articles on Swales' (1990) CARS model (&quot;Create a Research Space&quot;). Swales' claim is that the main communicative goal of an author of a research article is to convince readers (potential reviewers) that the research described in the paper constitutes an actual contribution to science, in order to have the paper reviewed positively and thus published; this is the case whether or not the paper tries to give the impression that it reports research in an objective, disinterested way. 
In order to successfully present their case, authors argue in a goal-directed and prototypical way about problem-solving activities, both their own and other researchers'. Swales identified prototypical rhetorical building plans of introduction sections, along with linguistic surface cues that signal rhetorical moves. Examples of rhetorical moves include the claim that the paper addresses a new problem or, if the problem is well known, that the presented solution is better than that of other researchers.</Paragraph> <Paragraph position="2"> A first analysis of the corpora confirmed many of the rhetorical building blocks suggested by Swales. We adapted Swales' scheme to the one shown in Figure 8 (at the end of this paper). In the medical corpus, almost all of the moves we found were of type I (Explicit mention); the rigid document structure seems to have replaced much of the &quot;argument about problem-solving activities&quot; (types II-V) for which we found ample evidence in the computational linguistics corpus.</Paragraph> <Paragraph position="3"> We are interested in identifying these moves automatically and shallowly in text, and we believe that this is technically feasible, because the stereotypical, predictable overall structure of the argument can be exploited in doing so.</Paragraph> </Section> </Paper>