<?xml version="1.0" standalone="yes"?>
<Paper uid="P84-1115">
  <Title>An International Delphi Poll on Future Trends in &quot;Information Linguistics&quot;</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
AN INTERNATIONAL DELPHI POLL ON FUTURE TRENDS
IN &quot;INFORMATION LINGUISTICS&quot;
</SectionTitle>
    <Paragraph position="0"/>
  </Section>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
ABSTRACT
</SectionTitle>
    <Paragraph position="0"> The results of an international Delphi poll on information linguistics which was carried out between 1982 and 1983 are presented.</Paragraph>
    <Paragraph position="1"> As part of conceptual work being done in information science at the University of Constance, an international Delphi poll was carried out from 1982 to 1983 with the aim of establishing a mid-term prognosis for the development of &quot;information linguistics&quot;. The term &quot;information linguistics&quot; refers to a scientific discipline combining the fields of linguistic data processing, applied computer science, linguistics, artificial intelligence, and information science. A Delphi poll is a written poll of experts, carried out in this case in two phases. The results of the first round were incorporated into the second round, so that participants in the poll could react to the trends as they took shape.</Paragraph>
    <Paragraph position="2"> 1. Some demoscopic data. 1.1 Return rate. Using sophisticated selection procedures, 385 international experts in the field of information linguistics were identified and sent questionnaires in the first round (April 1982); 90 questionnaires were returned. In the second round 360 questionnaires were mailed out (January 1983) and 56 were returned, 48 of these from experts who had answered in the first round. The last questionnaires were accepted at the end of June 1983.</Paragraph>
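The return rates implied by these figures can be checked with a short calculation. The following is a minimal Python sketch, not part of the original study; the variable and function names are illustrative assumptions, and only the counts come from the text above.

```python
# Counts quoted in the text: questionnaires mailed and returned per round.
MAILED = {"round 1": 385, "round 2": 360}
RETURNED = {"round 1": 90, "round 2": 56}
# Round-2 answers from experts who had also answered in round 1.
REPEAT_RESPONDENTS = 48


def return_rate(returned: int, mailed: int) -> float:
    """Return rate as a percentage, rounded to one decimal place."""
    return round(100.0 * returned / mailed, 1)


rates = {name: return_rate(RETURNED[name], MAILED[name]) for name in MAILED}

# Respondents who appear in only one of the two rounds.
new_in_round_2 = RETURNED["round 2"] - REPEAT_RESPONDENTS
dropped_after_round_1 = RETURNED["round 1"] - REPEAT_RESPONDENTS

print(rates)                  # {'round 1': 23.4, 'round 2': 15.6}
print(new_in_round_2)         # 8
print(dropped_after_round_1)  # 42
```

Under these counts, roughly a quarter of the experts answered in the first round and about one in six in the second.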
    <Paragraph position="3"> Overlapping data in the two rounds: of the 90 respondents in the first round and the 56 in the second, 42 answered in the first round only, 48 in both rounds, and 8 in the second round only. In the following we refer to four sets of data: Set_A (90 from round 1); Set_B (48 from round 1 with answers in round 2); Set_C (56 from round 2); Set_D (48 from round 2 with answers in round 1). We shall concentrate primarily on Set_C because, according to the Delphi philosophy, the data of the second round are the most relevant. There were 8 persons within Set_C who did not answer in the first round, but they too were aware of the results of the first round; therefore a Delphi effect was possible. (In the following, whole integers refer to absolute numbers and decimal figures to percentages.) 1.2 Qualification according to academic degree. The survey singled out highly competent people, as reflected in academic degree (data from Set_A and Set_C):</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
1.3 Age
</SectionTitle>
      <Paragraph position="0"> Since Delphi polls are concerned with future developments, it has been claimed in the past that the age and experience of people in the field influence the rating. In this paper, however, we cannot prove this hypothesis. Here are the mere statistical facts, taken only from Set_C (they do not differ significantly in the other sets).  These data in particular confirm our impression that very qualified and experienced people answered the questionnaire. Almost 60% have worked longer than 10 years in the general area of information linguistics.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
1.5 Size of research groups
</SectionTitle>
      <Paragraph position="0"> Most of those answering the questionnaire work in a research group. Table 4 gives an impression of the size of the groups in Set_A and Set_C:  With respect to whether participants are mainly involved in research (defined as: basic groundwork, mainly of theoretical interest, experimental environment) or in application/development (defined as: mainly of interest from the point of view of working systems (i.e. commercial, industrial), applicable to routine tasks) the results were as follows:  indust. administ.: Set_A -, Set_C 1 (1.8%); public administration: Set_A 8 (8.9%), Set_C 4 (7.1%); public inf. systems: Set_A 3 (3.3%), Set_C 2 (3.6%). Most of the work in information linguistics so far has concentrated on English (generally more than 80%, with slight differences between the individual sub-areas, e.g. acoustic 80.6%, indexing 82.5%, question-answering 83.3%).</Paragraph>
      <Paragraph position="1">  2. Content of the questionnaire</Paragraph>
    </Section>
  </Section>
  <Section position="3" start_page="0" end_page="545" type="metho">
    <SectionTitle>
2.1 Sub-areas
</SectionTitle>
    <Paragraph position="0"> The discipline &quot;information linguistics&quot; was defined not theoretically but ostensively, by a number of sub-areas.</Paragraph>
    <Paragraph position="1"> Sub-areas, each followed by its abbreviation: 1. Acoustic/phonetic procedures (Ac); 2. Morphological/syntactic procedures (Mo); 3. Semantic/pragmatic procedures (Se); 4. Contribution of new hardware (Ha); 5. Contribution of new software (So); 6. Information/documentation languages (Il); 7. Automatic indexing (In); 8. Automatic abstracting (Ab); 9. Automatic translation (Tr); 10. Reference and data retrieval systems (Re); 11. Question answering and understanding systems (Qu).</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2 Single topics
</SectionTitle>
      <Paragraph position="0"> The sub-areas included a varying number of topics (from 6 to 15). These topics were chosen based on the author's experience in information linguistics, on a pretest with mostly German researchers and practitioners, on advice from members of FID/LD, and on long discussions with Don Walker, Hans Karlgren, and Udo Hahn. Altogether, there were 91 topics in the first round and 90 in the second round, as follows:</Paragraph>
      <Paragraph position="1">  e.g. for ab4: &quot;procedures of text condensation that stress the overall, true-to-scale compression of a given text; although varying in length (according to the degree of reduction); can be used as a substitute for original texts&quot;.</Paragraph>
      <Paragraph position="2"> 3. Answer parameters for the sub-areas</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="542" type="sub_section">
      <SectionTitle>
3.1 Competence (=CO)
</SectionTitle>
      <Paragraph position="0"> At the beginning of every sub-area participants were requested to rate their competence according to three parameters: &quot;good&quot; (with a specialist's knowledge), &quot;fair&quot; (with a working knowledge), and &quot;superficial&quot; (with a layman's knowledge). Tab.8 shows the self-estimation of competence within the sub-areas (data taken from Set_C):</Paragraph>
    </Section>
    <Section position="3" start_page="542" end_page="543" type="sub_section">
      <SectionTitle>
3.2 Desirability (=DE)
</SectionTitle>
      <Paragraph position="0"> With respect to the application-oriented subject areas the category of desirability was used in order to determine the social desirability according to the following 4-point scale: &quot;very desirable&quot;/++ (will have a positive social effect, little or no negative social effect, extremely beneficial), &quot;desirable&quot;/+ (in general positive, minor negative social effects), &quot;undesirable&quot;/- (negative social effect, socially harmful), &quot;very undesirable&quot;/-- (major negative social effect, socially not justifiable).</Paragraph>
      <Paragraph position="1"> Tab.9 (data from Set_C) shows that the negative parameters (--, -) were never or only seldom used. According to the estimation of the experts, information linguistics is not judged to be a socially harmful scientific discipline.</Paragraph>
      <Paragraph position="2"> 4. Answer parameters for the single topics. The following parameters were used as ratings for the sub-areas and the single topics. Their definitions were given in more detail in the questionnaire.</Paragraph>
      <Paragraph position="3">  [Tab.10: rating scales for importance, feasibility, and date of realization] These categories of scientific importance, feasibility, and date of realization were to be judged from two points of view: research (=R), defined as basic groundwork, mainly of theoretical interest, experimental environment; and application (=A), defined as mainly of interest from the point of view of working systems (i.e. commercial, industrial), applicable to routine tasks.  Competence was an important influence on evaluation. In general one can say that people with &quot;good&quot; competence (or more correctly: with a competence estimation of &quot;good&quot;) in a sub-area gave topics higher ratings for importance and feasibility, both from the research and the application points of view. Nevertheless, there were differences. Those with &quot;good&quot; competence differed more widely in evaluations of research-oriented topics than in application-oriented topics, whereas those with &quot;superficial&quot; competence in the sub-areas were closer to the average in their evaluations of application-oriented topics than of research-oriented topics. Here are some examples of the differences (as reflected in the averages of the sub-areas).</Paragraph>
      <Paragraph position="4"> Tab. 11 is to be read as follows: (line 1) in the sub-area &quot;Acoustic&quot; those with &quot;good&quot; competence evaluated 5.6% higher than the average with respect to importance for research, whereas people with &quot;superficial&quot; competence in the same sub-area evaluated 6.9% lower than average.</Paragraph>
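One way to read a Tab. 11 entry is as the relative deviation of a competence group's mean rating from the overall mean. The sketch below assumes that reading (the text does not state whether the percentages are relative deviations or percentage-point differences) and uses made-up mean ratings, not data from the poll; the function name is hypothetical.

```python
def relative_deviation(group_mean: float, overall_mean: float) -> float:
    """Deviation of a group mean from the overall mean, in percent of the overall mean."""
    return round(100.0 * (group_mean - overall_mean) / overall_mean, 1)


# Hypothetical mean ratings, chosen only to reproduce the two percentages quoted above.
overall = 72.0
good_competence = 76.032
superficial_competence = 67.032

print(relative_deviation(good_competence, overall))         # 5.6
print(relative_deviation(superficial_competence, overall))  # -6.9
```

A positive value means the group rated higher than the average, a negative value lower.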
      <Paragraph position="6"> As can be seen in the column F/R, sometimes the general trend is reversed (Semantic: values from &quot;competent&quot; participants are lower than from participants with &quot;superficial&quot; competence).</Paragraph>
      <Paragraph position="7">  There is also a connection between desirability and the values of importance and feasibility. Those who gave high ratings for desirability (DE++) in general gave higher values to the single topics in the respective sub-areas, both in comparison to the average values and to the values of those who rated a given sub-area only as desirable (DE+). The differences between DE++ and DE+ are even higher than those between C/g and C/s. Only the F/R data in the translation and retrieval areas are lower for DE++ than for DE+; in all other cases the DE++ values are higher. Some examples:  together, and the values from the single topics have been averaged. Exact year data were calculated from the answers on the 6-point rating scale, cf. Tab.10. In order to show the Delphi effect, the data in Tab. 13 are taken from Set_A, in Tab. 14 from Set_C.  The average values in Tab. 13 and 14 should not be over-interpreted. In particular, ranking is unjustified. One cannot simply conclude that, say, the sub-area &quot;Semantics&quot; (92.6) is more important than that of &quot;Abstracting&quot; (75.6) with respect to research because the average value is higher; or that Indexing (79.2) is more feasible from an application point of view than Abstracting (52.3). Such conclusions may be true, and this is why the values in Tab. 13 and 14 are given, but the parameters should actually only be applied to the single topics in the sub-areas. Cross-group ranking is not allowed for methodological reasons.</Paragraph>
      <Paragraph position="8"> But nevertheless the data are interesting enough. It is obvious that the following relation is in general true:</Paragraph>
      <Paragraph position="9"> I/R (-values) &gt; I/A &gt; F/R &gt; F/A. There are some exceptions to this general rule, such as Re-I/A &gt; I/R (both in Set_A and Set_C); Ha-F/R &gt; I/R (in Set_C); (Re-F/R and F/A) &gt; I/R (in Set_C); and Il-F/R &gt; I/R (both in Set_A and Set_C). There seems to be a non-trivial gap between importance and feasibility (both with respect to research and application). In other words, there are more problems than solutions. And there is an even broader gap between application and research. From a practical point of view there is some skepticism concerning the possibility of solving important research problems. And what seems to be feasible from a research point of view looks different from an application one.</Paragraph>
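The general ordering of the four parameters, and exceptions of the kind listed above, can be detected mechanically. The following minimal Python sketch is an illustration only: the function name is hypothetical and the averages are invented, NOT taken from Tab. 13 or Tab. 14.

```python
# Expected ordering of the averaged parameters, from highest to lowest.
ORDER = ["I/R", "I/A", "F/R", "F/A"]


def ordering_exceptions(values: dict) -> list:
    """Return (higher, lower) pairs where a later parameter exceeds an earlier one."""
    exceptions = []
    for i, earlier in enumerate(ORDER):
        for later in ORDER[i + 1:]:
            if values[later] > values[earlier]:
                exceptions.append((later, earlier))
    return exceptions


# Hypothetical sub-area averages (assumed values for illustration).
typical = {"I/R": 90.0, "I/A": 85.0, "F/R": 70.0, "F/A": 60.0}
retrieval_like = {"I/R": 80.0, "I/A": 88.0, "F/R": 75.0, "F/A": 70.0}

print(ordering_exceptions(typical))         # []
print(ordering_exceptions(retrieval_like))  # [('I/A', 'I/R')]
```

All pairs are compared, not only neighbours in the chain, because some of the reported exceptions (for example F/R exceeding I/R) involve non-adjacent parameters.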
      <Paragraph position="10"> The values in the second round are in general higher than in the first one. This is an argument against the oft-cited Delphi hypothesis that the feedback mechanism (i.e. that the data of the previous round are made known at the start of the following round) has an averaging effect. The increase can probably be explained by the fact that the percentage of qualified and &quot;competent&quot; people was higher in the second round (perhaps these were the ones who were motivated to take on the burden of a second round) and, as Tab.11 shows, people who rated themselves &quot;competent&quot; tend to evaluate higher.</Paragraph>
      <Paragraph position="11"> Between the two rounds the decline in the sub-areas &quot;Software&quot; and &quot;Hardware&quot; (apart from the parameter F/R) is striking. There is an overall increase for &quot;Morphology&quot; and &quot;Information Languages&quot; for all parameters, a dramatic increase for the topics in &quot;Indexing&quot; for F/R (9.7%), and a dramatic decline for the &quot;Translation&quot; and &quot;Question-Answering&quot; topics for the parameter F/A (9.8 and 8.4%).</Paragraph>
      <Paragraph position="12"> The dates of realization do not change dramatically. On average there is a difference of one year (and this makes sense because there was almost one year between rounds 1 and 2). There is a tendency for realization to be expected somewhat earlier from a research point of view than from an application standpoint. But the differences are not so dramatic as to justify the conclusion that researchers are more optimistic than developers/practitioners.</Paragraph>
    </Section>
    <Section position="4" start_page="543" end_page="545" type="sub_section">
      <SectionTitle>
5.2 Single topics
</SectionTitle>
      <Paragraph position="0"> Tab. 15 and 16 show the two highest rated topics in each sub-area in the first two columns and the two lowest rated topics in each sub-area in the last two columns. These represent average data from Set_C. The four columns in the middle show the estimation of participants who work in research or application, respectively. As part of the demoscopic data it was determined whether participants work more in research or in application (cf. Tab.6).</Paragraph>
      <Paragraph position="1"> Notice that both groups answered from a research and an application point of view. In a more detailed analysis (which will be published later) this and other aspects can be pursued. In Tab.15 and 16 the data for very high importance (++) and high importance (+) have been added together.</Paragraph>
      <Paragraph position="3"> Tab. 16 Most feasible and less feasible topics (columns: most feasible topics (++ and +); average, research, application, average; less feasible topics (-- and -))</Paragraph>
      <Paragraph position="5"> A final table shows the data for short-term and long-term topics; only the two closest and the two most distant topics in each sub-area are given.  Finally I would like to thank all those who participated in the Delphi rounds. It was an extremely time-consuming task to answer the questionnaire, which was more like a book than a folder. I hope the results justify the efforts. The analysis would not have been possible without the help of my colleagues: Udo Hahn for the conceptual design, and Dr. J. Staud together with Annette Woehrle, Frank Dittmar and Gerhard Schneider for the statistical analysis. This project has been partially financed by the FID/LD committee and by the &quot;Bundesministerium fuer Forschung und Technologie/ Gesellschaft fuer Information und Dokumentation&quot;, Grant PT 200.08.</Paragraph>
    </Section>
  </Section>
</Paper>