<?xml version="1.0" standalone="yes"?> <Paper uid="C80-1027"> <Title>LINGUISTIC ANALYSIS OF NATURAL LANGUAGE COMMUNICATION WITH COMPUTERS</Title> <Section position="1" start_page="0" end_page="192" type="metho"> <SectionTitle> LINGUISTIC ANALYSIS OF NATURAL LANGUAGE COMMUNICATION WITH COMPUTERS </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="192" type="sub_section"> <SectionTitle> Summary </SectionTitle> <Paragraph position="0"> Interaction with computers in natural language requires a language that is flexible and suited to the task. This study of natural dialogue was undertaken to reveal those characteristics which can make computer English more natural. Experiments were made in three modes of communication: face-to-face, terminal-to-terminal and human-to-computer, involving over 80 subjects, over 80,000 words and over 50 hours. They showed some striking similarities, especially in sentence length and proportion of words in sentences. The three modes also share the use of fragments, typical of dialogue.</Paragraph> <Paragraph position="1"> Detailed statistical analysis and comparisons are given. The nature and relative frequency of fragments, which have been classified into twelve categories, are shown in all modes. Special characteristics of the face-to-face mode are due largely to these fragments (which include phatics employed to keep the channel of communication open). Special characteristics of the computational mode include other fragments, namely definitions, which are absent from other modes. Inclusion of fragments in computational grammar is considered a major factor in improving computer naturalness.</Paragraph> <Paragraph position="2"> The majority of experiments involved a real life task of loading Navy cargo ships. The peculiarities of face-to-face mode were similar in this task to results of earlier experiments involving another task.
It was found that in task oriented situations the syntax of interactions is influenced in all modes by this context in the direction of simplification, resulting in short sentences (about 7 words long). Users seek to maximize efficiency in solving the problem. When given a chance, in the computational mode, to utilize special devices facilitating the solution of the problem, they all resort to them.</Paragraph> <Paragraph position="3"> Analyses of the special characteristics of the computational mode, including the analysis of the subjects' errors, provide guidance for the improvement of the habitability of such systems. The availability of the REL System, a high performance natural language system, made the experiments possible and meaningful. The indicated improvements in habitability are now being embodied in the POL (Problem Oriented Language) System, a successor to REL.</Paragraph> <Paragraph position="4"> I. Introduction The research reported on is part of a larger project aimed at improving the interaction of humans with computers in a language that is natural for the user. In real life applications of computers the language is natural in a very specific sense, since it is constrained by the linguistic and situational context and subject to the inevitable restrictions of the computational grammar and the general requirements of this mode of interaction. However, if computational interaction is to be natural, forms of language which are natural in normal dialogue as well as those particularly suited to the application should be available to the user. A very important requirement is that the means of communication be flexible, that the user should be able to modify the language so as to best serve the solution of the problem. Another issue is to what extent the computer should act as a natural party to the interaction.
Naturalness of human-computer interaction is often referred to as a system's habitability.</Paragraph> <Paragraph position="5"> This research was undertaken in the belief that investigation of human dialogue (both spoken and written) and analysis of human-computer interaction are essential to determine how good habitability can be achieved.</Paragraph> <Paragraph position="6"> Initial research was done on human-to-human dialogue, both in the face-to-face mode in totally free (but voice only) interaction and in the written mode where the dialogue was via computer terminals linked to a computer system, but where the interaction was in unrestricted English. Initial research involved the solution of a relatively simple though quite realistic problem. It confirmed some expected differences between the two modes of communication, but also revealed some surprising similarities. An extremely important result and one that proved particularly challenging to obtain was the identification and definition of structures other than sentences used in natural unrestricted communication. These were finally reduced to about a dozen categories. The next stage of research involved a real-life task and the data was the same for three modes: face-to-face, terminal-to-terminal, and human-to-computer. Results for the first two modes were closely comparable to the previous results and were also compared with results from the computational mode.
Again, what is more striking and worthy of interest are the similarities rather than the differences.</Paragraph> <Paragraph position="7"> Some of the major similarities are in sentence length, percentage of words in sentences (as against fragments), number of sentences (for terminal-to-terminal and human-computer mode), very high number of sentences containing be-verbs, and low number of sentences containing relative pronouns.</Paragraph> <Paragraph position="8"> In this paper, the focus is on (1) the comparison of statistics obtained for the three modes; (2) the nature and relative frequency of fragments and their implications for computational habitability; and (3) detailed discussion of the characteristics of the computational interactions.</Paragraph> <Paragraph position="9"> The research involved over 100 subjects in years 1975, 1977 and 1979/80. The subjects were predominantly undergraduate and graduate students at Caltech. This work resulted in an enormous amount of data, requiring a great deal of time for analysis. Since each protocol was scored by at least two people (and usually more), averaging out the scores was also time consuming. Total time spent by subjects in experiments was over 50 hours, which yielded for final comparisons 20 face-to-face protocols, 11 terminal-to-terminal, and 21 human-to-computer, containing over 80,000 words. Protocols of over 20 subjects in face-to-face and terminal-to-terminal mode were analyzed for categories and partial statistics, and thus not included in the final results.</Paragraph> <Paragraph position="10"> The main thesis of this paper is that in problem solving situations ordinary conversation and human-computer conversation in a system that allows relatively natural language share several important features, and that we can improve computer habitability by learning about the nature of ordinary conversation, which exhibits rather well defined and identifiable structural patterns. II.
Early Experiments In an interesting paper on natural human dialogue in a problem solving situation 1, it was noted on the basis of extensive experiments that &quot;people do not naturally speak in sentences&quot; and that in general great unruliness characterizes interactive communication, whether spoken or written. At first sight of the protocols, one tends to confirm the impression. But a closer look both at the same protocols and at the results of analysis cited, as well as some informal observation of other conversations, and the reflection that communication would hardly be achievable in such an absence of rules, led me to a hypothesis that there is considerable order in natural conversation. I designed experiments in the summer of 1975 and they were conducted in the fall with the assistance of students in a course in Sociolinguistics at Caltech. Additional experiments were conducted in 1977. These experiments are discussed in some detail since they provided guidelines for future research. They differ from the later experiments in the fact that subjects used as much time as was needed for the solution of the experiment, while in the later ones an arbitrary cut-off was imposed. The problem was that of locating the nearest doctor to a patient's address, given a map of Pasadena and a selected list of doctors. Each experiment involved two subjects, one being given the map of Pasadena with the patient's address marked on it (3 different locations were used, but only one in a given experiment) and the other the list of doctors. In the face-to-face mode, the conversations were tape-recorded and transcribed. Subjects were free to communicate by voice but were not allowed to look at each other's materials.
Typically, they were seated at the ends of a fairly large table with the tape-recorder between them, and the experimenter in the room.</Paragraph> <Paragraph position="11"> The experimenter provided the materials and instructions, and answered some initial questions only. In the terminal-to-terminal mode, the subjects were in separate rooms and the protocols were recorded and merged with a computer program. The subjects were free to communicate in ordinary English but had to observe some minimal typographical conventions (such as sending the message in by using two keys simultaneously). The role of the experimenter was the same, but was divided between the subjects and occasionally included offering assistance with computational requirements. The subjects were fully aware that they were conversing with a human counterpart (in this, these experiments differed from Bill Martin's 2). For the purposes of this paper, the results of 12 face-to-face experiments and 7 terminal-to-terminal are used.</Paragraph> <Paragraph position="12"> The problems of analysis were severe, but the results gratifying. As noted by other investigators, conversational English has a great many characteristics which call for an approach very different from the analysis of well formed single sentences. What is obvious is that there are many strings which are not sentences but &quot;incomplete&quot; or &quot;unfinished&quot; or ungrammatical in a variety of ways. But the difficulty of even deciding what a sentence is has also been noted (3): &quot;...the sentence is not, strictly speaking, a unit in oral discourse. One can see texts in which long sequences of clauses linked by &quot;and then...&quot; occur.
Are these separate sentences or one sentence?&quot; After considerable reflection and search for guidance from the literature, a rather conventional notion of sentence was used, with the requirement that it contain a NP and a VP and that it be within the confines of a single message (a message being the utterance(s) of one speaker). Semantic considerations (admittedly often inevitably intuitive) were used to determine single or multiple sentencehood.</Paragraph> <Paragraph position="13"> Coordinating conjunctions and sequences such as &quot;and then&quot;, pauses or phatic units (these being defined as any strings keeping the channels of communication open) often signalled separate sentences. Words such as &quot;because&quot; or &quot;if&quot; tied strings into a single sentence. An additional semantic requirement (again admittedly vague) was that a sentence could stand alone as a unit and make sense. These criteria worked quite well as evidenced by the counts made by different scorers of the same protocols.</Paragraph> <Paragraph position="14"> An interesting category of sentences that emerged were the so-called transposed sentences, e.g., &quot;The two small streets there might be doctors on.&quot;, &quot;Conwire, I also have.&quot;, &quot;4, is it?&quot;, &quot;That's the tallest thing we got, special weapon?&quot;, &quot;Length by width, does it matter?&quot;. Although they are infrequent, such sentences contribute considerably to the distinctive impression made by ordinary conversation. Due to their low frequency, only a partial analysis of these was made. (In four protocols, they amounted to about 2.5% of the total of sentences.)
The first three examples above show only word rearrangement, but the others contain pronouns substituting for the transposed NP, in one case preceding the NP, in the other following it.</Paragraph> <Paragraph position="15"> Some problems were encountered in the consideration of what was a word in numbers, abbreviations, and alphanumeric strings. In general numbers and abbreviations were considered one word; in alphanumeric strings, a number was one word, and a character string another. Phatics such as &quot;uh&quot;, &quot;um&quot;, &quot;uhuh&quot;, as well as &quot;okay&quot;, even when abbreviated to &quot;O.K.&quot;, were considered one word, and phatics were counted as multiple words when obviously so, as in &quot;you know&quot;, &quot;I see&quot;. Differences in exact word counts were not large enough to be significant, and the counts often coincided surprisingly well. The most severe problem was, naturally, the definition of fragments. Even though a fairly clear classification was formulated at the end of the analysis of the 1977 experiments, fragments and phatics are discussed in conjunction with the 1979/80 results. Their definition was refined and some categories were reformulated, and comparisons are made with computational protocols. The role and desirability of these fragments in natural conversation is also discussed then.</Paragraph> <Paragraph position="16"> The most significant results of the early experiments are summed up in Table I.</Paragraph> <Paragraph position="17"> sentences close to 70% over 70% III. Three Modes of Communication: A Comparison 1. The Experimental Setting The setting in the 1979/80 experiments in the F-F and T-T modes was similar to that described in Section II, but with major differences in the overall design of the experiments. First, three modes of communication were used, the third being human-to-computer.
The task was a real life task of loading cargo onto a ship, the data being from the real environment of loading U.S.</Paragraph> <Paragraph position="18"> Navy ships by a group located in San Diego, California. In the first two modes, one subject was provided with a list of cargo items to be loaded (along with their quantities) and a llst of decks, their sizes, and their primary uses.</Paragraph> <Paragraph position="19"> The other subject was given a list of the sizes of the cargo items. The subjects were instructed to obey space and other limitations (e.g., hatch size) and restrictions as to what cargo could be stowed on what decks. There was a time limit of one hour in both modes. The task of transcribing F-F recordings was very laborious due primarily to the specific Jargon and numerous abbreviations in the data. In the T-T mode, the protocols were obtained automatically. null For the human-computer mode, the REL System was used. 4-6 This system, developed by our pro-Ject, provides the means for communicating with a large data base in a limited but useful style of natural English, described in detail in. 6 The response times to user queries are quite reasonably short so that natural interaction is possible. Requests which are not understood are diagnosed extremely quickly, thus encouraging the user to try alternate ways of phrasing.</Paragraph> <Paragraph position="20"> This technique was indeed employed frequently, as discussed in Section IV.</Paragraph> <Paragraph position="21"> It has always been the REL System's philosophy that naturalnesss of a language is obtained in two primary ways: task-specificity and flexibility for modifications. Taskspecifity can be achieved only by actual study of the users&quot; needs (and, obviously, by incorporating their data in the system). 
The capabilities of REL English have already been extended to make the language more natural for this specific task, notably by developing a prompting &quot;load sequence&quot; (and &quot;offload sequence&quot;) in which the computer elicits the information from the user, and offers clarification if the prompt is not clear. This device was used extensively by the subjects, but its description is left out due to space limitations. The other major ingredient of naturalness is enabling the user to suit the language to the task by incorporating his specific knowledge and jargon. To do this, the user must be able to extend the language through definitions and make other modifications. This, also, was done by the subjects and is discussed in Section IV.</Paragraph> <Paragraph position="22"> The experimental setting was obviously very different. One subject at a time was assigned the task. No precise time limit was set, but most subjects were given two hour time slots, some of which was spent in initializing the computational session. The subject's session on the average lasted one and a half hours. The subjects were given a list of the cargo items to be loaded and the number of each, as well as the primary uses of the decks. They were instructed that they should attend to fitting the cargo through hatch sizes and to keep track of space loaded. All the pertinent data about the cargo and ships was in the computer. The subjects were also given a short manual on the loading of ships, with examples of how to use the system and English, including arithmetic, definitions and load sequences. The experimenter helped the subject get started and assisted in case of computational problems in about half of the cases, others working alone.
Although the subjects were instructed to read the manual before commencing experiments, the analysis of protocols showed that few had actually familiarized themselves with the system.</Paragraph> </Section> </Section> <Section position="2" start_page="192" end_page="193" type="metho"> <SectionTitle> 2. The Structure of Face-to-Face Dialogues </SectionTitle> <Paragraph position="0"> Some working definitions need to be stated here.</Paragraph> <Paragraph position="1"> Messages and sentences were discussed in Section II. Fragments are all of the dialogue material that is not in sentences, and phatics, which constitute a big subgroup of fragments, are all strings which serve a variety of functions which may all be characterized as keeping the channel of communication open (including expressions of emotions to the other subject and the computer).</Paragraph> <Paragraph position="2"> A page from a dialogue in Figure 1 illustrates some of the problems in analysis, and gives an idea of some of the categories of fragments, since it contains a rather large number of them. The categories are defined after the discussion of the page. Abbreviations are:</Paragraph> <Paragraph position="4"> there. Ammunition, pyrotechnics, special weapons, vehicles and things on pallets.</Paragraph> <Paragraph position="5"> type. [FS] 9 A Oh, that's an ammunition. [P] 10 B It's an ammunition. [E] 11 A Yeah. [P/TR] 12 B Uh hum, and some conwire? [P, C, TQ] 13 A Conwire, I also have. That's... That's a pallet. You want a subclassification, or is that good enough? [TRANS, FS] 14 B No, no, pallet's fine. [TR, TR] 15 A Okay. [P] 16 B A CTG. [TQ/TI] 17 A CTG is more than 1.</Paragraph> <Paragraph position="6"> 18 B Oh, Okay. 105 SMK. [P, P, TI] 19 A Is that a CTG 105 SMK? 20 B It is indeed.</Paragraph> <Paragraph position="7"> 21 A Okay. 2 pages of CTGs. CTG 105 ... SMK? [P, SELF, TQ] 22 B Yeah [TR] 23 A SMK, that's a pyrotechnic. [TRANS] 24 B Okay, and 105 WP.
[P, C, TQ] 25 A 105 WP. [E] 26 B A CTG 105 WP. [ADD] 27 A Let's see. An APE or HE? Would it help if I read this to you? [P, TQ] 28 B Alright, makes sense in certain ways. [TR] 29 A There's the WP, I'm sorry. It's in pyrotechnic also. [P] 30 B Okay. [P] 31 A I can tell you what's in ammunition if that would help. We've got a CTG 106 APE. 32 B Okay. [P] 33 A CTG 105 HE. [TI] 34 B 1 or 2? [TQ] 35 A Both. Both 1 and 2. Then also in ammunition I have a CTG 40HE and a 60HE. [TR, ADD, C, TRANS] 36 B A 60HE. [E] 37 A Yeah. It seems to be a 60HE. [P]</Paragraph> <Paragraph position="9"> The rest is ADD(ed information). B's first message contains a P(hatic) and S of 6. M3 is a P.</Paragraph> <Paragraph position="10"> M4 is either a T(erse) Q(uestion) or T(erse) I(nformation). Next we have an E(cho), followed by a P and a S of 9. M6 is C(onnector) and TQ.</Paragraph> <Paragraph position="11"> M7 is either INT(errupted) or TRUN(cated). M8 contains a F(alse) S(tart) and a S of 6. M9 is a P and an S of 3. M10 is a S of 3, however on semantic grounds it could be considered an echo. The rule was adopted that a sentence echo was considered a sentence. M11 is either P or T(erse) R(eply), more likely the former, but the analysis in general would not be greatly affected by either choice. M12 starts with one or two Ps, more likely two, has a C and TQ. M13 is a TRANS(posed sentence), followed by FS and S of 3. Next we have either a S of 8, or two Ss, one of 3 and one of 4 and a C. Such sentences are fortunately infrequent. The general tendency was to separate such sequences unless semantic ties were strong. Again the influence on the overall analysis would not be great. M14 contains two TRs or a TR and a P, and a S of 3.</Paragraph> <Paragraph position="12"> Next line is a P. M16 is again either TQ or TI.</Paragraph> <Paragraph position="13"> Next is a S of 5.
Next line is two Ps and TI, next two are Ss of 6 and 3 respectively, although the latter could be considered a phatic. M21 is a P followed by SELF(talking to oneself) and a TQ. Next is a TR. Next line is a TRANS. M24 is a P, C and TQ, next line is E.</Paragraph> <Paragraph position="14"> M26 is ADD. M27 is a P, a TQ, followed by a S of 9. Next a TR and a S of 5. It is problematic whether this should be a S. There are a number of possibilities. It could be P, could be ADD. Not many such decisions fortunately had to be made. The presence of the verb and the idiomatic character weighed toward sentencehood in this case. M29 is a S of 3, a P, a S of 4.</Paragraph> <Paragraph position="15"> M30 is P. M31 is a S of 11, followed by a S of 6. The former is typical of complex sentences with strong semantic ties. Next M is P, next TI, next TQ, next TR, ADD, C and TRANS of 13.</Paragraph> <Paragraph position="16"> Next is E, and the last one a P followed by a S of 7.</Paragraph> <Paragraph position="17"> The working definitions for fragments and phatics are: TQ (Terse Question): An elliptical question usually containing no VP, but often having a NP, e.g., &quot;Why?&quot;, &quot;How about pyrotechnics?&quot; (&quot;How about NP?&quot; is quite common), &quot;Which ones?&quot;. TR (Terse Reply): An elliptical reply, also often just a NP, e.g., &quot;No.&quot;, &quot;Probably meters.&quot;, &quot;50 and 7.62.&quot;.</Paragraph> <Paragraph position="18"> TI (Terse Information): A rather elusive category, neither question, reply nor command, an elliptical statement but one often requiring an action. Examples can be appreciated in context only (Figure 1). It brings to mind Austin's How to Do Things with Words. 9 E (Echo): An exact or partial repetition of usually the other speaker's string. Often an NP, but it may be an elliptical structure of various forms. A distinction was made at an earlier time between echo, self-echo, and echo-question but was abandoned.
Only fragmentary echoes (rather than whole sentences, which were far less common) were included.</Paragraph> <Paragraph position="19"> ADD (Added Information): An elliptical structure, often NP, used to clarify or complete a previous utterance, often one's own, e.g., &quot;It doesn't say anything here about weight, or breaking things down. Except for the crushables.&quot;, &quot;It's smaller. 36&quot;X20&quot;XiT&quot;.&quot;. Spelling out words was included here.</Paragraph> <Paragraph position="20"> TRUN (Truncated): An incomplete utterance, voluntarily abandoned.</Paragraph> <Paragraph position="21"> INT (Interrupted): One involuntarily abandoned.</Paragraph> <Paragraph position="22"> These two are often hard to distinguish, but truncation is clear if the speaker abandons his utterance, e.g., &quot;Uh, some of these are ... I don't know what category they will go in.&quot;, and interruption is clear when one speaker jumps over the other's utterance which shows signs of intent at continuation, e.g., &quot;A: Maybe we should work on some of the bigger things. B: Yeah, I think that A: Let's try some of the bigger decks .... &quot;.</Paragraph> <Paragraph position="23"> FS (False Start): These are also abandoned utterances, but immediately followed by usually syntactically and semantically related ones, e.g., &quot;They may, they may be identical classes.&quot;, &quot;Well, the height, the next largest height I've got is 34.&quot;.</Paragraph> <Paragraph position="24"> COMP (Completion): Completion of the other speaker's utterance, distinguished from interruption by the cooperative nature of the utterance, e.g., &quot;A: I've got a lot of...I've got B: 2 pages. A: Yeah.&quot;.</Paragraph> <Paragraph position="25"> CORR (Correction): This may be done by either speaker.
If done by the same speaker it is related to false start, but semantic considerations suggest a correction, e.g., &quot;Those are 30, uh, 48 length by 40 width by 14 height.&quot;.</Paragraph> <Paragraph position="26"> SELF (Talking to Oneself): Fragments, sometimes mutterings, even to the point of undecipherability, not intended for the other person, but rather thinking aloud reminiscent of Piaget's &quot;collective monologue&quot;, 10 e.g., &quot;Ummm - 7 7 8 5 and 14 - 7 7 8 will certainly add up to 22 wouldn't it or I guess.&quot;.</Paragraph> <Paragraph position="27"> P (Phatics): The largest subgroup of fragments whose name is borrowed from Malinowski's 11 term &quot;phatic communion&quot; with which he referred to those vocal utterances that serve to establish social relations rather than the direct purpose of communication. This term has been broadened to include all fragments which help keep the channel of communication open, such as &quot;Well&quot;, &quot;Wait&quot;, but even &quot;You turkey&quot;. Two sub-categories of phatics are: C (Dialogue connectors): Words such as &quot;Then&quot;, &quot;And&quot;, &quot;Because&quot; (at the beginning of a message or utterance).</Paragraph> <Paragraph position="28"> T (Tag questions): e.g., &quot;They're all under 60, aren't they?&quot; In the discussion above, the words &quot;speaker&quot; and &quot;utterance&quot; were used; but since most of these fragments are found also in the terminal-to-terminal mode and some also in the computational mode, they apply also to typed interactions.</Paragraph> </Section> <Section position="3" start_page="193" end_page="196" type="metho"> <SectionTitle> 3. Statistical Analysis of the Three Modes </SectionTitle> <Paragraph position="0"> The analysis below is based on the 1979/80 experiments only since they all involve the same shiploading task. The results were scored in each case by at least two persons, and the computational mode protocols by five.
There are 8 face-to-face, 4 terminal-to-terminal and 21 human-to-computer protocols, involving 44 subjects. The time was one hour each for the first two modes, and an average of one and one half hours for the third. Since there were twice as many F-F protocols as T-T and almost twice as many H-C as the first two combined, statistical totals are not very important. They are given here however to lend strength to the final processed comparisons.</Paragraph> <Paragraph position="1"> The analysis of computational protocols clearly necessitated some different methodologies, and some data is simply not comparable (e.g., load sequences, since they were absent in F-F and T-T). The category &quot;message&quot; was split into &quot;parsed message&quot; and &quot;parsed and nonparsed message,&quot; the first comprising parsed inputs and the second all inputs. The fragments also consisted of parsed ones: terse question, terse reply and definitions, and nonparsed ones: false starts and phatics. The terms &quot;message&quot; and &quot;fragment&quot; for the values in H-C refer to parsed messages and parsed fragments. Unless indicated otherwise, &quot;fragments&quot; in general do not include phatics, connectors and tags. Load sequences were completely left out of analysis, and obviously no computer answers were analyzed.</Paragraph> <Paragraph position="2"> The statistics show some expected marked differences as to the number of words, messages, sentences, fragments and phatics. The face-to-face mode is not surprisingly much more verbose, and shows a much higher ratio of phatics. What is however far more interesting is that several statistics are close to each other: those for sentence length, message length, fragment length (excluding definitions in H-C, since they are absent in the other two), percentage of words in sentences, especially for F-F and T-T, percentage of words in fragments, again especially for F-F and T-T.
The latter two are of interest since in the H-C mode the percentage of words in sentences is higher and in fragments is lower, even though the system allows use of fragments.</Paragraph> <Paragraph position="3"> As for sentence length, Chafe 10 cites the &quot;idea unit&quot; in spoken language as having a mean length of about 6 words. These numbers bring to mind George Miller's 12 &quot;magical number 7&quot;. Also noticeable is a striking closeness between the average of messages in T-T and parsed and nonparsed inputs in H-C. The ratios of sentence/message are close for the 3 modes, and the ratios of fragment/message are close for F-F and T-T. Not surprisingly, the ratios of phatic/message are different, being particularly low for H-C.</Paragraph> <Paragraph position="4"> Fragments are of particular interest and therefore are analysed in further detail. Fragments are considered separate from phatics.</Paragraph> <Paragraph position="5"> Nonparsed fragments in H-C are included in this analysis. TRUN and INT are collapsed into TRUN.</Paragraph> <Paragraph position="6"> As Table 4 shows, TR is the predominant fragment in all three modes. (H-C mode characteristics are discussed in Section IV.) The next is ECHO for F-F, TI for T-T and TQ for H-C, and TQ is rather high in all three modes. These may seem to have little in common, but they are all typically NPs. The percentages for FS are close in all three modes, particularly so in F-F and H-C.</Paragraph> <Paragraph position="7"> The absence of some categories in some modes is equally interesting, even though totally understandable in some cases. The low presence of CORR in F-F and its absence in T-T is surprising, but may be partly due to some overlap of this category with FS. The absence of SELF and TAG in T-T and H-C is understandable, as is the absence of DEF (definitions) in F-F and T-T. It should be noted that in T-T the category did occur in a way.
The subjects used a good deal of abbreviation in spelling (a common type of DEF is abbreviation) and also conventions, which every pair invented for end of message signal.</Paragraph> <Paragraph position="8"> ECHO and COMP in H-C would be rather silly -- who would echo or complete the computer? But the absence of ADD, CORR, TI and CON is due to the restraints of the grammar. Their role and desirability in H-C is further discussed in Section V.</Paragraph> <Section position="1" start_page="194" end_page="194" type="sub_section"> <SectionTitle> Dialogue Connectors Tag Questions </SectionTitle> <Paragraph position="0"> Phatics deserve a separate detailed discussion on account of their varied semantic functions but it is beyond the bounds of this paper.</Paragraph> <Paragraph position="1"> By far the most common phatic is Okay. It is interesting that speakers do not seem to be aware of this. When I asked my class in psycholinguistics (over 15 students) which phatic they thought most frequent, a variety of answers was given, but none came up with Okay. Table 5 shows the percentages of the top 5 phatics. In H-C several phatics occurred, but only 3 &quot;Okay&quot;s and one &quot;Oh well&quot; of the top five. They are illustrated below and discussed in Sections IV and V. Table 5 also gives percentages for the top five dialogue connectors. There are none in goddammit, bleah, oops, forget it, you're kidding, fool, yuk, you nitwit, what a pity, just a sec.</Paragraph> <Paragraph position="2"> From T-T: bleep, more to come, ook, ook to you, congrtltns, cmt => grt idea, stand by, you turkey (&quot;look&quot; occurred in 3 protocols, which is quite interesting considering the mode).</Paragraph> <Paragraph position="3"> From H-C: yes, I know how you feel, no, are you a computer?, of course, ?, foo to you, what is your problem?, there must be a better way, bla...bla, why don't you understand my question?
help, where are we machine?, you lie, good, thank you.</Paragraph> <Paragraph position="4"> IV. The Human-to-Computer Mode:</Paragraph> </Section> <Section position="2" start_page="194" end_page="196" type="sub_section"> <SectionTitle> Special Characteristics </SectionTitle> <Paragraph position="0"> 1. Performance of the System The system performance was such that meaningful work could be accomplished by largely uninitiated subjects with a bare minimum of assistance. Response to inputs which were not understood was extremely fast, the incidence of bugs was low (out of 1615 messages, 12 hit bugs) and recovery from them was excellent. Response times were quite adequate, especially since many requests involved quite a bit of computation.</Paragraph> <Paragraph position="1"> The subjects never showed impatience or boredom, but apparently used the latency time (from input to response) to formulate the next request.</Paragraph> <Paragraph position="2"> 2. The Influence of the Specific Task The special task at hand and the special character of a problem solving situation both have an influence on the performance of the subjects. The &quot;prompt sequence&quot; for loading the ship provided in the language was used by all subjects even though they could have accomplished the same thing by natural dialogue (the magical number &quot;7&quot; shows up again here in the average of 7.6 loading sequences per protocol).</Paragraph> <Paragraph position="3"> The percentage of items loaded is lower than in F-F, but this is due to the considerably longer initial orientation period in H-C (from 1/2 to 1 hour), after which the rate of loading increases. About 50% of items were loaded in F-F in one hour, so the task is completable in about two hours. About 20% of the items were loaded in H-C, but considering that the rate of loading increased in the last half hour of the sessions, the task was also doable in about 2 hours. 
The solution of the problem was not, however, of interest in these experiments. The influence of the problem solving situation was very evident, particularly on syntax. The question (request) -- response interchanges are dominant in all modes. The rather short sentences used are also attributable to this. Fragments are useful for increasing the flow of information.</Paragraph> <Paragraph position="4"> Phatics facilitate interaction.</Paragraph> </Section> </Section> <Section position="4" start_page="196" end_page="196" type="metho"> <SectionTitle> 3. Syntax </SectionTitle> <Paragraph position="0"> The types of sentences used are of particular interest here, so detailed analysis was made with respect to sentence structure and type.</Paragraph> <Paragraph position="1"> The results are summarized in Table 6.</Paragraph> <Paragraph position="2"> &quot;List the class of each cargo.&quot;: 71 (8.0%). Sentences with conjunctions, e.g., &quot;What is the maximum stow height and bale cube of the pyrotechnic locker of the AL?&quot;: 88 (10.0%). Sentences with quantifier and conjunction(s), e.g., &quot;List hatch width and hatch length of each deck of the Alamo.&quot;: 23 (2.6%). Sentences with relative clause, e.g., &quot;List the ships that have water.&quot;: 6 (0.7%). Sentences with relative clause (or related construction) and comparator, e.g., &quot;List the ships with beam less than 1000.&quot;: 6 (0.7%). Sentences with quantifier and relative clause, e.g., &quot;List height of each content whose class is class IV.&quot;: 2 (0.23%). Sentences with quantifier, conjunction and relative clause, e.g., &quot;List length, width and height of each content whose class is ammunition.&quot;: 2 (0.23%). Sentences with quantifiers and comparator, e.g., &quot;How many ships have a beam greater than 1000?&quot;: 3 (0.34%). The dominance of simple sentences is striking. The reason is certainly not the lack of availability of complex sentences. I think that several reasons account for this. 
The problem solving situation influences the subjects to work in a simple manner, often employing what I have termed success strategy, i.e., repetition of the same type of requests. Another reason is definitions. Once the subject has introduced a definition, whose right hand side is often complex, involving conjunctions, relative clauses, even quantifiers, it is used in subsequent requests, which are therefore short and simple. Another reason may be simply the computer. As Robinson 13 and Grosz 14 noted, subjects tend to be more formal in conversation with the computer.</Paragraph> <Paragraph position="3"> Sentences were also analysed as to their type, since it was noticed that a great number of them were of the WH-type and contained be-verbs, e.g., &quot;What are ships?&quot;. The results confirmed the observation: 75% were WH-type questions.</Paragraph> <Paragraph position="4"> Only 1% were Yes-No type questions, e.g., &quot;Is Alamo a ship?&quot;, &quot;Is there a deck whose primary use is ammunition and whose length is 396?&quot;. Commands, most commonly starting with &quot;List&quot;, accounted for 19% of sentences, and a special category of statements, data addition, for the remaining 5%. These results are very interesting but I hesitate to offer an explanation. In the analysis of two F-F protocols consisting of 15500 words it was found that a be-verb occurred once every two sentences. Since be-verbs are so common also in F-F, this may be a general feature either of English or of the type of conversations in such problem solving tasks.</Paragraph> <Paragraph position="5"> Concerning the occurrence of other verbs, few sentences contained HAVE-verbs. No other verbs were part of the version of the grammar available to the subjects. Verbs could have been introduced by definition, but nobody did so. Possessives and sentences with &quot;there&quot; were observed, but surprisingly few in view of the availability of these structures in the grammar. 
The use of the article &quot;the&quot; was erratic. The investigation of the F-F sample also showed few relative pronouns; &quot;that&quot; was the most common -- one in every 19 sentences. Conjunctions were fairly frequent -- one in every 8 sentences, &quot;and&quot; being the dominant one; likewise quantifiers -- one in every 10 sentences. This coincides well with the sentence analysis for H-C where sentences with conjunctions or quantifiers are the highest in percentage among the complex ones.</Paragraph> <Paragraph position="6"> On the whole, one is forced to conclude that monotony of structure is the rule rather than the exception in H-C.</Paragraph> </Section> <Section position="5" start_page="196" end_page="197" type="metho"> <SectionTitle> 4. Definitions, Fragments and Phatics </SectionTitle> <Paragraph position="0"> The REL System allows the user to avail himself of a great variety of definitions 6 which, however, is not too well reflected in the protocols, due to the subjects' lack of familiarity with the system. One subject whom I observed as having familiarized himself with the system made extensive use of definitions. It should be added that, beyond those which were actually used, 30 more definitions were attempted but contained errors. Some definitions had been built in by the language designer, notably &quot;remaining area&quot; and &quot;adjusted remaining area.&quot; These were frequently employed.</Paragraph> <Paragraph position="1"> I have made a rough categorization of the definitions according to their complexity. Abbreviations are the simplest, e.g., &quot;def:DKS:decks of the USS Alamo&quot;. But even abbreviations can be sophisticated and therefore more useful, like the following one with a quantifier: &quot;def:ED:each deck of the Alamo.&quot; Abbreviations accounted for 34% of the total of 53 definitions. 
Synonyms were more complex: &quot;def:INF1:aft width and forward width and minimum clearance,&quot; &quot;def:INF2:INF1 and square foot capacity,&quot; &quot;def:&quot;well deck&quot; info:INF2 of the &quot;well deck&quot; of the Alamo.&quot; Synonyms accounted for half (51%) of the definitions. Of the remainder, 9% involved arithmetical operations, e.g., &quot;def:size:(length*width)/144&quot;, &quot;def:g(&quot;8&quot;,&quot;9&quot;):&quot;8&quot;*&quot;8&quot;+&quot;9&quot;*&quot;9&quot;. A few definitions had to do with adding new data.</Paragraph> <Paragraph position="2"> Other than definitions, fragments were of two types: parsed, which were Terse Questions and Terse Replies, and nonparsed, which were False Starts and Phatics. TQs were noun phrases which are parsed into sentences if followed by a question mark, e.g., &quot;Class of culvert?&quot;, &quot;12*(SQ of MEZ)/(450/12)?&quot; There are 67 of those. TRs were single words or numbers arising from the particular feature provided by the system to deal with long answers. It reads, e.g., &quot;There are 203 lines in this answer. How many do you want? Respond with &quot;all&quot;, &quot;none&quot; or a number.&quot; It was considered important to include them, since failure to respond resulted in an error message, and also to see to what extent that feature is useful; it is, since there were 91 TRs. No distinction was made between False Start and Truncated; in all cases, these 30 occurrences were messages abandoned by the subject for reasons that are seldom identifiable.</Paragraph> <Paragraph position="3"> A typing error may have been noticed or a thought changed, e.g., H: &quot;What are the decks and primary uss&quot; C: &quot;Input Error&quot; H: &quot;what are the primary uses of each deck of the Alamo?&quot; What is surprising about fragments is the paucity of TQs. They are handled by the system very well and are certainly shorter to type. I 
think that the reasons again are lack of familiarity with the system and a more formal style on the part of the subject. But it is also possible that such elliptical structures are somehow more difficult to use, which would confirm transformational theory, but would pose an uncomfortable question as to the desirability (widely assumed) of ellipsis in computational interaction. Phatics are very peculiar in these H-C protocols. What is striking is the anthropomorphisation of the computer. This may be due to the background of the subjects, Caltech. They clearly also serve the function of venting one's emotions, and that may be useful. They are illustrated in Section III and number 46.</Paragraph> </Section> class="xml-element"></Paper>