File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/80/p80-1043_abstr.xml

Size: 12,376 bytes

Last Modified: 2025-10-06 13:45:55

<?xml version="1.0" standalone="yes"?>
<Paper uid="P80-1043">
  <Title>Real Reading Behavior</Title>
  <Section position="1" start_page="0" end_page="159" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> The most obvious observable activities that accompany reading are the eye fixations on various parts of the text.</Paragraph>
    <Paragraph position="1"> Our laboratory has now developed the technology for automatically measuring and recording the sequence and duration of eye fixations that readers make in a fairly natural reading situation. This paper reports on research in progress to use our observations of this real reading behavior to construct computational models of the cognitive processes involved in natural reading.</Paragraph>
    <Paragraph position="2"> In the first part of this paper we consider some constraints placed on models of human language comprehension imposed by the eye fixation data. In the second part we propose a particular model whose processing time on each word of the text is proportional to human readers' fixation durations.t</Paragraph>
    <Section position="1" start_page="0" end_page="159" type="sub_section">
      <SectionTitle>
Some Observations
</SectionTitle>
      <Paragraph position="0"> The reason that eye fixation data provide a rich base for a theoretical model of language processing is that readers' pauses on various words of a text are distinctly non-uniform.</Paragraph>
      <Paragraph position="1"> Some words are looked at very briefly, while others are gazed at for one or two seconds. The longer pauses are associated with a need for more computation \[2\]. The span of apprehension is relatively small, so that at a normal reading distance a reader cannot extract the meaning of words that are in peripheral vision \[6\]. This means that a person can read only what he looks at, and for scientific texts read normally by college students, this involves looking at almost every word. Furthermore, the longer pauses can occur immediately on the word that triggers the additional computation \[4\]. Thus it is possible to infer the degree of computational load at each point in the text.</Paragraph>
      <Paragraph position="2"> The starting point for the computer model was the analysis of the eye fixations of 14 Carnegie-Mellon undergraduates reading 15 passages (each about 140 words long) taken from the science and technology sections of Newsweek and Time magazines (see the Appendix for a sample passage).</Paragraph>
      <Paragraph position="3"> The mean fixation duration on each word (or on larger, clause-like sectors) of the text were analyzed in a multiple regression analysis in which the independent variables were the structural prcperties of the texts that were believed to affect the fixation durations. The results showed that fixation durations were influenced by several levels of processing, such as the word level (longer, less frequent  P. Sloan Foundation. the National Institute of Education (G-79-0119) and the National institute of Mental Health (MH-29617) words take longer to encode and lexically access), and the text level (more important parts of the text, like topics or definitions take longer to process than less important parts). This analysis generated a verbal description of a model of the reading process that is consistent with the observed fixation durations. The details of the data, analysis, and model are reported elsewhere \[5\].</Paragraph>
      <Paragraph position="4"> Some of the most intriguing aspects of the eye-fixation data concern trends that we have failed to find. Trends within noun phrases and verb phrases seem notable by their absence. Most approaches to sentence comprehension suggest that when the head noun of a noun phrase is reached, a great deal of processing is necessary to aggregate the meanings of the various modifiers. But this is not the case. While determiners and some prepositions are looked at more briefly, adjectives, noun-classifiers, and head nouns receive approximately the same gaze durations.</Paragraph>
      <Paragraph position="5"> (These results assume that word length effects on gaze duration have been covaried out). Verb phrases, with the exception of modals, show a similar flat distribution. It is also notable that verbs are not gazed at longer than nouns, as might be expected. Such results pose an interesting problem for a system which not only recognizes words, but also provides for their interpretation.</Paragraph>
      <Paragraph position="6"> Anotl&amp;quot;ler interesting result is the failure to find any associations with length of sentences (a rough measure of their complexity) or ordinal word position within sentences (a rough measure of amount of processing). That is to say, whether or not word function, character-length or syllables, etc., are controlled, there are no systematic trends associated with ordinal word position or sentence length.</Paragraph>
      <Paragraph position="7"> There is an added gaze duration associated with punctuation marks. Periods add about 73 milliseconds, and other punctuation (including commas, quotes, etc.) add about 43 milliseconds each above what can be accounted for by character-length or other covariates.</Paragraph>
      <Paragraph position="8"> The Framework The strategy for making sense of these and other similar observations is to develop a computational framework in which they can be understood. That framework must be capable of performing such diverse functions as word recognition, semantic and syntactic analysis, and text analysis. Furthermore, it must permit the ready interaction among processes implied by these functions. The framework we have implemented to accomplish these ambitious goals is a production system fashioned closely after Anderson's ACT system \[1\]. Such a production system is composed of three parts, a collection of productions comprising knowledge about how to carry out processes, a declarative knowledge base against which those processes are carried out, and an interpreter which provides for the actual behavior of the productions.</Paragraph>
      <Paragraph position="9">  A production written for such a system is a condition-action pair, conceptually an 'if-then' concept, where the condition is assessed against a dynamically changing declarative know~edge base. If a condition is assessed as true (or matcheLl), the action of the production is taken to alter the knowJedge base. Altering the knowledge base leads to further potential for a match, so the production system will naturally cycle from match to match until no further productions can be matched. The sense in which processing is C/otemporaneous is that all productions in memory are assessed for a match of their conditions before an action is taken, and then all productions whose.</Paragraph>
      <Paragraph position="10"> conditions succeed take action before the match proceeds again. This cycling, behavior provides a reference in establishing the basic synchrony of the system. The mapping from the behavior of the model to observed word gaze durations is on the basis of the number of match (or so-called recognition.act) cycles which the model requires to process each word.</Paragraph>
      <Paragraph position="11"> The physical implementation of the model is equipped at present to handle a dependency analysis of sentences of the sort of complexity we find in our texts (see the Appendix). There is nothing new to this analysis, and so it is not presented here. The implementation also exihibits some elementary word recognition, in that, for a few words, it contains productions recognizing letter configurations and shape parameters. The experience is, however, that the conventions which we have introduced provide a thoroughly 'debugged' initial framework. It is to the details of that framework that we now turn.</Paragraph>
      <Paragraph position="12"> Much of our initial effort in formulating such a parallel processing system has been concerned with making each processing cycle as efficient as possible with respect to the processing demands involved in reading to comprehend. To do this we allow that any number of productions can fire on e single cycle, each production contributing to the search for an interpretation of what is seen. Thus, for instance, the system may be actively working on a variety of processing tasks, and some may reach conclusion before others. The importance of concurrent processing is precisely that the reader may develop htPotheses in actively pursuing one processing avenue (such as syntax), and these hypotheses may influence other decisions (such as semantics) even before the former hypotheses are decided. Furthermore, hypotheses may be developed as expectations about words not yet seen, and these too should affect how those words are in fact seen. In effect, much of our initial effort has been in formulating how processes can interact in a collaborative effort to provide an interpretation.</Paragraph>
      <Paragraph position="13"> Collaboration in single recognition-act cycles is possible with carefully thought out conventions about the representation of knowledge in the knowledge base. As in ACT, every knowledge base element in our model is assigned a real.number activation level, which in the present system is regard d as a confidence value of sorts. Unlike ACT, the activation levels in our model are permitted to be positive or negative in sign, with the interpretation that a negative sign indicates the element is believed to be untrue. Coupled with this property of knowledge base elements are threshold properties associated with elements in the condition side of the productions. A threshold may be positive or negative, indicating a query about whether something is true or false with some confidence. As the system is used, there is a conventional threshold value above which knowledge is susceptible to being evaluated for inconsistency or contradiction, and below which knowledge is treated as hypothetical, in the examples below, this conventional threshold value is assumed. The condition elements can also include absence tests, so the system is capable of responding on the basis of the absence of an element at a desired confidence. Productions can also pick out knowledge that is only hypothetical using this device.</Paragraph>
      <Paragraph position="14"> But more importantly confidence in a result represents a manner in which productions can collaborate.</Paragraph>
      <Paragraph position="15"> The confidence values on knowledge base elements are manipulated using a special action called &lt;SPEW&gt;.</Paragraph>
      <Paragraph position="16"> Basically, this action takes the confidence in one knowledge-base element and adds a linearly weighted function of that confidence to other knowledge.base elements, If any such knowledge-base element is not, in fact, in the knowledge base, it will be added. The elements themselves can be regarded as propositions in a propositional network. Thus, one can view the function of productions as maintaining and constructing coherent fields of propositions about the text.</Paragraph>
      <Paragraph position="17"> Network representations of knowledge provide a natural indexing scheme, but to be practical on a computer such an indexing scheme needs augmentation. The indexing scheme must do several things at once. It must discriminate among the same objects used in different contexts, and it must also help resolve the difficult problem of two or more productions trying to build, or comment upon, the same knowledge structure concurrently. To give something of the flavor of the indexing scheme we have chosen: where other natural language understanding systems may create a token JOHN24 for a type JOHN, the number 24 in the present system does not simply distinquish this 'John' from others, it also places him within a dimensional space. In the exarnpies to follow the token numbers are generated for the sequential gazes, 1 for the first and so on. An obvious use of such a scheme is that several productions may establish expectations regarding the next word. If some subset of the productions establish the same expectation, then without matching they will create the properly distinguished tokens for that expectation.</Paragraph>
      <Paragraph position="18"> Consider one production written for this system:</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>
Download Original XML