<?xml version="1.0" standalone="yes"?> <Paper uid="W04-0309"> <Title>The information-processing difficulty of incremental parsing</Title> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 The Accessibility Hierarchy </SectionTitle> <Paragraph position="0"> This paper examines the processing predictions of the ERH on a systematic class of relative clause types, the Accessibility Hierarchy (AH) shown in figure 1. The AH is an implicational markedness hierarchy of grammatical relations discovered by Keenan and Comrie (1977). The implication is that if a language has a relative-clause formation rule applicable to grammatical relations at some point x on the AH, then it can also form relative clauses on grammatical relations listed at all points before x.</Paragraph> <Paragraph position="1"> This hierarchy shows up in a variety of modern syntactic theories that have been influenced by Relational Grammar (Perlmutter and Postal, 1974).</Paragraph> <Paragraph position="2"> In Head-driven Phrase Structure Grammar (Pollard and Sag, 1994) the hierarchy corresponds to the order of elements on the SUBCAT list, and interacts with other principles in explanations of binding facts. The hierarchy also figures in Lexical-Functional Grammar (Bresnan, 1982), where it is known as Syntactic Rank.</Paragraph> <Paragraph position="3"> Keenan and Comrie speculated that their typological generalization might have a basis in performance factors. This idea was examined in a repetition-accuracy experiment carried out in 1974 but not published until 1987 (Keenan and Hawkins, 1987). Subjects in this study repeated back stimulus sentences after a delay while under the additional memory load of a digit-memory task. 
Stimuli were subject-modifying relative clauses embedded in one of four carrier sentence frames, exemplified in figure 2.</Paragraph> <Paragraph position="4"> [Figure 2: sentence types.] Subject extracted: they had forgotten that the boy who told the story was so young. Direct object extracted: the fact that the cat which David showed to the man likes eggs is strange. Indirect object extracted: I know that the man who Stephen explained the accident to is kind. Oblique extracted: he remembered that the food which Chris paid the bill for was cheap. Genitive subject extracted: they had forgotten that the girl whose friend bought the cake was waiting. Genitive object extracted: the fact that the sailor whose ship Jim took had one leg is important. The results of the human study, given in figure 3, show that repetition accuracy[1] declines across the AH. Keenan and Hawkins (1987) note however that &quot;It remains unexplained just why RCs should be more difficult to comprehend-produce as they are formed on positions lower on the AH.&quot; The ERH, if correct, would offer just such an explanation. If a person's difficulty on each word of a sentence is related to derivational information signaled by that word, then the total difficulty reading a sentence ought to be the sum of the difficulty on each word[2].</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 Minimalist Grammars </SectionTitle> <Paragraph position="0"> If correct, the ERH would explain the increasing difficulty across the AH in terms of greater or lesser uncertainty about intermediate parser states. 
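As a rough illustration of how per-word entropy reductions could be summed into a sentence-level score, the following Python sketch assumes that the entropy of the parser state after each word is already known. The function names and the entropy values are hypothetical and are not taken from the paper; clipping negative differences at zero follows the usual statement of the ERH, in which an increase in uncertainty contributes no reduction.

```python
def entropy_reductions(entropies):
    """Per-word entropy reductions from a list of parser-state entropies.

    entropies[0] is the uncertainty before any word is read;
    entropies[i] is the uncertainty after word i (hypothetical values).
    """
    reductions = []
    for before, after in zip(entropies, entropies[1:]):
        # Only a decrease in uncertainty counts; increases are clipped to 0.
        reductions.append(max(0.0, before - after))
    return reductions


def sentence_difficulty(entropies):
    """Sentence-level prediction: the sum of the per-word reductions."""
    return sum(entropy_reductions(entropies))


# Made-up entropies (in bits) for a three-word prefix.
example = [4.0, 2.5, 3.0, 1.0]
print(entropy_reductions(example))
print(sentence_difficulty(example))
```

In this toy run the second word raises the entropy, so it contributes nothing to the total; only the first and third words add to the predicted difficulty.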
To calculate these predictions, some assumption must be made about what those structures are.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.1 Two analyses of relativization </SectionTitle> <Paragraph position="0"> Toward this end, two grammars covering the Keenan and Hawkins stimuli were written in the Minimalist Grammars (Stabler, 1997) formalism.</Paragraph> <Paragraph position="1"> These grammars were exactly the same except for their treatment of relative clauses.</Paragraph> <Paragraph position="2"> One grammar expresses the usual analysis of relative clauses as right-adjoined modifiers (Chomsky, 1977). The other expresses the promotion analysis of relative clauses. This analysis, which dates back to the 1960s, is revived in Kayne (1994). For reasons having to do with Kayne's general theory of phrase structure, he proposes that, in a sentence like (1), the underlying form of the subject is akin to (2).</Paragraph> <Paragraph position="3"> [2] ...ity metric ERH to the sentence level. In word-by-word self-paced reading, evidence for the Accessibility Hierarchy is limited (cf. chapter 5 of Hale (2003)).</Paragraph> <Paragraph position="4"> (1) the boy who the father explained the answer to was honest (2) [IP the father explained the answer to [DP[+wh] who boy[+f] ] ] According to Kayne, at an early stage (2) of syntactic derivation, the determiner phrase (DP) &quot;who boy&quot; occupies what will eventually be the gap position. 
This DP moves to a specifier position of the enclosing, empty-headed (C^0) complementizer phrase (CP), thereby checking a feature +wh as indicated in (3).</Paragraph> <Paragraph position="5"> (3) [CP [DP who boy[+f] ]_i C^0 [IP the father explained the answer to t_i ] ] In a second movement, &quot;boy&quot; evacuates from DP, moving to another specifier (perhaps that of the silent agreement morpheme, Agr) as in (4), checking a different feature, +f.</Paragraph> <Paragraph position="6"> (4) [AgrP boy_j Agr [CP [DP who t_j ]_i C^0 [IP the father explained the answer to t_i ] ] ] The entire structure becomes a complement of a determiner to yield a larger DP in (5).</Paragraph> <Paragraph position="7"> (5) [DP the [AgrP boy_j Agr [CP [DP who t_j ]_i C^0 [IP the father explained the answer to t_i ] ] ] ] No adjunction is used in this derivation, and, unconventionally, the leftmost &quot;the&quot; and &quot;boy&quot; do not share an exclusive common constituent. Nor is the wh-word &quot;who&quot; co-indexed with anything. Structural descriptions involving both the Kaynian analysis and the more standard adjunction analysis are shown in figures 4 and 5, respectively[3]. The other linguistic assumptions suggested by these diagrams are discussed in chapter 4 of Hale (2003).</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.2 Formal grammars of relativization </SectionTitle> <Paragraph position="0"> The Minimalist Grammars (MG) formalism (cf.</Paragraph> <Paragraph position="1"> Stabler and Keenan (2003) for a systematic presentation) facilitates the relatively transparent implementation of ideas like movement and feature checking that figure prominently in the two analyses of relativization discussed in the previous subsection. MGs define a set of sentences by closing the structure-building functions merge and move on a finite set of lexical entries; however, this does not mean that parsing must happen bottom-up. 
A fundamental result, obtained independently by Harkema (2001) and Michaelis (2001), is that MGs are equivalent to multiple context-free grammars (Seki et al., 1991). Multiple context-free grammars generalize standard context-free grammars by allowing the string yields of daughter categories to be manipulated by a function other than simple concatenation. As in Tree Adjoining Grammar (Joshi et al., 1975) a record of these manipulations is kept at each node of an MG derivation tree, while a picture of the result is manifested in derived trees such as the ones in figures 4 and 5.</Paragraph> <Paragraph position="2"> [3] The X-bar structures depicted in figures 4 and 5 are drawn using tools developed by Edward Stabler and colleagues.</Paragraph> <Paragraph position="3"> The derivation tree on the promotion grammar is shown[4] in figure 6 for the substring &quot;the boy who the father explained the answer to.&quot; The derivation trees encode everything there is to know about MG derivations, and can be parsed in a variety of orders. Most importantly, if equipped with weights on their branches, they can be gener-</Paragraph> <Paragraph position="5"/> </Section> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 5 Procedure </SectionTitle> <Paragraph position="0"> Derivation trees on both grammars were obtained[5] for each of Keenan and Hawkins' (1987) twenty-four stimulus sentences[6]. Branches of these derivation trees were viewed as PCFG rules with probabilities set according to the usual relative-frequency estimation technique (Chi, 1999). 
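The relative-frequency estimate mentioned here can be sketched in a few lines of Python. The sketch assumes that each derivation-tree branch has already been read off as a (parent, children, weight) triple; the category names, the weight field and the toy counts are hypothetical and do not come from the paper's grammars.

```python
from collections import defaultdict


def relative_frequency(rule_tokens):
    """Estimate PCFG rule probabilities by relative frequency.

    rule_tokens is an iterable of (parent, children, weight) triples,
    one per derivation-tree branch; weight lets a sentence count more
    or less than once. All names here are illustrative assumptions.
    """
    rule_count = defaultdict(float)
    parent_count = defaultdict(float)
    for parent, children, weight in rule_tokens:
        rule_count[(parent, children)] += weight
        parent_count[parent] += weight
    # P(rule) = weighted count of the rule / weighted count of its parent
    return {rule: n / parent_count[rule[0]] for rule, n in rule_count.items()}


# Toy data: two expansions of a hypothetical category "DP".
toy = [
    ("DP", ("D", "NP"), 3.0),
    ("DP", ("DP", "CP"), 1.0),
    ("NP", ("N",), 4.0),
]
probs = relative_frequency(toy)
print(probs[("DP", ("D", "NP"))])
```

Because the probabilities for each parent category are normalized separately, the expansions of any one category sum to one, which is the property a PCFG requires.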
However, because the stimuli were intentionally constructed to have exactly four examples of each structure, these sentences were weighted in accordance with a corpus study (Keenan, 1975) to make their relative frequencies more realistic.</Paragraph> <Paragraph position="1"> [footnote] ...uncertainty, the results were calculated using a modified stimulus set in which four noun phrases were changed from plural to singular.</Paragraph> </Section> <Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> 6 Results </SectionTitle> <Paragraph position="0"> The summed entropy reductions exhibit a significant correlation with the repetition accuracy scores collected by Keenan and Hawkins (1987).</Paragraph> <Paragraph position="1"> The correlation in figure 7(a) obtains only on the grammar expressing the Kaynian promotion analysis, and not on the grammar expressing the standard adjunction analysis (figure 7(b)). Nor do log-probabilities for stimulus sentences on the grammar</Paragraph> </Section> <Section position="8" start_page="0" end_page="0" type="metho"> <SectionTitle> 7 Discussion </SectionTitle> <Paragraph position="0"> From the perspective of the ERH, the difference between the promotion and adjunction grammars resides in the uncertainty of particular states an incremental parser would pass through on the way to a complete analysis.</Paragraph> <Paragraph position="1"> On Keenan and Hawkins' (1987) stimuli, these grammars specify incremental parser states that support explanations for some of the observed repetition accuracy asymmetries, abbreviated &lt;.</Paragraph> <Paragraph position="2"> SU &lt; IO: subject extracted relatives are easier than indirect object extracted relatives, because a left-to-right incremental parser evades, in just subject extracted relatives, the uncertainty associated with questions like (i) which internal argument is the gap? (ii) did dative shift happen? 
These questions are defined by alternative derivation-subtrees associated with the verb phrase. For the DO stimuli that use potentially ditransitive embedded verbs the same explanation is available; however, only two out of four items in the Keenan and Hawkins (1987) set qualify.</Paragraph> <Paragraph position="3"> IO &lt; OBL: there is only one type of extraction from indirect object, whereas on these grammars, the head of the oblique phrase (&quot;for&quot;, &quot;with&quot;, &quot;on&quot; or &quot;in&quot;) signals which of four categorically separate kinds of extraction has occurred. These alternatives correspond to four different derivation-nonterminals.</Paragraph> <Paragraph position="4"> OBL &lt; GEN: both grammars analyze &quot;whose&quot; as taking a common noun argument, for example &quot;whose ship.&quot; But in just the promotion grammar, &quot;whose&quot; is further analyzed as the ordinary &quot;who&quot; morpheme plus a complex possessive phrase headed by &quot;-s&quot; (McDaniel et al., 1998). Because of the recursive character of this possessor category, the structure of the common noun argument of &quot;whose&quot; introduces additional uncertainty not present in the indirect object extracted relatives.</Paragraph> <Paragraph position="5"> Strikingly, the two grammars disagree on six outliers in figure 7(b) where just the adjunction grammar predicts very great difficulty in conjunction with the ERH. These outlier predictions are made on just the sentences that use the nominal carrier frame beginning with &quot;the fact that...&quot; Because the adjunction grammar analyzes relative clauses with an MG rule analogous to the phrase structure rule (4),</Paragraph> <Paragraph position="7"> all DPs are available for modification by any number of stacked relative clauses. 
The nominal frame introduces an additional DP, not present in the other stimuli, that can be modified in this way.</Paragraph> <Paragraph position="8"> By contrast, the promotion grammar does not include a +f promotion feature on any lexical entry for &quot;fact,&quot; precluding the possibility of such modification. Moreover, even with such a feature, the promotion grammar assigns different categories to the outermost versus successive relative clause modifiers. Because only one relative clause is ever stacked in the Keenan and Hawkins (1987) stimulus set, the relevant recursion is not attested, yielding a category of caseless subject DP that is more certain than it is in the adjunction grammar.</Paragraph> <Paragraph position="9"> An ERH account that avoids predicting these outliers on the Keenan and Hawkins (1987) stimuli seems to require a grammar where the probability of second and subsequent stacked relative clause modifiers is closer to 0 (its value on the trained promotion grammar) than to 0.31 (its value on the trained adjunction grammar). Beyond these particular stimuli, this modeling motivates a general question about the scale of structural expectations in human sentence processing. Does disconfirmation of a more complicated structural alternative (such as stacked relative clauses) induce greater processing difficulty than disconfirmation of a simpler one? Such empirical issues go beyond the scope of this paper but suggest particular kinds of future work.</Paragraph> </Section> </Paper>