<?xml version="1.0" standalone="yes"?>
<Paper uid="P83-1017">
  <Title>Sentence Disambiguation by a Shift-Reduce Parsing Technique*</Title>
  <Section position="2" start_page="0" end_page="113" type="metho">
    <SectionTitle>
1. Introduction
</SectionTitle>
    <Paragraph position="0"> For natural language processing systems to be useful, they must assign the same interpretation to a given sentence that a native speaker would, since that is precisely the behavior users will expect. Consider, for example, the case of ambiguous sentences. Native speakers of English show definite and consistent preferences for certain readings of syntactically ambiguous sentences [Kimball, 1973, Frazier and Fodor, 1978, Ford et al., 1982].</Paragraph>
    <Paragraph position="1"> A user of a natural-language-processing system would naturally expect it to reflect the same preferences. Thus, such systems must model in some way the linguistic performance as well as the linguistic competence of the native speaker.</Paragraph>
    <Paragraph position="2"> This idea is certainly not new in the artificial-intelligence literature. The pioneering work of Marcus [Marcus, 1980] is perhaps the best-known example of linguistic-performance modeling in AI. Starting from the hypothesis that "deterministic" parsing of English is possible, he demonstrated that certain performance constraints, e.g., the difficulty of parsing garden-path sentences, could be modeled. His claim about deterministic parsing was quite strong. Not only was the behavior of the parser required to be deterministic, but, as Marcus claimed, "The interpreter cannot use some general rule to take a nondeterministic grammar specification and impose arbitrary constraints to convert it to a deterministic specification (unless, of course, there is a general rule which will always lead to the correct decision in such a case)." [Marcus, 1980, p.14] We have developed and implemented a parsing system that, given a nondeterministic grammar, forces disambiguation in just the manner Marcus rejected (i.e., through general rules); it thereby exhibits the same preference behavior that psycholinguists have attributed to native speakers of English for a certain range of ambiguities. These include structural ambiguities [Frazier and Fodor, 1978, Frazier and Fodor, 1980, Wanner, 1980] and lexical preferences [Ford et al., 1982], as well as the garden-path sentences as a side effect. The parsing system is based on the shift-reduce scheduling technique of Pereira [forthcoming].</Paragraph>
    <Paragraph position="3"> *This research was supported by the Defense Advanced Research Projects Agency under Contract N00039-80-C-0575 with the Naval Electronic Systems Command. The views and conclusions contained in this document are those of the author and should not be interpreted as representative of the official policies, either expressed or implied, of the Defense Advanced Research Projects Agency or the United States government.</Paragraph>
    <Paragraph position="4"> Our parsing algorithm is a slight variant of LALR(1) parsing, and, as such, exhibits the three conditions postulated by Marcus for a deterministic mechanism: it is data-driven, reflects expectations, and has look-ahead. Like Marcus's parser, our parsing system is deterministic. Unlike Marcus's parser, the grammars used by our parser can be ambiguous.</Paragraph>
    <Paragraph position="5"> 2. The Phenomena to be Modeled The parsing system was designed to manifest preferences among structurally distinct parses of ambiguous sentences. It does this by building just one parse tree--rather than building multiple parse trees and choosing among them. Like the Marcus parsing system, ours does not do disambiguation requiring "extensive semantic processing," but, in contrast to Marcus, it does handle such phenomena as PP-attachment insofar as there exist a priori preferences for one attachment over another.</Paragraph>
    <Paragraph position="6"> By a priori we mean preferences that are exhibited in contexts where pragmatic or plausibility considerations do not tend to favor one reading over the other. Rather than make such value judgments ourselves, we defer to the psycholinguistic literature (specifically [Frazier and Fodor, 1978], [Frazier and Fodor, 1980] and [Ford et al., 1982]) for our examples.</Paragraph>
    <Paragraph position="7">  The parsing system models the following phenomena:</Paragraph>
    <Section position="1" start_page="113" end_page="113" type="sub_section">
      <SectionTitle>
Right Association
</SectionTitle>
      <Paragraph position="0"> Native speakers of English tend to prefer readings in which constituents are "attached low." For instance, in the sentence: Joe bought the book that I had been trying to obtain for Susan.</Paragraph>
      <Paragraph position="1"> the preferred reading is one in which the prepositional phrase "for Susan" is associated with "to obtain" rather than "bought."</Paragraph>
    </Section>
    <Section position="2" start_page="113" end_page="113" type="sub_section">
      <SectionTitle>
Minimal Attachment
</SectionTitle>
      <Paragraph position="0"> On the other hand, higher attachment is preferred in certain cases such as: Joe bought the book for Susan.</Paragraph>
      <Paragraph position="1"> in which "for Susan" modifies "bought" rather than "the book." Frazier and Fodor [1978] note that these are cases in which the higher attachment includes fewer nodes in the parse tree. Our analysis is somewhat different.</Paragraph>
    </Section>
    <Section position="3" start_page="113" end_page="113" type="sub_section">
      <SectionTitle>
Lexical Preference
</SectionTitle>
      <Paragraph position="0"> Ford et al. [1982] present evidence that attachment preferences depend on lexical choice. Thus, the preferred reading for: The woman wanted the dress on that rack.</Paragraph>
      <Paragraph position="1"> has low attachment of the PP, whereas: The woman positioned the dress on that rack.</Paragraph>
      <Paragraph position="2"> has high attachment.</Paragraph>
    </Section>
    <Section position="4" start_page="113" end_page="113" type="sub_section">
      <SectionTitle>
Garden-Path Sentences
</SectionTitle>
      <Paragraph position="0"> Grammatical sentences such as: The horse raced past the barn fell.</Paragraph>
      <Paragraph position="1"> seem actually to receive no parse by the native speaker until some sort of "conscious parsing" is done. Following Marcus [Marcus, 1980], we take this to be a hard failure of the human sentence-processing mechanism.</Paragraph>
      <Paragraph position="2"> It will be seen that all these phenomena are handled in our parser by the same general rules. The simple context-free grammar used¹ (see Appendix I) allows both parses of the ambiguous sentences as well as one for the garden-path sentences. The parser disambiguates the grammar and yields only the preferred structure. The actual output of the parsing system can be found in Appendix II.</Paragraph>
    </Section>
  </Section>
  <Section position="3" start_page="113" end_page="116" type="metho">
    <SectionTitle>
3. The Parsing System
</SectionTitle>
    <Paragraph position="0"> The parsing system we use is a shift-reduce parser. Shift-reduce parsers [Aho and Johnson, 1974] are a very general class of bottom-up parsers characterized by the following architecture.</Paragraph>
    <Paragraph position="1"> They incorporate a stack for holding constituents built up during the parse and a shift-reduce table for guiding the parse. At each step in the parse, the table is used for deciding between two basic types of operations: the shift operation, which adds the next word in the sentence (with its preterminal category) to the top of the stack, and the reduce operation, which removes several elements from the top of the stack and replaces them with a new element--for instance, removing an NP and a VP from the top of the stack and replacing them with an S. The state of the parser is also updated in accordance with the shift-reduce table at each stage. The combination of the stack, input, and state of the parser will be called a configuration and will be notated as, for example, | NP V || Mary | 10 | where the stack contains the nonterminals NP and V, the input contains the lexical item Mary and the parser is in state 10.</Paragraph>
    <Paragraph position="2"> ¹We make no claims as to the accuracy of the sample grammar. It is obviously a gross simplification of English syntax. Its role is merely to show that the parsing system is able to disambiguate the sentences under consideration correctly.</Paragraph>
    <Paragraph position="3"> By way of example, we demonstrate the operation of the parser (using the grammar of Appendix I) on the oft-cited sentence "John loves Mary." Initially the stack is empty and no input has been consumed. The parser begins in state 0.</Paragraph>
    <Paragraph position="4"> | || John loves Mary | 0 | As elements are shifted to the stack, they are replaced by their preterminal category.² The shift-reduce table for the grammar of Appendix I states that in state 0, with a proper noun as the next word in the input, the appropriate action is a shift. The new configuration, therefore, is | PNOUN || loves Mary | 4 | The next operation specified is a reduction of the proper noun to a noun phrase, yielding | NP || loves Mary | 2 | The verb and second proper noun are now shifted, in accordance with the shift-reduce table, exhausting the input, and the proper noun is then reduced to an NP.</Paragraph>
    <Paragraph position="6"> Finally, the verb and noun phrase on the top of the stack are reduced to a VP</Paragraph>
    <Paragraph position="8"> This final configuration is an accepting configuration, since all the input has been consumed and an S derived. Thus the sentence is grammatical in the grammar of Appendix I, as expected.</Paragraph>
    <Paragraph position="9"> ²But see Section 3.2 for an exception.</Paragraph>
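    <Paragraph> As a minimal sketch of the shift and reduce operations just traced, the following Python fragment replays the "John loves Mary" example. It rests on simplifying assumptions that are not in the paper: the three-rule grammar and the lexicon are toy stand-ins for Appendix I, and the hypothetical function parse does not consult an LALR(1) table but simply reduces whenever the top of the stack matches a rule and shifts otherwise, so parser states are not modeled.

```python
# Toy grammar and lexicon: illustrative assumptions, not the paper's Appendix I.
RULES = [
    ("NP", ("PNOUN",)),
    ("VP", ("V", "NP")),
    ("S", ("NP", "VP")),
]
LEXICON = {"John": "PNOUN", "Mary": "PNOUN", "loves": "V"}


def parse(words):
    """Return (trace, accepted); trace lists each action with the resulting stack."""
    stack, remaining, trace = [], list(words), []
    while True:
        # Reduce: replace a stack suffix that matches a rule's right-hand side.
        for lhs, rhs in RULES:
            if tuple(stack[-len(rhs):]) == rhs:
                del stack[-len(rhs):]
                stack.append(lhs)
                trace.append(("reduce " + lhs, list(stack)))
                break
        else:
            if not remaining:
                break  # nothing to reduce and nothing left to shift
            # Shift: push the preterminal category of the next word.
            word = remaining.pop(0)
            stack.append(LEXICON[word])
            trace.append(("shift " + word, list(stack)))
    # Accept when all input is consumed and the stack holds a single S.
    return trace, stack == ["S"]


if __name__ == "__main__":
    steps, accepted = parse(["John", "loves", "Mary"])
    for action, stack in steps:
        print(action, stack)
    print("accepted:", accepted)
```

 On this input the sketch performs the same sequence of shifts and reductions as the walkthrough above. A realistic grammar would need the shift-reduce table to choose between competing actions, which is where the conflict-resolution rules of Section 3.3 come in.</Paragraph>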
    <Section position="1" start_page="114" end_page="114" type="sub_section">
      <SectionTitle>
3.1 Differences from the Standard LR Techniques
</SectionTitle>
      <Paragraph position="0"> The shift-reduce table mentioned above is generated automatically from a context-free grammar by the standard algorithm [Aho and Johnson, 1974]. The parsing algorithm differs, however, from the standard LALR(1) parsing algorithm in two ways. First, instead of assigning preterminal symbols to words as they are shifted, the algorithm allows the assignment to be delayed if the word is ambiguous among preterminals. When the word is used in a reduction, the appropriate preterminal is assigned.</Paragraph>
      <Paragraph position="1"> Second, and most importantly, since true LR parsers exist only for unambiguous grammars, the normal algorithm for deriving LALR(1) shift-reduce tables yields a table that may specify conflicting actions under certain configurations. It is through the choice made from the options in a conflict that the preference behavior we desire is engendered.</Paragraph>
    </Section>
    <Section position="2" start_page="114" end_page="114" type="sub_section">
      <SectionTitle>
3.2 Preterminal Delaying
</SectionTitle>
      <Paragraph position="0"> One key advantage of shift-reduce parsing that is critical in our system is the fact that decisions about the structure to be assigned to a phrase are postponed as long as possible. In keeping with this general principle, we extend the algorithm to allow the assignment of a preterminal category to a lexical item to be deferred until a decision is forced upon it, so to speak, by an encompassing reduction. For instance, we would not want to decide on the preterminal category of the word "that," which can serve as either a determiner (DET) or complementizer (THAT), until some further information is available. Consider the sentences: That problem is important.</Paragraph>
      <Paragraph position="1"> That problems are difficult to solve is important.</Paragraph>
      <Paragraph position="2"> Instead of assigning a preterminal to "that," we leave open the possibility of assigning either DET or THAT until the first reduction that involves the word. In the first case, this reduction will be by the rule NP → DET NOM, thus forcing, once and for all, the assignment of DET as preterminal. In the second case, the DET NOM analysis is disallowed on the basis of number agreement, so that the first applicable reduction is the COMPS reduction to S, forcing the assignment of THAT as preterminal.</Paragraph>
      <Paragraph position="3"> Of course, the question arises as to what state the parser goes into after shifting the lexical item "that." The answer is quite straightforward, though its interpretation vis-à-vis the determinism hypothesis is subtle. The simple answer is that the parser enters into a state corresponding to the union of the states entered upon shifting a DET and upon shifting a THAT respectively, in much the same way as the deterministic simulation of a nondeterministic finite automaton enters a "union" state when faced with a nondeterministic choice. Are we then merely simulating a nondeterministic machine here? The answer is equivocal. Although the implementation acts as a simulator for a nondeterministic machine, the nondeterminism is a priori bounded, given a particular grammar and lexicon.³ Thus, the nondeterminism could be traded in for a larger, albeit still finite, set of states, unlike the nondeterminism found in other parsing algorithms. Another way of looking at the situation is to note that there is no observable property of the algorithm that would distinguish the operation of the parser from a deterministic one. In some sense, there is no interesting difference between the limited nondeterminism of this parser and Marcus's notion of strict determinism. In fact, the implementation of Marcus's parser also embodies a bounded nondeterminism in much the same way this parser does.</Paragraph>
      <Paragraph position="4"> The differentiating property between this parser and that of Marcus is a slightly different one, namely, the property of quasi-real-time operation.⁴ By quasi-real-time operation, Marcus means that there exists a maximum interval of parser operation for which no output can be generated. If the parser operates for longer than this, it must generate some output. For instance, the parser might be guaranteed to produce output (i.e., structure) at least every three words. However, because preterminal assignment can be delayed indefinitely in pathological grammars, there may exist sentences in such grammars for which arbitrary numbers of words need to be read before output can be produced.</Paragraph>
      <Paragraph position="5"> It is not clear whether this is a real disadvantage or not, and, if so, whether there are simple adjustments to the algorithm that would result in quasi-real-time behavior. In fact, it is a property of bottom-up parsing in general that quasi-real-time behavior is not guaranteed. Our parser has a less restrictive but similar property, fairness; that is, our parser generates output linear in the input, though there is no constant over which output is guaranteed. For a fuller discussion of these properties, see Pereira and Shieber [forthcoming].</Paragraph>
      <Paragraph position="6"> To summarize, preterminal delaying, as an intrinsic part of the algorithm, does not actually change the basic properties of the algorithm in any observable way. Note, however, that preterminal assignments, like reductions, are irrevocable once they are made (as a byproduct of the determinism of the algorithm). Such decisions can therefore lead to garden paths, as they do for the sentences presented in Section 3.6.</Paragraph>
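      <Paragraph> The preterminal delaying described in this section can be sketched as follows. This is an illustration under stated assumptions rather than the paper's implementation: the lexicon entries, the state numbers in SHIFT_TARGET, and the function names shift_delayed and commit are all hypothetical, and for brevity the sketch starts from a single state rather than from a union state.

```python
# Hypothetical lexicon and shift targets; the state numbers are made up.
LEXICON = {
    "that": frozenset({"DET", "THAT"}),
    "problem": frozenset({"N"}),
}
SHIFT_TARGET = {(0, "DET"): 3, (0, "THAT"): 7, (0, "N"): 5}


def shift_delayed(state, word):
    """Shift a word without choosing its preterminal.

    Returns the stack entry (word plus its still-open categories) and the
    union of the states that each candidate category would lead to.
    """
    cats = LEXICON[word]
    union_state = frozenset(SHIFT_TARGET[(state, c)] for c in cats)
    return {"word": word, "cats": cats}, union_state


def commit(entry, category):
    """Fix the preterminal when a reduction first uses the entry.

    The choice is irrevocable: a later request for another category fails.
    """
    if category not in entry["cats"]:
        raise ValueError(f"{entry['word']!r} cannot be a {category}")
    return {"word": entry["word"], "cats": frozenset({category})}


if __name__ == "__main__":
    entry, state = shift_delayed(0, "that")
    print(sorted(entry["cats"]), sorted(state))
    # A reduction by NP -> DET NOM forces the DET reading.
    print(sorted(commit(entry, "DET")["cats"]))
```

 Because the set of candidate categories per word is fixed by the lexicon, the number of distinct union states is bounded, which is the sense in which the nondeterminism could be traded for a larger finite set of states.</Paragraph>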
      <Paragraph position="7"> We now discuss the central feature of the algorithm, namely, the resolution of shift-reduce conflicts.</Paragraph>
    </Section>
    <Section position="3" start_page="114" end_page="114" type="sub_section">
      <SectionTitle>
3.3 The Disambiguation Rules
</SectionTitle>
      <Paragraph position="0"> Conflicts arise in two ways: shift-reduce conflicts, in which the parser has the option of either shifting a word onto the stack or reducing a set of elements on the stack to a new element; and reduce-reduce conflicts, in which reductions by several grammar rules are possible. The parser uses two rules to resolve these conflicts:⁵ (1) Resolve shift-reduce conflicts by shifting. (2) Resolve reduce-reduce conflicts by performing the longer reduction.</Paragraph>
      <Paragraph position="1"> These two rules suffice to engender the appropriate behavior in the parser for cases of right association and minimal attachment. Though we demonstrate our system primarily with PP-attachment examples, we claim that the rules are generally valid for the phenomena being modeled \[Pereira and Shieber, forthcoming\].</Paragraph>
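      <Paragraph> The two rules are simple enough to state as a small function. The sketch below is only an illustration: the action encoding and the name resolve are assumptions, and the rules NP → NP PP and VP → V NP PP in the example are guesses at the relevant Appendix I rules rather than quotations from it.

```python
def resolve(actions):
    """Pick one action from a set of conflicting table entries.

    Each action is either ("shift",) or ("reduce", lhs, rhs), where rhs is
    the tuple of stack symbols the reduction would remove.
    """
    shifts = [a for a in actions if a[0] == "shift"]
    if shifts:
        return shifts[0]  # rule (1): shift-reduce conflicts resolve to shift
    # Rule (2): among competing reductions, take the one that pops the most.
    return max(actions, key=lambda a: len(a[2]))


if __name__ == "__main__":
    # Right association: shifting wins over reducing the V.
    print(resolve([("reduce", "VP/NP", ("V",)), ("shift",)]))
    # Minimal attachment: the longer VP reduction wins over the NP one.
    print(resolve([
        ("reduce", "NP", ("NP", "PP")),
        ("reduce", "VP", ("V", "NP", "PP")),
    ]))
```

 Preferring the shift keeps later material attached low, while preferring the longer reduction attaches the PP to the verb phrase.</Paragraph>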
    </Section>
    <Section position="4" start_page="114" end_page="114" type="sub_section">
      <SectionTitle>
3.4 Some Examples
</SectionTitle>
      <Paragraph position="0"> Some examples demonstrate these principles. Consider the sentence "Joe took the book that I bought for Susan." After a certain amount of parsing has been completed deterministically, the parser will be in the following configuration: | NP V that V || for Susan | with a shift-reduce conflict, since the V can be reduced to a VP/NP⁶ or the P can be shifted. The principles presented would solve the conflict in favor of the shift, thereby leading to the following derivation:</Paragraph>
      <Paragraph position="1"> ⁵... principles to handle right association and minimal attachment, together with the following two rules, are due to Fernando Pereira [Pereira, 1982]. The formalization of preterminal delaying and the extensions to the lexical-preference cases and garden-path behavior are due to the author. ⁶... loosely based on the work of Gazdar [1981]. The Appendix I grammar does not incorporate the full range of slashed rules, however, but merely a representative selection for illustrative purposes.</Paragraph>
      <Paragraph position="2"> The sentence "Joe bought the book for Susan." demonstrates resolution of a reduce-reduce conflict. At some point in the parse, the parser is in the following configuration: | NP V NP PP || | 20 | with a reduce-reduce conflict. Either a more complex NP or a VP can be built. The conflict is resolved in favor of the longer reduction, i.e., the VP reduction. The derivation continues:</Paragraph>
      <Paragraph position="4"> ending in an accepting state with the following generated structure: [S Joe [VP bought [NP the book] [PP for Susan]]]</Paragraph>
    </Section>
    <Section position="5" start_page="114" end_page="116" type="sub_section">
      <SectionTitle>
3.5 Lexical Preference
</SectionTitle>
      <Paragraph position="0"> To handle the lexical-preference examples, we extend the second rule slightly. Preterminal-word pairs can be stipulated as either weak or strong. The second rule becomes (2) Resolve reduce-reduce conflicts by performing the longest reduction with the strongest leftmost stack element.⁷ Therefore, if it is assumed that the lexicon encodes the information that the triadic form of "want" (V2 in the sample grammar) and the dyadic form of "position" (V1) are both weak, we can see the operation of the shift-reduce parser on the "dress on that rack" sentences of Section 2. Both sentences are similar in form and will thus have a similar configuration when the reduce-reduce conflict arises. For example, the first sentence will be in the following configuration: | NP wanted NP PP || | 20 | In this case, the longer reduction would require assignment of the preterminal category V2 to "want," which is the weak form; thus, the shorter reduction will be preferred, leading to the derivation:  In the case in which the verb is "positioned," however, the longer reduction does not yield the weak form of the verb; it will therefore be invoked, resulting in the structure: [S the woman [VP positioned [NP the dress] [PP on that rack]]]</Paragraph>
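      <Paragraph> One possible reading of the extended rule can be sketched in Python. The ordering used here (strength of the leftmost element compared first, length of the reduction second) is an interpretation, not something the text spells out; the candidate encoding, the WEAK table keyed by lemma, and the name resolve_reduce_reduce are likewise assumptions made for illustration.

```python
# Weak preterminal-word pairs as stipulated in the text: the triadic form of
# "want" (V2) and the dyadic form of "position" (V1). Keys use lemmas.
WEAK = {("V2", "want"), ("V1", "position")}


def resolve_reduce_reduce(candidates):
    """Choose among competing reductions.

    Each candidate is (lhs, length, leftmost), where leftmost is the
    (category, lemma) pair the reduction would assign to its leftmost stack
    element; the lemma is None for elements that are already phrases.
    """
    def preference(candidate):
        _lhs, length, leftmost = candidate
        strong = leftmost not in WEAK
        # Tuples compare left to right: strength first, then length.
        return (strong, length)

    return max(candidates, key=preference)


if __name__ == "__main__":
    for verb in ("want", "position"):
        options = [
            ("NP", 2, ("NP", None)),   # NP PP reduced to NP (low attachment)
            ("VP", 3, ("V2", verb)),   # V NP PP reduced to VP (high attachment)
        ]
        print(verb, resolve_reduce_reduce(options)[0])
```

 Under this reading, "want" selects the shorter NP reduction because the longer one would commit the verb to its weak triadic form, whereas "position" selects the longer VP reduction.</Paragraph>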
    </Section>
    <Section position="6" start_page="116" end_page="116" type="sub_section">
      <SectionTitle>
3.6 Garden-Path Sentences
</SectionTitle>
      <Paragraph position="0"> As a side effect of these conflict resolution rules, certain sentences in the language of the grammar will receive no parse by the parsing system just discussed. These sentences are apparently the ones classified as "garden-path" sentences, a class that humans also have great difficulty parsing. Marcus's conjecture that such difficulty stems from a hard failure of the normal sentence-processing mechanism is directly modeled by the parsing system presented here.</Paragraph>
      <Paragraph position="1"> For instance, the sentence "The horse raced past the barn fell" exhibits a reduce-reduce conflict before the last word. If the participial form of "raced" is weak, the finite verb form will be chosen; consequently, "raced past the barn" will be reduced to a VP rather than a participial phrase. The parser will fail shortly, since the correct choice of reduction was not made.</Paragraph>
      <Paragraph position="2"> Similarly, the sentence "That scaly, deep-sea fish should be underwater is important." will fail, though grammatical. Before the word "should" is shifted, a reduce-reduce conflict arises in forming an NP from either "That scaly, deep-sea fish" or "scaly, deep-sea fish." The longer (incorrect) reduction will be performed and the parser will fail.</Paragraph>
      <Paragraph position="3"> Other examples, e.g., "the boy got fat melted," or "the prime number few" would be handled similarly by the parser, though the sample grammar of Appendix I does not parse them [Pereira and Shieber, forthcoming].</Paragraph>
    </Section>
  </Section>
</Paper>