<?xml version="1.0" standalone="yes"?> <Paper uid="C88-2097"> <Title>A Parser based on Connectionist Model</Title> <Section position="2" start_page="0" end_page="456" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> This paper proposes a parser based fully upon the connectionist model (called &quot;CM parser&quot; hereafter). To realize the CM parser, we use Sigma-Pi-Units to implement a constraint on grammatical category order or word order, and a copy mechanism for sub-parse trees. Furthermore, we suppose that weak suppressive connection links exist between every pair of CM units. Through these suppressive links, our CM parser explains why garden path sentences and/or deeply nested sentences are hard to recognize. Our CM parser also explains the preference principles for syntactically ambiguous sentences.</Paragraph> <Paragraph position="1"> 1. Introduction In the effort to clarify the human parsing mechanism for natural language sentences, there remain some phenomena that are difficult to explain by one integrated principle. These phenomena include the cognitive difficulty of recognizing garden path sentences or deeply nested sentences, and preferences among readings of structurally ambiguous sentences. None of the parsing mechanisms proposed so far, for instance top-down parsing /Pereira 1980/, left corner parsing /Johnson-Laird 1983/, Marcus's parsing model /Marcus 1980/, Shieber's shift-reduce parser /Shieber 1983/, and so on, has yet succeeded in explaining all of these phenomena under one simple integrated principle. Note that all of them are based on the symbol manipulation paradigm.</Paragraph> <Paragraph position="2"> Recently the connectionist model (called CM hereafter) approach has attracted attention in many areas of cognitive science, including natural language recognition. This approach has some advantages that the symbol manipulation approaches do not have. 
One advantage is that it is easy to use not only syntactic information but also semantic and/or contextual information in a uniform manner /Reilly 1984/. One fruitful result of this approach is the explanation of the recognition of semantic garden path sentences like &quot;The astronomer married the star&quot; /Waltz 1985/. Another advantage is as follows. Since the connectionist model is a parallel system without any central controller, and the activation level of each unit and the connection strength between units may be represented as continuous values, it allows much more flexible approaches than symbol manipulation approaches do. We also expect that it can simulate some aspects of human mental processing in sentence parsing.</Paragraph> <Paragraph position="3"> This paper is concerned with the second advantage in parsing. The paper proposes a CM parser which can explain the above-mentioned phenomena, such as preferences, by one integrated principle.</Paragraph> <Paragraph position="4"> 2. Parser based on connectionist model Here we omit the technical details of the CM /McClelland&amp;Rumelhart 1986/, but we must make clear that we take the so-called &quot;localist&quot; view, in which one symbol corresponds to one unit. Therefore, in our CM parser, a syntactic category such as noun phrase is represented by a unit in the CM, and a parse tree is represented as a network in which the activated units for suitable syntactic categories are connected. To realize a CM parser, we have to solve the following two problems: (1) How to express the word order or syntactic category order appearing in phrase structure rules. For example, in a rule S -> NP VP, NP must precede VP.</Paragraph> <Paragraph position="5"> (2) How to represent the case in which a parse tree is generated by recursive phrase structure rules.</Paragraph> <Paragraph position="6"> Consider the following rules: S -> NP VP, NP -> NP S and -> Comp S. 
The same pattern, in this case the pattern corresponding to S -> NP VP, may appear more than once in the parse tree of one sentence. To represent this case, we need a mechanism for copying the partial parse tree pattern corresponding to a phrase structure rule in a connection network. Otherwise we would have to prepare an infinite number of copies of each partial parse tree pattern in advance. Of course this preparation is unrealistic, not for computer hardware but for human wetware. In Fanty's CM parser mentioned in /McClelland&amp;Kawamoto 1986/, the length of the sentence is limited because of the preparation described above.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.1 Phrase structure sub-network </SectionTitle> <Paragraph position="0"> Consider the next rule. C -> A B (3)</Paragraph> <Paragraph position="2"> This rule has at least two meanings. One is that the category C consists of the category A and the category B. The other is that the category B follows the category A. The latter meaning is directly concerned with problem (1). To represent the case in which a word matches some syntactic category, we modify (3) as follows.</Paragraph> <Paragraph position="3"> C -> word Since this rule is a variant of rules of type (3), we consider only rules of type (3) hereafter. We will explain the sub-network that corresponds to the phrase structure rule (3).</Paragraph> <Paragraph position="4"> We solve problem (1) by introducing a trigger link, which is drawn as -t-> in figures. Namely, &quot;A -t-> B&quot; expresses that B follows A. From the viewpoint of the CM, the meaning of this trigger link is that the unit for category B (called &quot;B unit&quot; hereafter) can be activated only when the unit for category A (called &quot;A unit&quot; hereafter) is fully activated. 
Due to the trigger link, the A unit must be activated earlier than the B unit.</Paragraph> <Paragraph position="5"> The trigger link is realized by a Sigma-Pi-Unit /McClelland &amp; Rumelhart 1986/, which includes a multiplication operation. Figure 1 shows the concept of a Sigma-Pi-Unit in the CM.</Paragraph> <Paragraph position="6"> Figure 1. Sigma-Pi-Unit In Figure 1, B and C are CM units. They send outputs, whose values fb and fc are positive, to the A unit. These values correspond to the activation levels of the B and C units respectively. WIA is the weight of the link from B and C to A. The input to the A unit is as follows.</Paragraph> <Paragraph position="7"> WIA*fb*fc If the B unit's activation level fb = 0, then the C unit's activation level is not transmitted to the A unit at all. In other words, the activation level of the B (or C) unit is an on-off switch for the transmission of activation from the C (or B) unit to the A unit. Using Sigma-Pi-Units, the sub-network for phrase structure rule (3) is represented as shown in Figure 2. The weight WA>B is very small in this case, but note that it depends on some semantic information. This network will be presented hereafter in a simpler form using a trigger link &quot;A -t-> B&quot;, as shown in Figure 3. The structures of the A-, B-, and C-connectors appearing in Figure 3 are explained in Section 2.3. [Figure 3 labels: C-connector, A-connector, B-connector]</Paragraph> <Paragraph position="8"> [Figure 4 labels: Central network, Connection Activation System]</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.2 Copying sub-network </SectionTitle> <Paragraph position="0"> Our final goal is to clarify the mechanism for building a parse tree for a whole sentence by connecting sub-networks. For this purpose, the simplest method is to prepare parse trees for all possible sentence structures. In principle this method is not possible, because there is an infinite number of possible sentence structures. Another method is to prepare a number of copies of the sub-network for each phrase structure rule in advance: for example, ten sub-networks for S -> NP VP, ten sub-networks for VP -> V NP, and so on. When the parser reads a sentence, it selects some sub-networks from this prepared set and connects them to make a parse tree of the input sentence. This method seems to work well and solves the above-mentioned problem (2). Unfortunately this method has a serious deficiency, as follows. From the viewpoint of learning in the CM, all the weights of the connection links of sub-networks are learned by parsing or recognizing a number of sentences. It is a plausible hypothesis that once a human becomes able to parse some sentence structure, he/she can always parse that structure from then on. To account for this hypothesis, the above-mentioned weight learning must be done uniformly for all copies of the sub-networks of the same phrase structure rule. But such uniform learning is too artificial as a model of human mental learning processes.</Paragraph> <Paragraph position="1"> A solution that avoids these difficulties is as follows. There is only one central sub-network for each phrase structure rule, and all learning processes are done on it. 
In parsing, when the parser needs a sub-network for some rule, it makes a copy of the central sub-network and connects it into a suitable place in the parse tree under construction.</Paragraph> <Paragraph position="2"> A sub-network copying mechanism is implemented as an application of the connection information distribution (CID) mechanism /McClelland 1986/.</Paragraph> <Paragraph position="3"> Figure 4 is a simple example of copying. The programmable sub-networks are implemented with Sigma-Pi-Units. There are many programmable sub-networks that are yet to be programmed, namely blank sub-networks. When the input comes in, the corresponding connection pattern of the central network is copied to the programmable sub-networks via the connection activation system. To implement a copying mechanism for phrase structure rules of the form C -> A B, we use three CID mechanisms. They are for the bidirectional connections between the A unit and the C unit, between the B unit and the C unit, and between the A unit and the B unit respectively. We omit the further details because of limited space.</Paragraph> </Section> <Section position="3" start_page="0" end_page="455" type="sub_section"> <SectionTitle> 2.3 Connecting sub-networks </SectionTitle> <Paragraph position="0"> To generate a parse tree, we need a mechanism for generating connection links dynamically.</Paragraph> <Paragraph position="1"> Unfortunately the CM does not yet have such a mechanism. Instead, we use a connector that changes connections dynamically by means of Sigma-Pi-Units.</Paragraph> <Paragraph position="2"> There are three kinds of connector, namely the A-, B-, and C-connectors shown in Figure 3. We explain the functions of these connectors in this section. C-connector: If the C unit of a sub-network is activated, the C-connector sends requests for connection to the A-connectors of blank sub-networks, or to B-connectors whose sub-network's B unit has the same syntactic category as the C unit of the sender sub-network. More than one connection may be established by these requests; however, they suppress each other, and in the end the connection from the most strongly activated B unit wins. Even if the C unit is not very strongly activated, the C-connector sends these requests. 
Before a human has read a whole sentence, even when he/she has read only a few words, he/she predicts a complete or fairly large part of the parse tree of a possible sentence. This is why we adopt this low-threshold strategy for sending requests.</Paragraph> <Paragraph position="3"> A-connector: When an A-connector receives a request for connection from another sub-network's C-connector, and the A-connector has not yet received any other request for connection, the A-connector makes a copy of a sub-network whose A unit's syntactic category is the same as the syntactic category of the C unit of the sender sub-network. By this copying, the parse tree grows in a bottom-up manner.</Paragraph> <Paragraph position="4"> B-connector: When a B-connector receives a request for connection from another sub-network's C-connector, and the B unit's syntactic category is the same as that of the sender sub-network's C unit, a connection between the sender's C-connector and the receiver's B-connector is established. If more than one connection is established, they suppress each other. Finally the most strongly activated connection inhibits the other connections. These suppressive or exclusive connections are written as [ X Y ] in figures. In this notation, the connections between X and Y are mutually suppressive or exclusive. The structures of the connectors described above are shown in Figures 5, 6 and 7 respectively.</Paragraph> </Section> <Section position="4" start_page="455" end_page="456" type="sub_section"> <SectionTitle> 2.4 Parsing on the CM parser </SectionTitle> <Paragraph position="0"> To summarize the CM parser described above, we sketch the parsing process for the sentence &quot;I eat apples.&quot; The phrase structure rules used in this example are as follows. 
S -> N VP and VP -> V N.</Paragraph> <Paragraph position="2"> (1) The CM parser reads &quot;I&quot;, and a unit for category N is activated.</Paragraph> <Paragraph position="3"> (2) The C-connector of the N unit sends a request for connection to an A-connector of a currently usable blank sub-network.</Paragraph> <Paragraph position="4"> (3) When an A-connector receives the request, it makes a copy of the sub-network for S -> N VP. Since the N unit of the copied sub-network is fully activated, the trigger link from the N unit to the VP unit becomes active.</Paragraph> <Paragraph position="5"> (4) The CM parser reads &quot;eat&quot;, and a unit for category V is activated, and a request for connection is sent from its C-connector to some A-connector. (5) When an A-connector receives this request, it makes a copy of the sub-network for VP -> V N. Not only the V unit but also the VP unit is activated. Of course the trigger link from the V unit to the N unit is activated.</Paragraph> <Paragraph position="6"> (6) The VP unit sends a request for connection via its C-connector. This request is received by the B-connector of the previously copied sub-network for the phrase structure rule S -> N VP, because this sub-network's B unit's category is VP and has been triggered, as seen at stage (3), and the sender sub-network's C unit's category is also VP.</Paragraph> <Paragraph position="7"> (7) The CM parser reads &quot;apples&quot;, and a unit for category N is activated, and a request for connection is sent from its C-connector.</Paragraph> <Paragraph position="8"> (8) This request is received by the B-connector of the sub-network copied at (5). This activates the C unit of this sub-network, whose category is VP. This activation in turn activates the B unit of the sub-network for S -> N VP. Finally, its C unit, whose category is S, becomes fully activated; that is, the sentence is recognized and the parse tree is completed. The resulting parse tree is shown in Figure 8. 
For compactness, the A-, B- and C-connectors are omitted in the rest of the paper.</Paragraph> <Paragraph position="10"> Intuitively, our CM parser is a parallel left corner parser. More precisely, owing to the use of trigger links, which predict the syntactic category of the next incoming word, our CM parser can be regarded as a parallel left corner parser with a continuous activation level for each generated nonterminal symbol representing some syntactic category.</Paragraph> <Paragraph position="11"> 3. Control on resource bounded condition It is well known that the human memory system consists of at least two levels, namely short term memory and long term memory. The capacity of short term memory is limited to 7 ± 2 chunks. In the CM, how to implement short term memory has not yet been made clear. But intuitively, the sum of the activation levels of all units is bounded.</Paragraph> <Paragraph position="12"> We implement this bound by an almost equivalent mechanism, as follows: there exist weak suppressive connection links between every pair of units. Owing to this limitation, even though our CM parser is a parallel one, it is impossible during parsing to maintain all possible candidate parse trees. Since our parser is based on the CM, the most promising parse tree is the most strongly activated one. Other parse trees are suppressed by the most promising one through the suppressive or exclusive connections described in Section 2.3. In the rest of the paper, we propose explanations of the control mechanisms of the CM parser, especially for the parsing of deeply nested sentences, garden path sentences, and preferences among readings of syntactically ambiguous sentences. 4. Recognition of deeply nested sentences Our CM parser can explain why deeply nested sentences like &quot;The man who the girl who the dog chased liked laughed&quot; are hard for humans to recognize. 
Figure 9 shows the network built just after the CM parser reads &quot;The man who the girl who the dog chased&quot;. Here, since the NP3 unit is strongly activated, the VP2/NP unit is strongly predicted, and it is the right prediction. But since the NP1 unit and the S unit are also activated, the VP1 unit is also predicted. Therefore when the CM parser reads &quot;liked&quot;, it is not very easy to select the VP2/NP unit definitely. As seen in this example, when the CM parser reads a word at a deeply nested level, more than one unit may be strongly activated and predicted. If they have nearly the same activation level, it is not easy to select the right unit. This is one possible explanation of why it is hard for humans to recognize deeply nested sentences, if the CM is a plausible model of human mental processing. If there is more than one possible syntactic structure for the input sentence, the CM parser builds more than one parse tree network corresponding to them during parsing. If one of them is much more strongly activated than the others, the parser easily selects it as the right network. But often more than one network is activated to almost the same level. In that case, which of them is selected depends on many factors, for instance contextual or semantic information. There is a worse case, as follows. Assume that the parser has read some words of the sentence, and there is more than one parse tree.</Paragraph> <Paragraph position="13"> One of them has a higher activation level than the others at that time. But when the parser reads the next word, if the most highly activated parse tree turns out to be syntactically impossible, some weakly activated parse tree is forced to rise suddenly to the highest activation level. This forced sudden change of activation level may make the sentence difficult for humans to recognize. 
This is an informal explanation of the cognitive difficulty of recognizing garden path sentences.</Paragraph> <Paragraph position="14"> In order to explain which parse tree is chosen, we have to identify which exclusive connection plays the main role in the preference between possible parse trees. Without loss of generality, it is sufficient to explain how one of two parse trees is chosen. In short, the choice point is such that the part of the tree above this point is common to both trees, and the parts of the trees below this point are different. Figure 10 shows the network generated for the garden path sentence &quot;The cotton clothing is made of grows in Mississippi.&quot; The wrong parse tree including the Sa unit is preferred while our CM parser reads &quot;The cotton clothing is made of&quot;, because in the phrase structure rule ~ -> S/NP, the connection link from the S unit to the ~ unit is weak, and &quot;clothing&quot; is NP. But when the CM parser reads &quot;grows&quot;, the wrong parse tree including the Sa unit is rejected syntactically, and the right but weakly predicted VP unit must be connected to the VP unit for &quot;grows&quot;. Humans may feel cognitive difficulty at that time. Note that although our CM parser has to do a lot of work to parse a garden path sentence, namely the forced sudden change of activation levels, it finally succeeds in parsing the garden path sentence, as humans do. This is the main difference in performance between our CM parser and Shieber's shift-reduce parser.</Paragraph> <Paragraph position="16"> Figure 10. The parse tree network just after &quot;The cotton clothing is made of grows&quot;</Paragraph> </Section> </Section> </Paper>