File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/87/t87-1016_abstr.xml
Size: 13,105 bytes
Last Modified: 2025-10-06 13:46:29
<?xml version="1.0" standalone="yes"?> <Paper uid="T87-1016"> <Title>Parallel Distributed Processing and Role Assignment Constraints</Title> <Section position="2" start_page="75" end_page="78" type="abstr"> <SectionTitle> (P2 LOCATION OVEN) </SectionTitle> <Paragraph position="0"> An individual triple can be represented in distributed form by dedicating a set of units to each of its parts; thus we can have one set of units for the head of the triple, one for the relation, and one for the tail or slot-filler. Each of the three parts of a triple can then be represented in distributed form as a pattern of activation over the units. The idea of using this kind of three-part distributed representation was introduced by Hinton (1981) to represent the contents of semantic nets; the extension to arbitrary tree structures is due to Touretzky and Hinton (1985) and Touretzky (1986).</Paragraph> <Paragraph position="1"> For the fillers, or the tail of a triple, the units stand for useful characterizers that serve to distinguish one filler from another. Hinton (1981) used the term &quot;microfeatures&quot; for these units; these features need not correspond in any simple way to verbalizable primitives. Different slot fillers produce different patterns on these units, and the different possible instantiations of a filler are likewise captured by differences in the pattern of activation on the units.</Paragraph> <Paragraph position="2"> For the relations, the units stand for characteristics of the relation itself. Note that this differs from most other approaches in treating each role or relation as a distributed pattern. This has several virtues. For one thing, it immediately eliminates the problem of specifying a small set of case roles, in the face of the fact that there seem to be a very large number of very subtle differences between roles that are in many ways very similar. Further, the use of distributed representations allows us to capture both the similarities and differences among case roles. The idea has been proposed on independent linguistic grounds, as well.</Paragraph> <Paragraph position="3"> For the head of each triple, the units stand for characteristics of the whole in which the filler plays a part.</Paragraph> <Paragraph position="4"> Thus the pattern that represents P1 is not some arbitrary pointer as it might be in a Lisp-based representation, but is rather a Reduced Description of the constituent that it stands for (Hinton, McClelland, and Rumelhart, 1986; Lakoff, personal communication). In particular, the pattern representing P1 would capture characteristics of the act of eating and of the participants in the act. There would be less detail, of course, than in the separate representations of these constituents where they occur as separate fillers of the tail slot.</Paragraph>
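To make the three-part distributed encoding concrete, the sketch below (not from the paper) builds a triple as a pattern of activation over three pools of units, one pool each for head, relation, and filler. The microfeature inventories and the particular features assigned to the OVEN filler and the P2 head are invented purely for illustration; the paper does not commit to any specific feature set.

```python
# Minimal sketch of a three-part distributed triple representation.
# The microfeature inventories below are invented for illustration.
import numpy as np

HEAD_FEATURES = ["eating-event", "baking-event", "past", "has-agent", "has-patient"]
REL_FEATURES  = ["agent-like", "patient-like", "instrument-like", "location-like"]
FILL_FEATURES = ["human", "animate", "edible", "rigid", "container", "appliance"]

def pattern(active, inventory):
    """Return a binary pattern of activation over one pool of units."""
    vec = np.zeros(len(inventory))
    for feature in active:
        vec[inventory.index(feature)] = 1.0
    return vec

def encode_triple(head, relation, filler):
    """A triple is a pattern over three pools of units: head, relation, filler."""
    return np.concatenate([
        pattern(head, HEAD_FEATURES),
        pattern(relation, REL_FEATURES),
        pattern(filler, FILL_FEATURES),
    ])

# Something like (P2 LOCATION OVEN): the head pattern is a reduced
# description of the baking event, not a pointer to it.
p2_loc_oven = encode_triple(
    head=["baking-event", "past", "has-agent", "has-patient"],
    relation=["location-like"],
    filler=["rigid", "container", "appliance"],
)
print(p2_loc_oven)
```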
<Paragraph position="5"> Syntactic and case-role representations. Sentences have both an augmented surface structure representation and a case-role representation. In the present model, then, there are two sets of units, one that represents the syntactic structure triples, and one that represents the case-structure triples. I have already described the general form of the case-role triples; the syntactic triples would have a similar form, though they would capture primarily syntactic relations among the constituents. So, for example, the set of syntactic triples of Sentence 2 would be something like: There are, correspondingly, two main parts to the model, a syntactic processor and a case-frame processor (see Figure 1). In this respect, the model is similar to many conventional parsing schemes (e.g., Marcus, 1980; Kaplan and Bresnan, 1982). The microstructure is quite different, however. One of the key things that a PDP microstructure buys us is the ability to improve the interaction between these two main components.</Paragraph> <Paragraph position="6"> Syntactic processing. The role of the syntactic processor is to take in words as they are encountered in reading or listening and to produce at its outputs a sequence of patterns, with each pattern capturing one syntactic structure triple.1 In Figure 1 the syntactic processor is shown in the midst of processing Sentence 2. It has reached the point where it is processing the words &quot;the cake&quot;. 1. Note that this means that several words can be packed into the same constituent, and that as the words of a constituent (e.g., &quot;the old grey donkey&quot;) are encountered the microfeatures of the constituent will be gradually specified. Thus the representation of the constituent can gradually build up at the output of the syntactic processor.</Paragraph> <Paragraph position="7"> The output at this point should tend to activate the pattern corresponding to (S1 DOBJ CAKE) over a set of units (the syntactic triple units) whose role is to display the pattern of activation corresponding to the current syntactic triple. Note that these units also receive feedback from the case-frame processor; the role of this feedback is to fill in unspecified parts of the syntactic triple, as shall be discussed below. The syntactic triple units have connections to units (the case-frame triple units) which serve to represent the current case-frame triple.</Paragraph> <Paragraph position="8"> The connections between these two sets of units are assumed to be learned through prior pairings of syntactic triples and case-frame triples, so that they capture the mutual constraints on case and syntactic role assignments. The inner workings of the syntactic processor have yet to be fully worked out, so for now I leave it as a black box.</Paragraph> <Paragraph position="9"> The case-frame processor. The role of the case-frame processor is to produce an active representation of the current case-frame constituent, based on the pattern representing the current syntactic constituent on the syntactic triple units and on feedback from a set of units called the working memory. The working memory is the structure in which the developing case-frame representation of the sentence is held. As constituents are parsed, they are loaded into the working memory, by way of a network called an I/O net.2 Within the working memory, individual units correspond to combinations of units in the current case-role representation. Thus, the representation at this level is conjunctive, and is therefore capable of maintaining information about which combinations of case-role units were activated together in the same case-role triple when the patterns activated by several triples are superimposed in the working memory (see Hinton et al., 1986, for discussion). Of course, early in a parse, the loaded constituents will necessarily be incomplete.</Paragraph>
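The conjunctive character of the working memory can be illustrated with a small sketch. Here each working-memory unit stands for a combination of head, relation, and filler units, realized as an outer product, so that superimposing several triples still preserves which filler pattern occurred with which role pattern in the same triple. The feature vectors are invented, and this is only one simple way of realizing the conjunctive coding discussed by Hinton et al. (1986), not the model's actual implementation.

```python
# Minimal sketch of conjunctive coding in the working memory: each
# working-memory unit stands for a combination of head, relation, and
# filler units, so several triples can be superimposed without losing
# track of which filler went with which role. Vectors are illustrative.
import numpy as np

def conjunctive_code(head, relation, filler):
    """One working-memory pattern: the outer product of the three parts."""
    return np.einsum("i,j,k->ijk", head, relation, filler)

# Two case-role triples, as microfeature patterns (made up for the example).
boy     = np.array([1, 1, 0, 0])   # human, animate, ...
cake    = np.array([0, 0, 1, 0])   # edible, ...
agent   = np.array([1, 0])
patient = np.array([0, 1])
p1      = np.array([1, 0, 0])      # reduced description of the eating event

working_memory = (conjunctive_code(p1, agent, boy) +
                  conjunctive_code(p1, patient, cake))

# Probing the superimposed pattern with a head and a relation retrieves
# the filler that occurred with them in the same triple.
retrieved = np.einsum("ijk,i,j->k", working_memory, p1, agent)
print(retrieved)   # resembles the BOY pattern, not the CAKE pattern
```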
<Paragraph position="10"> Pattern completion. The working memory provides a persisting representation of the constituents already parsed. This representation persists as a pattern of activation, so that it can both constrain and be constrained by new constituents as they are encountered, through interactions with a final set of units, called the hidden case-role units. These units are called &quot;hidden&quot; because their state is not visible to any other part of the system; instead they serve to mediate constraining relations among the units in the working memory. The process works as follows.</Paragraph> <Paragraph position="11"> Connections from working memory units to hidden units allow the pattern of activation over the working memory to produce a pattern over the hidden units. Connections from the hidden units to the working memory units allow these patterns, in turn, to feed activation back to the working memory. This feedback allows the network to complete and clean up distorted and incomplete patterns (that is, representations of sentences). The connections in the network are acquired through training on a sample of sentences (see St. John, 1986, for details). The connection strengths derived from this training experience allow it to sustain and complete the representations of familiar sentences; this capability generalizes to novel sentences with similar structure.</Paragraph> <Paragraph position="12"> What this model can do. The model I have described should be able to do all of the kinds of things listed at the beginning of the paper. Consider, for example, the problem of interpreting the sentence &quot;The boy hit the ball with the bat.&quot; This requires both assigning the appropriate reading (baseball bat) and the appropriate role (instrument) to the bat. The syntactic triple for this constituent, (S1 with-PP BAT), would tend to activate, over the case-frame triple units, a pattern corresponding to a blend of baseball bat and flying bat as the tail of the triple, and a blend of the possible case roles consistent with &quot;with&quot; as the pattern representing the relation portion of the triple. These in turn would tend to activate units representing the various possible filler-role combinations consistent with this syntactic constituent. But since the other constituents of the sentence would already have been stored in the working memory, the completion process would tend to support units standing for the baseball-bat-as-instrument interpretation more than others. Thus, simultaneous role assignment and context-sensitive selection of the appropriate reading of an ambiguous word would be expected to fall out naturally from the operation of the completion process.</Paragraph>
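A rough sense of how learned connection strengths can sustain and complete familiar patterns can be had from the toy auto-associator below. It stands in for the trained hidden-unit network described above (the actual model uses hidden units trained on sentences, per St. John, 1986); the Hebbian learning rule, the +1/-1 coding, and the &quot;sentence&quot; patterns are all simplifying assumptions for illustration.

```python
# Toy sketch of pattern completion: a simple Hebbian (Hopfield-style)
# auto-associator stands in for the trained connection strengths
# described in the text. Patterns and features are invented.
import numpy as np

def train(patterns):
    """Hebbian connection strengths from a sample of +1/-1 patterns."""
    n = patterns.shape[1]
    w = np.zeros((n, n))
    for p in patterns:
        w += np.outer(p, p)
    np.fill_diagonal(w, 0.0)
    return w / len(patterns)

def complete(w, probe, steps=10):
    """Let feedback clean up and fill in a distorted or incomplete pattern."""
    state = probe.astype(float).copy()
    for _ in range(steps):
        state = np.sign(w @ state)
        state[state == 0] = 1.0
    return state

# Familiar "sentences" as +1/-1 feature vectors; the last two units might
# stand for instrument microfeatures (e.g., spoon-like).
familiar = np.array([
    [ 1,  1, -1, -1,  1,  1],   # man-stirred-coffee ... spoon as instrument
    [-1, -1,  1,  1, -1, -1],   # some other familiar event
])
w = train(familiar)

# An incomplete input: the instrument units (last two) are unspecified (0).
probe = np.array([1, 1, -1, -1, 0, 0])
print(complete(w, probe))       # the missing instrument features are filled in
```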
<Paragraph position="13"> Filling in default values for missing arguments and shading or shaping the representations of vaguely described constituents is also a simple by-product of the pattern completion process. Thus, for example, on encountering &quot;The man stirred the coffee&quot;, the completion process will tend to fill in the pattern for the completion that includes a spoon as instrument. Note that the pattern so filled in need not specify a particular specific concept; thus for a sentence like &quot;The boy wrote his name&quot;, we would expect a pattern representing a writing instrument, but not specifying whether it is a pen or a pencil, to be filled in; unless, of course, the network had had specific experience indicating that boys always write their names with one particular instrument or another. A similar process occurs on encountering the container in a sentence like &quot;The container held the cola&quot;. In such cases the constraints imposed by other constituents (the cola) would be expected to shape the representation of &quot;container&quot; toward a smallish, hand-holdable, non-porous container. Again, this process would not necessarily specify a specific container, just the properties such a container could be predicted to have.</Paragraph> <Paragraph position="14"> I have not yet said anything about what the model would do with the attachment problem posed by the sentence &quot;The boy ate the cake that his mother baked in the oven.&quot; In this case, we would expect that the syntactic processor would pass along a constituent like (S? in-PP OVEN), and that it would be the job of the case-role processor to determine its correct attachment. Supposing that the experience the network has been exposed to includes mothers (and others) baking cakes (and other things) in ovens, we would expect that the case-role triple (P2 LOC OVEN) (where P2 stands for the reduced description of &quot;mother-baked-cake&quot;) would already be partially active as the syntactic constituent became available. Thus the incoming constituent would simply reinforce a pattern of activation that already reflected the correct attachment of oven.</Paragraph> <Paragraph position="15"> Current status of the model. As I previously stated, the model has not yet been implemented, and so one can treat the previous section as describing the performance of a machine made out of hopeware. Nevertheless I have reason to believe it will work. CMU connectionists now have considerable experience with representations of the kind used in the case-frame processor (Touretzky &amp; Hinton, 1985; Touretzky, 1986; Derthick, 1986). A mechanism quite like the case-frame processor has been implemented by St. John (1986), and it demonstrates several of the uses of semantic constraints that I have been discussing.</Paragraph> <Paragraph position="16"> Obviously, though, even if the case-frame processor is successful there are many more tasks that lie ahead.</Paragraph> <Paragraph position="17"> One crucial one is the development of a connectionist implementation of the syntactic processor. I believe that we are now on the verge of understanding sequential processes in connectionist networks (see Jordan, 1986), and that this will soon make it possible to describe a complete connectionist mechanism for language processing that captures both the strengths and limitations of human language processing capabilities.</Paragraph> </Section></Paper>