<?xml version="1.0" standalone="yes"?>
<Paper uid="P84-1022">
  <Title>A PARSING ARCHITECTURE BASED ON DISTRIBUTED MEMORY MACHINES</Title>
  <Section position="3" start_page="0" end_page="92" type="metho">
    <SectionTitle>
II DISTRIBUTED MEMORY MACHINES
</SectionTitle>
    <Paragraph position="0"> Distributed memory machines (DMM) can be represented formally by the septuple DMM=(V,X,Y,Q,qo,p,A) , where V is a finite set denoting the total vocabulary; X is a finite set of inputs, and XGV; Y is a finite set of acceptable outputs and Y~V; Q is a set of internal states; q0 is a distinguished initial state; ~.QxX--&gt;Q, the retrieval function; A:Q--&gt;Qxy, the output function.</Paragraph>
    <Paragraph position="1"> Further, where Y&amp;quot; denotes the set of all finite concatenations of the elements of the set Y, Q~Y', and therefore QgV'. This statement represents the notion that internal states of DMMs can encode multiple outputs or hypotheses. The vocabulary, V, can be represented by the space I k, where I is some interval range defined within a chosen number system, N; IoN. The elements of X, Y and Q are encoded as k-element vectors, referred to as memory vectozs.</Paragraph>
    <Paragraph position="2"> A. Holographic Associative Memory One form of DMM is the holographic associative memory \[4,5,6\] which encodes large numbers of associations on a single composite vector.</Paragraph>
    <Paragraph position="3"> Items of information are encoded as k-element zero-centred vectors over an interval such as \[-I,+I\]; &lt;X&gt;=(...x.t,x0,x~t,...). Two items, &lt;A&gt; and &lt;B&gt; (angular brackets denote memory vectors), are associated in memory through the operation of convolution. This method of association formation is fundamental to the concept of holographic memory and the resulting associative trace is denoted &lt;A&gt;*&lt;B&gt;. The operation of convolution is define by the equation (&lt;A&gt;*&lt;B&gt;~=.~AIB~. i and has the following propertles*\[7\]: Commutative: &lt;A&gt;*&lt;B&gt; = &lt;B&gt;*&lt;A&gt;, Associative: &lt;A&gt;*(&lt;B&gt;*&lt;C&gt;) = (&lt;A&gt;*&lt;B&gt;)*&lt;C&gt;.</Paragraph>
    <Paragraph position="4"> Further, where a delta vector, denoted ~, is defined as a vector that has values of zero on all features except the central feature, which has a value of one, then &lt;A&gt;* ~ffi &lt;A&gt;. Moreover, &lt;A&gt;*0 ffi 0, where 0 is a zero vector in which all feature values are zero. Convolving an item wlth an attenuated delta vector (i.e., a vector with values of zero on all features except the central one, which has a value between 0 and i) produces the original item with a strength that is equal to the value of the central feature of the attenuated delta vector.</Paragraph>
    <Paragraph position="5"> The initial state, qo, encodes all the associations stored in the machine. In this model, associative traces are concatenated (+) through the operations of vector addition and normalization to produce a single vector.</Paragraph>
    <Paragraph position="6"> Overlapping associative items produce composite  vectors which represent both the range of items stored and the central tendency of the those items. This form of prototype generation is a basic property of distributed memories.</Paragraph>
    <Paragraph position="7"> The retrieval function,@ , is simulated by the operation of correlation. If the state, q~, encodes the association &lt;A&gt;*&lt;B&gt;, then presenting say &lt;A&gt; as an input, or retrieval key, produces a new state, q{~, which encodes the item &lt;B&gt;', a noisy version of &lt;B&gt;, under the operation of correlation. This operation is defined by the equation (&lt;A&gt;#&lt;B&gt;~=~A%Bm,%and has the following properties: % An item correlated with itself, autocorrelation, produces an approximation to a delta vector. If two similar memory vectors are correlated, the central feature of the resulting vector will be equal to their similarity, or dot product, producing an attenuated delta vector. If the two items are completely independent, correlation produces a zero vector.</Paragraph>
    <Paragraph position="8"> The relation between convolution and correlation is given by</Paragraph>
    <Paragraph position="10"> where the noise component results from some of the less significant cross products. Assuming that &lt;A&gt; and &lt;B&gt; are unrelated, Equation (I) becomes:</Paragraph>
    <Paragraph position="12"> Extending these results to a composite trace, suppose that q encodes two associated pairs of four unrelated items forming the vector (&lt;A&gt;*&lt;B&gt; + &lt;C&gt;*&lt;D&gt;). When &lt;A&gt; is given as the retrieval cue, the reconstruction can be characterized as follows:</Paragraph>
    <Paragraph position="14"> When the additional unrelated items are added to the memory trace their affect on retrieval is to add noise to the reconstructed item &lt;B&gt;, which was associated with the retrieval cue. In a situation in which the encoded items are related to each other, the composite trace causes all of the related items to contribute to the reconstructed pattern, in addition to producing noise. The amount of noise added to a retrieved item is a function of both the amount of information held on the composite memory vector and the size of the vector.</Paragraph>
  </Section>
  <Section position="4" start_page="92" end_page="94" type="metho">
    <SectionTitle>
III BUILDING NATURAL LANGUAGE PARSERS
A. Case-Frame Parsing
</SectionTitle>
    <Paragraph position="0"> The computational properties of distributed memory machines (DMM) make them natural mechanisms for case-frame parsing. Consider a DMM which encodes case-frame structures of the following form: &lt;Pred&gt;*(&lt;Cl&gt;*&lt;Pl&gt; + &lt;C2&gt;*&lt;P2&gt; + ...+ &lt;Cn&gt;*&lt;Pn&gt;) where &lt;Pred&gt; is the vector representing the predicate associated with the verb of an input clause; &lt;C1&gt; to &lt;Cn&gt; are the case vectors such as &lt;agent&gt;, &lt;instrument&gt;, etc., and &lt;PI&gt; to &lt;Pn&gt; are vectors representing prototype concepts which can fill the associated cases. These structures can be made more complex by including tagging vectors which indicate such features as obligatory case, as shown in the case-frame vector for the predicate BREAK: (&lt;agent&gt;*&lt;anlobJ+natforce&gt; + &lt;obJect&gt;*&lt;physobJ&gt; *&lt;obllg&gt; + &lt;instrument&gt;*&lt;physobJ&gt;) In this example, the object case has a prototype covering the category of physical objects, and is tagged as obligatory.</Paragraph>
    <Paragraph position="1"> The initial state of the DMM, qo, encodes the concatenation of the set of case-frame vectors stored by the parser. The system receives two types of inputs, noun concept vectors representing noun phrases, and predicate vectors representing the verb components. If the system is in state qo only a predicate vector input produces a significant new state representing the case-frame structure associated with it.</Paragraph>
    <Paragraph position="2"> Once in this state, noun vector inputs identify the case slots they can potentially fill as illustrated in the following example: In parsing the sentence Fred broke the window with e stone, the input vector encodin E broke will retrieve the case-frame structure for break given above. The input of &lt;Fred&gt; now gives</Paragraph>
    <Paragraph position="4"> where ej is a measure of the similarity between the vectors, and underlying concepts, &lt;Fred&gt; and the case prototype &lt;Pj&gt;. In this example, &lt;Fred&gt; would be identified as the agent because e 0 and e~ would be low relative to ee. The vector is &amp;quot;cleaned-up&amp;quot; by a threshold function which is a component of the output function,)%.</Paragraph>
    <Paragraph position="5"> This process is repeated for the other noun concepts in the sentence, linking &lt;window&gt; and &lt;stone&gt; with the object and instrument cases, respectively. However, the parser requires additional machinery to handle the large set of sentences in which the case assignment is ambiguous using semantic knowledge alone.</Paragraph>
    <Paragraph position="6"> B. Encodin~ Syntactic Knowledge Unambiguous case assignment can only be achieved through the integration of syntactic and semantic processing. Moreover, an adequate parser should generate an encoding of the grammatical relations between sentential elements in addition to a semantic representation. The rest of the paper demonstrates how the properties of DMMs can be combined with the ideas embodied in the theory of Lextcal-functional CTammar (LFG) \[8\] in a parser which builds both types of relational structure.</Paragraph>
    <Paragraph position="7">  In LFG the mapping between grammatical and semantic relations is represented directly in the semantic form of the lexlcal entries for verbs. For example, the lexlcal entry for the verb hands is given by hands: V, #participle) = NONE</Paragraph>
    <Paragraph position="9"> where the arguments of the predicate HAND are ordered such that they map directly onto the arguments of the semantic predicate-argument structure. The order and value of the arguments in a lexical entry are transformed by lexlcal rules, such as the passive, to produce new lexical entries, e.g., HAND\[#byobJ)~subJ)(~oobJ)\]. The direct mapping between lexical predicates and case-frame structures is encoded on the case-frame DMM by augmenting the vectors as follows:</Paragraph>
    <Paragraph position="11"> When the SUBJ component has been identified through syntactic processing the resulting association vector, for example &lt;subJ&gt;*&lt;John&gt; for the sentence John handed Mary the book, will retrieve &lt;agent&gt; on input to the CF-DMM, according to the principles specified above.</Paragraph>
    <Paragraph position="12"> The multiple lexical entries produced by lexical rules have corresponding multiple case-frame vectors which are tagged by the appropriate grammatical vector. The CF-DMM encodes multiple case-frame entries for verbs, and the grammatical vector tags, such as &lt;PASSIVE&gt;, generated by the syntactic component, are input to the CF-DMM to retrieve the appropriate case-frame for the verb.</Paragraph>
    <Paragraph position="13"> The grammatical relations Between the sententlal elements are represented in the form of functional structure (f-structures) as in LFG. These structures correspond to embedded lists of attrlbute-value pairs, and because of the Uniqueness criterion which governs their format they are efficiently encoded as memory vectors. As an example, the grammatical relations for the sentence John handed Mary a book are encoded in the f-structure below:  The lists of grammatical functions and features are encoded as single vectors under the + operator, and the embedded structure is preserved by the associative operator, *. The f-structure is encoded by the vector  This compatibility between f-structures and memory vectors is the basis for an efficient procedure for deriving f-structures from input strings. In LFG f-structures are generated in three steps. First, a context-free grammar (CFG) is used to derive an input string's constituent structure (C-structure). The grammar is augmented so that it generates a phrase structure tree which includes statements about the properties of the string's f-structure. In the next step, this structure is condensed to derive a series of equations, called the functional description of the string. Finally, the f-structure is derived from the f-description. The properties of DMMs enable a simple procedure to be written which derives f-structures from augmented phrase structure trees, obviating the need for an f-descrlptlon. Consider the tree in figure 1 generated for our example sentence:  The f-structure, encoded as a memory vector, can be derived from this tree by the following procedure. First, all the grammatical functions, features and semantic forms must be encoded as vectors. The~-variables, f,-f#, have no values at this point; they are derived by the procedure. All the vectors dominated by a node are concatenated to produce a single vector at that node. The symbol '=&amp;quot; is interpreted as the association operator ,*.</Paragraph>
    <Paragraph position="14"> Applying this interpretation to the tree from the bottom up produces a memory vector for the value of f! which encodes the f-structure for the string, as given above. Accordingly, f~ takes the value (&lt;TNUM&gt;*&lt;SG&gt;+&lt;TPRED&gt;*&lt;JOHN&gt;); applying the rule specified at the node, (f, SUBJ)=f~ gives &lt;tSUBJ&gt;*(&lt;tNUM&gt;*&lt;SG&gt;+&lt;TPRED&gt;*&lt;JOHN&gt;) as a component of f,. The other components of fl are derived in the same way. The front-end CFG can be veiwed as generating the control structure for the derivation of a memory vector which represents the input string's f-structure.</Paragraph>
    <Paragraph position="15">  The properties of memory vectors also enable the procedure to automatically determine the consistency Df the structure. For example, in deriving the value of f&amp; the concatenation operator merges the (%NUM)~SG features for A and book to form a single component of the f~vector, (&lt;SPEC&gt;*&lt;A&gt;+&lt;NUM&gt;*&lt;SG&gt;+&lt;PRED&gt;*&lt;MARY&gt;). .owever, if the two features had not matched, producing the vector component &lt;NU}~*(&lt;SG&gt;+&lt;PL&gt;) for example, the vectors encoding the incompatible feature values are set such that their concatenation produces a special control vector which signals the mismatch.</Paragraph>
    <Paragraph position="16"> C. A Parsing Architecture The ideas outlined above are combined in the design of a tentative parsing architecture shown in figure 2. The diamonds denote DMMs, and the</Paragraph>
    <Paragraph position="18"> ellipse denotes a form of DMM functioning as a working memory for encoding temporary f-structures.</Paragraph>
    <Paragraph position="19"> As elements of the input string enter the lexicon their associated entries are retrieved.</Paragraph>
    <Paragraph position="20"> The syntactic category of the element is passed onto the CFG, and the lexical schemata {e.g., ~PRED)='JOHN'}, encoded as memory vectors, are passed to the f-structure working memory. The lexical entry associated with the verb is passed to the case-frame memory to retrieve the appropriate set of structures. The partial results of the CFG control the formation of memory vectors in the f-structure memory, as indicated by the broad arrow. The CFG also generates grammatical vectors as inputs for case-frame memory to select the appropriate structure from the multiple encodings associated with each verb. The partial f-structure encoding can then be used as input to the case-frame memory to assign the semantic forms of grammatical functions to case slots. When the end of the string is reached both the case-frame instantiation and the f-structure should be complete.</Paragraph>
  </Section>
  <Section position="5" start_page="94" end_page="94" type="metho">
    <SectionTitle>
IV CONCLUSIONS
</SectionTitle>
    <Paragraph position="0"> This paper attempts to demonstrate the value of distributed memory machines as components of a parsing system which generates both semantic and grammatical relational structures. The ideas presented are similar to those being developed within the connectlonist paradigm \[I\]. Small, and his colleagues \[9\], have proposed a parsing model based directly on connectionist principles.</Paragraph>
    <Paragraph position="1"> The computational architecture consists of a large number of appropriately connected computing units communicating through weighted levels of excitation and inhibition. The ideas presented here differ from those embodied in the connectionist parser in that they emphasise distributed information storage and retrieval, rather than distributed parallel processing. Retrieval and filtering are achieved through simple computable functions operating on k-element arrays, in contrast to the complex interactions of the independent units in connectlonist models. In figure 2, although the network of machines requires heterarchical control, the architecture can be considered to be at the lower end of the family of parallel processing machines \[i0\].</Paragraph>
  </Section>
class="xml-element"></Paper>