<?xml version="1.0" standalone="yes"?>
<Paper uid="C86-1113">
  <Title>Distributed Memory: A Basis for Chart Parsing</Title>
  <Section position="3" start_page="476" end_page="476" type="metho">
    <SectionTitle>
2. DISTRIBUTED CHART PARSING
REPRESENTATION
</SectionTitle>
    <Paragraph position="0"> The aim of chart parsing using distributed representations is to derive a composite memory vector which encodes the type of structural information shown in figure 1. The figure shows the constituent structure (C-structure) built for the sentence The man hit the boy.</Paragraph>
    <Paragraph position="1">  This structure incorporates features of different notations used to describe chart parsing schemes (Earley, 1970; Kay, 1980). The structure represents a passive lattice consisting of one cell, or an equivalent edge, per constituent. It embodies structural information relating to the immediate-dominance of constituents, as well as information about the range of terminals spanned by each constituent. This structure can be collapsed under the operations of association and concatenation to derive a composite memory vector representation which preserves both types of structural information. Each constituent category is encoded as a random memory vector; &lt;S&gt;, &lt;NP&gt;, &lt;DET&gt; and so on. The vertices of the chart (1-6 in figure 1 ) are mapped onto the set of decay indices registered by the working memory systems at each input point. Moving from left to right through the table shown in figure 1 the vectors are combined by tile concatenation operator. The dominance structure exhibited within the table is preserved by the association operator going from bottom to top. These procedures produce the following composite vector which encodes the information represented in figure 1 :</Paragraph>
    <Paragraph position="3"> Each component of the pattern consists of a vector encoding a constituent label associated with the input indices defining the left and right edges of the constituent. This form of representation is similar to the idea of assertion sets which have been shown to have useful properties as parsing representations (Barton and Berwick, 1985). One of the major advantages of the representation is that the two types of structural information are jointly accessible using the appropriate retrieval vector, that is category label vector. For example, if the composite vector is accessed using the retrieval vector &lt;VP&gt;, then the retrieved pattern encodes both the categories dominated by VP and the range of input terminals covered by VP.</Paragraph>
  </Section>
  <Section position="4" start_page="476" end_page="478" type="metho">
    <SectionTitle>
3. PARSING ARCHITECTURE
</SectionTitle>
    <Paragraph position="0"> A possible parsing architecture for realising the above scheme is shown in figure 2. The parsing mechanism consists of three distributed memory units, one permanent unit which stores context-free rule patterns, and two working memory units which encode temporary constituent structures. In figure 2, the stored rules pattern memory adds retrieved lists of rule constituents to the active patterns working memory.</Paragraph>
    <Paragraph position="1"> The double arrows on the lines connecting to the inactive patterns memory depict the two-way interactions between the inactive patterns and both the stored rule patterns and active patterns.</Paragraph>
    <Paragraph position="2"> The input to the parsing mechanism comprises the syntactic categories of each constituent of the input string. This input is received by the inactive patterns working memory and each new input triggers the indexing of patterns in accordance with the decay function. Thus, while a category label vector is held on the inactive patterns unit its decay index is continually up-dated.</Paragraph>
    <Paragraph position="3">  and the rules are stored in the permanent memory unit in terms of the following patterns: &lt;Cil &gt;*(&lt;Li&gt;*&lt;Cati&gt;//&lt;Ci2&gt;*. ...... *&lt;Cik&gt;*&lt;Li&gt; ) The vector &lt;Cil&gt; encodes the first constituent on the RHS of the CF rule labeled &lt;Li&gt;. When this constituent is input to the rule patterns store the split pattern with which it is associated is retrieved as output. The retrieved pattern is split in that the two halves of the pattern are output over different lines. The first half of the pattern, &lt;Li&gt;*&lt;Cati&gt;, is output to the inactive patterns unit, and the second half, &lt;Ci2&gt;*....*&lt;Cik&gt;*&lt;Li&gt;, is output to the active patterns memory unit. This retrieval process is equivalent to building an active edge in active chart parsing (Kay, 1980; Thompson, 1983). A major difference, however, is that the vector encoding the active edge is split over the two working memory units.</Paragraph>
    <Paragraph position="4"> The active patterns unit now encodes the list of remaining constituents which must occur for the RHS of rule &lt;Li&gt; to be satisfied. Meanwhile, the inactive patterns unit encodes the category of the hypothesised, or active, edge. If the list of necessary constituents is matched by later inptus then the label of the satisfied rule is retrieved on the active patterns unit and output to the inactive patterns store. This new input retrieves the constituent category associated with the rule label on the inactive patterns unit. This retrieval process is equivalent to building an inactive edge in standard active chart parsing. Each time a new inactive edge pattern is created it is output to both the other units to determine whether it satisfies any of the constituents of a  stored rule, or any remaining active patterns. Those active patterns which do not have their constituent elements matched rapidly decay, and fade out of the composite traces held on the two working memory units. When a rule is satisfied and an inactive pattern, or edge, is built the index range spanned by the rule is retrieved at the same time as the rule's category vector. Thus, an inactive pattern is encoded as &lt;Cati&gt;*&lt;m-q&gt;, where &lt;m-q&gt; encodes the range of input indices spanned by the category &lt;Cati&gt;.</Paragraph>
    <Paragraph position="5"> Within this scheme the alternative CF rules for a particular constituent category have to be encoded as separate rule patterns. For example, the rules for building NPs would include the following:  1: NP ---&gt; DETNP 2 2: NP---&gt; NP 2 3: NP---&gt; NP PP 4: NP 2--&gt; N 5: NP2---&gt; ADJ NP 2  To simulate a top-down parsing scheme the active elements encoded on the inactive patterns unit can function as retrieval vectors to the stored rules unit. This feedback loop enables a retrieved category to retrieve further rule patterns; those for which it is the first constituent. In this way, all the rule patterns which are applicable at a particular point in the parse are retrieved and held as active edges split across the two working memory units. On each cylce of feedback the newly retrieved rule categories are associated with the active elements which retrieved them to form patterns such as &lt;La&gt;*&lt;S&gt;*(&lt;Lb&gt;*&lt;NP&gt; ).</Paragraph>
    <Paragraph position="6"> On successfully parsing a sentence the inactive patterns unit holds a collection of inactive edge patterns which form a composite vector of the type described in section 2. All active patterns, and those inactive edges which are incompatible with the final parse rapidly decay leaving only the derived constituent structure in working memory. If an input sentence is syntactically ambiguous then all possible C-structures are encoded on the final composite vector. That is, as it stands the parsing scheme does not embody a theory of syntactic closure, or preference.</Paragraph>
  </Section>
  <Section position="5" start_page="478" end_page="478" type="metho">
    <SectionTitle>
4. A PARSING EXAMPLE
</SectionTitle>
    <Paragraph position="0"> The parsing architecture described above has been implemented on standard sequential processing machinery. To achieve this it was necessary to decide on computational functions for the three memory operators, and to build an agenda mechanism to schedule the ordering of the memory unit input/output processes. The association and retrieval operators were accurately simulated using convolution and correlation functions, respectively. The concatenation operator was simulated through a combination of vector addition and a normalisation function. All syntactic categories, rule labels, and input indices were encoded as randomly assigned 1000-element vectors.</Paragraph>
    <Paragraph position="1"> The agenda mechanism imposes the following sequences on input-output operations: \[A\] The input lines to the inactive patterns unit are processed in the order - constituent input, input from stored rule patterns unit, and finally, input from active patterns unit.</Paragraph>
    <Paragraph position="2"> \[B\] The retrieval and output operations for the units are cycled through in the order - activation of stored rule patterns, including top-down feedback; matching of active patterns, and building of inactive edge patterns.</Paragraph>
    <Paragraph position="3"> \[C\] The final operation is to increment the input indices, before accepting the next constituent input. To illustrate the operation of the parsing mechanism it is useful to consider an example. As a simple example, assume that the stored rule patterns unit holds the following context free rules:  1: NP---&gt; DETNP 2 6: S---&gt; NP VP 2: NP---&gt; NP 2 7: VP---&gt; V 3: NP ---&gt; NP PP 8: VP---&gt; V NP 4: NP 2-*-&gt; N 9: VP---&gt; VP PP 5: NP2---&gt; ADJ NP 2 10: PP---&gt; Prep NP  In parsing the sentence The old man the boats the lexical ambiguity associated with the words old and man produces the input string &lt;Det&gt; &lt;Adj+N&gt; &lt;N+V&gt; &lt;Det&gt; &lt;N&gt; The first input vector, &lt;Det&gt;, retrieves rule 1, setting up &lt;NP2&gt; on the active patterns unit and &lt;I&gt;*&lt;NP&gt;*(&lt;Det&gt;) on the inactive patterns unit (the input indices have been omitted to simplify the notation). Through feedback, rules 3 and 6 are also retrieved creating the pattern &lt;NP2&gt;*&lt;I&gt;+&lt;PP&gt;*&lt;3&gt;+&lt;VP&gt;*&lt;6&gt; on the active patterns unit, and the pattern &lt;6&gt;*&lt;s&gt;*((&lt;1 &gt;*&lt;NP&gt;*(&lt; Det&gt;)+(&lt;6&gt;*&lt;N P&gt;*(&lt;I &gt;*&lt;NP&gt;*(&lt;Det&gt;))) on the inactive patterns unit.</Paragraph>
    <Paragraph position="4"> On input of the second word which covers the categories Adj and N, the composite vector &lt;Adj+N&gt; retrieves rules 4 and 5, and through feedback rules 2, 3 and 6. The ,:N&gt; pattern component of the input vector satisfies the &lt;I&gt;*&lt;NP2&gt; pattern held on the active patterns unit which leads to the creation of the first inactive edge pattern, &lt;6&gt;*&lt;S&gt;*(&lt;NP&gt;*(&lt;Det&gt;+&lt;N&gt;)). The third input is also ambiguous and triggers a large number of rules, including rules 7 and 8. Again, the &lt;N&gt; pattern in the input vector triggers the construction of an NP inactive pattern &lt;6&gt;*&lt;S&gt;*(&lt;N P&gt;*(&lt;Det&gt;+&lt;Adj&gt;+&lt;N&gt;)).</Paragraph>
    <Paragraph position="5"> In addition, the retrieval of rule 7 produces the inactive edge pattern &lt;S&gt;*(&lt;NP&gt;*(&lt;Det&gt;+&lt;N&gt;)+&lt;VP&gt;*&lt;V&gt;)), corresponding to a premature interpretation of the input as The old(N) man(V). However, this pattern rapidly decays as new input is received.</Paragraph>
    <Paragraph position="6"> The final two inputs set up another &lt;NP&gt; inactive edge which when output to the active patterns unit retrieves the label vector associated with the constituents of rule 8. This produces a &lt;VP&gt; inactive edge which through feedback to the active patterns unit retrieves the label vector &lt;6&gt;, as it completes the &lt;S&gt; rule. In turn, an inactive edge pattern for &lt;S&gt; is created. To complete the parse, the period symbol, or corresponding vector, can be used to trigger an operation which refreshes only those inactive edges which contain the final &lt;S&gt; pattern. All other patterns decay rapidly leaving only the constituent structure pattern on the inactive patterns unit.</Paragraph>
    <Paragraph position="7"> This parsing example is not concordant with most people's reading of the sentence The old man the boats. In the first instance, the sentence tends to be parsed as two consecutive NPs, (The old man)(the boats), Such phenomena would tend to imply that the human parser is more deterministic then the present model suggests. However, the model is net a complete comprehension mechanism and other factors influence the final outcome of the parse. In particular, many of the phenomena associated with closure and ambiguity can be explained in terms of lexical items having preferred interpretations (Kaplan and Bresnan, 1982). In the present example, the preferred categories for old and man are Adj and N, respectively. Such an idea is easily accommodated within the present scheme in terms of the concept of pattern strength (see Slack, 1984b, for details).</Paragraph>
  </Section>
  <Section position="6" start_page="478" end_page="479" type="metho">
    <SectionTitle>
5. MACRO- AND MICRO-LEVEL
DESCRIPTIONS
</SectionTitle>
    <Paragraph position="0"> The parsing architecture described in section 3 specifies a viwtual machine which is compatible with both traditional sequential processing based on Von Neumann architecture, and massively parallel processing using architectures based on connectionist principles. The parsing scheme outlined in the last section represents a macro-level implementation of the distributed representation parsing mechanism. The memory operators and agenda control structures that determine the sequence of states of the mechanism are well-specified, and the parallelism implicit in the system is buried deep in the representation. However, the concept of a memory vector, and its theoretical operators, can also be mapped onto connectionist concepts to provide a micro-level implementation of the system.</Paragraph>
    <Paragraph position="1"> A connectionist system would employ three layers, or collections, of elementary processing units. These sets of units correspond to the three distributed memory units, and the gross connections between them would be the same. Within each set of units an item of  information is encoded as a distributed representation, that is, as a different pattern of activation over the units. The memory operations are modeled in terms of excitatory and inhibitory processes between units of the same and different layers. As with all connectionist models, it is necessary to delineate the principles which determine the settings of the input weights and unit thresholds. However, no global sequencing mechanism is required as the activation processes have their influence over a number of interative cycles. Each new input pattern stimulates the activity of the system, and through their mutual interactions the three sets of units gradually converge on an equilibrium state. Providing the connection weights and unit thresholds are set correctly, the two working memory layers should encode the types of active and inactive patterns described previously.</Paragraph>
    <Paragraph position="2"> The major advantage of the distributed representation parsing scheme is that it obviates the need for special-purpose binding units as used in other connectionist parsing systems (Selman and Hirst, 1985; Cottrell, 1985). The function of such units can be achieved through the appropriate use of the association operator.</Paragraph>
  </Section>
  <Section position="7" start_page="479" end_page="480" type="metho">
    <SectionTitle>
6. TOWARD A COMPLETE LANGUAGE INPUT
ANALYSER
</SectionTitle>
    <Paragraph position="0"> The main goal of this research is to establish the relationship between language understanding and memory. This involves building a complete language analysis system based on a homogenous set of memory processing principles. These principles would seem to include, (a) the interaction of mutual constraints, (b) a high degree of parallel processing, and (c) the necessity for distributed representations to facilitate the simple concatenation of multiple constraints. Such principles need to be matched against linguistic models and constructs to derive the most integrated account of linguistic phenomena.</Paragraph>
    <Paragraph position="1"> Within these goals, the present chart parsing scheme is not allied to any particular grammatical formalism. However, distributed memory systems have previously been used as the basis for a parsing mechanism based on the principles of lexical functional grammar (Slack, 1984a). Originally, this system incorporated a conventional CFG module, but this can now be simulated using distributed memory systems.</Paragraph>
    <Paragraph position="2"> Thus, the processing underlying the language analyser is based on a more homogenous set of principles, and more importantly, memory principles.</Paragraph>
    <Paragraph position="3"> As a component of the language analyser the CFG module generates a constituent structure comprising an augmented phrase-structure tree. This structure is derived using augmented CF rules of the form:  These augmented rules are easily assimilated into the present chart parsing scheme as follows: The functional equations {e.g., (t Subj=j,)} are replaced by grammatical function vectors; &lt;SUB J&gt;, &lt;OBJ&gt;, and so on. These vectors are encoded on the stored CF rule patterns as a third form of output associated with sub-patterns such as (&lt;6&gt;*&lt;S&gt;*&lt;NP&gt;). The function vectors are output to a working distributive memory system which encodes the functional structure of an input sentence. As long as its associated sub-pattern is active within the CFG module the appropriate function vector will be output. This means that a particular grammatical function vector will only be active in the system while the rule and constituent with which it is associated are also active.</Paragraph>
  </Section>
class="xml-element"></Paper>