<?xml version="1.0" standalone="yes"?>
<Paper uid="P91-1041">
  <Title>Quasi-Destructive Graph Unification</Title>
  <Section position="3" start_page="315" end_page="318" type="intro">
    <SectionTitle>
2. Our Scheme
</SectionTitle>
    <Paragraph position="0"> We introduce an algorithm that addresses the criteria for fast unification discussed in the previous sections. It also handles cycles without over-copying, and without any additional schemes such as those introduced by [Kogure, 1989].</Paragraph>
    <Paragraph position="1"> As a data structure, a node is represented with eight fields: type, arc-list, comp-arc-list, forward, copy, comp-arc-mark, forward-mark, and copy-mark. Although this number may seem high for a graph node data structure, the amount of memory consumed is not significantly different from that consumed by other algorithms. (Footnote 7: 'Early copying' will henceforth be used to refer to early copying as defined by us.)</Paragraph>
    <Paragraph position="2"> Type can be represented by three bits; comp-arc-mark, forward-mark, and copy-mark can be represented by short integers (i.e., fixnums); and comp-arc-list (just like arc-list) is a mere collection of pointers to memory locations. Thus this additional information is trivial in terms of memory cells consumed, and because of this data structure the unification algorithm itself can remain simple.</Paragraph>
    <Paragraph position="4"> The representation for an arc is no different from that of other unification algorithms. Each arc has two fields for 'label' and 'value'. 'Label' is an atomic symbol which labels the arc, and 'value' is a pointer to a node.</Paragraph>
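    <Paragraph> The node and arc layout described above can be sketched as follows. This is only an illustrative Python rendering, not the authors' implementation: the class names Node and Arc, the string type tags, and the default values are assumptions made for the example, while the eight field names follow the text.

```python
from dataclasses import dataclass, field, fields
from typing import Any, List, Optional


@dataclass(eq=False)  # identity-based equality, since graphs may share nodes or cycle
class Arc:
    label: str       # atomic symbol that labels the arc
    value: "Node"    # pointer to the destination node


@dataclass(eq=False)
class Node:
    type: str = "bottom"                                    # "atomic", "bottom", or "complex"
    arc_list: Any = field(default_factory=list)             # permanent arcs, or the atomic value
    comp_arc_list: List[Arc] = field(default_factory=list)  # arcs valid for one unification only
    forward: Optional["Node"] = None                        # forwarding pointer
    copy: Optional["Node"] = None                           # copy pointer
    comp_arc_mark: int = 0                                  # generation stamp for comp_arc_list
    forward_mark: int = 0                                   # generation stamp for forward
    copy_mark: int = 0                                      # generation stamp for copy


# A tiny graph: a complex node with one arc "cat" pointing to the atomic value Noun.
noun = Node(type="atomic", arc_list="Noun")
root = Node(type="complex", arc_list=[Arc("cat", noun)])
field_names = [f.name for f in fields(Node)]
```

The three mark fields are plain integers, which matches the observation that the extra fields cost little memory compared with the arc lists themselves.</Paragraph>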
    <Paragraph position="5"> The central notion of our algorithm is the dependency of the representational content on the global timing clock (or the global counter for the current generation of unification algorithms). This scheme was used in [Wroblewski, 1987] to invalidate the copy field of a node after one unification by incrementing a global counter. This is an extremely cheap operation but has the power to invalidate the copy fields of all nodes in the system simultaneously. In our algorithm, this dependency of the content of fields on global timing is adopted for arc lists, forwarding pointers, and copy pointers. Thus any modification made to any node in memory during one top-level unification (unify0), such as adding forwarding links, copy links, or arcs, can be invalidated by one increment operation on the global timing counter. During unification (in unify1) and copying after a successful unification, the global timing ID for a specific field is checked by comparing the content of the corresponding mark field with the global counter value: if they match, the content is respected; if not, it is simply ignored. Thus the whole operation is a trivial addition to the original destructive unification algorithm (Pereira's and Wroblewski's unify1).</Paragraph>
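    <Paragraph> A minimal sketch of the generation-counter idea, shown for the copy field only. The names Clock, set_copy, and current_copy are hypothetical helpers introduced for this illustration, and the starting value 10 is simply chosen to agree with the paper's footnote on forward marks.

```python
class Node:
    def __init__(self):
        self.copy = None      # copy pointer
        self.copy_mark = 0    # counter value at the time the copy pointer was written


class Clock:
    """Hypothetical holder for the global timing counter."""

    def __init__(self, start=10):
        self.value = start

    def tick(self):
        # One increment invalidates every stamped field in every node at once.
        self.value += 1


def set_copy(node, copy, clock):
    # Write the field and stamp it with the current generation.
    node.copy = copy
    node.copy_mark = clock.value


def current_copy(node, clock):
    # The content is respected only if its stamp matches the current counter.
    if node.copy_mark == clock.value:
        return node.copy
    return None


clock = Clock()
a, b = Node(), Node()
set_copy(a, Node(), clock)
set_copy(b, Node(), clock)
valid_before = current_copy(a, clock) is not None   # stamps match the counter
clock.tick()                                        # no node is visited here
valid_after = current_copy(a, clock) is not None or current_copy(b, clock) is not None
```

No node is touched by tick; the stale pointers stay in memory and are ignored on the next read, which is why the invalidation cost does not depend on the number of modified nodes.</Paragraph>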
    <Paragraph position="6"> We have two kinds of arc lists: 1) arc-list and 2) comp-arc-list. Arc-list contains the arcs that are permanent (i.e., usual graph arcs), and comp-arc-list contains arcs that are only valid during one graph unification operation. We also have two kinds of forwarding links, i.e., permanent and temporary. A permanent forwarding link is the usual forwarding link found in other algorithms ([Pereira, 1985], [Wroblewski, 1987], etc.).</Paragraph>
    <Paragraph position="7"> Temporary forwarding links are links that are only valid during one unification. The currency of a temporary link is determined by comparing the content of its mark field with the global counter; if they match, the content of the forward field is respected [footnote 8]. As in [Pereira, 1985], we have three types of nodes: 1) :atomic, 2) :bottom [footnote 9], and 3) :complex. :atomic type nodes represent atomic symbol values (such as Noun), :bottom type nodes are variables, and :complex type nodes are nodes that have arcs coming out of them.</Paragraph>
    <Paragraph position="8"> Arcs are stored in the arc-list field. The atomic value is also stored in the arc-list if the node type is :atomic.</Paragraph>
    <Paragraph position="9"> :bottom nodes succeed in unifying with any nodes and the result of unification takes the type and the value of the node that the :bottom node was unified with.</Paragraph>
    <Paragraph position="10"> :atomic nodes succeed in unifying with :bottom nodes or :atomic nodes with the same value (stored in the arc-list). Unification of an :atomic node with a :complex node immediately fails. :complex nodes succeed in unifying with :bottom nodes or with :complex nodes whose subgraphs all unify. Arc values are always nodes and never symbolic values because the :atomic and :bottom nodes may be pointed to by multiple arcs (just as in structure sharing of :complex nodes) depending on grammar constraints, and we do not want arcs to contain terminal atomic values. Figure 2 is the central quasi-destructive graph unification algorithm and Figure 3 shows the algorithm for copying nodes and arcs (called by unify0) while respecting the contents of comp-arc-lists.</Paragraph>
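    <Paragraph> The type rules above can be illustrated with a small success test. This sketch is not the algorithm of Figure 2: it only decides whether two graphs would unify, it ignores forwarding, reentrancy, and cycles, and it assumes (for readability) a separate value field and a label-to-node dictionary instead of storing everything in arc-list. The name unifiable is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(eq=False)
class Node:
    type: str                                   # "atomic", "bottom", or "complex"
    value: Optional[str] = None                 # atomic symbol, used when type is "atomic"
    arcs: Dict[str, "Node"] = field(default_factory=dict)   # label to destination node


def unifiable(a, b):
    # :bottom nodes (variables) unify with any node.
    if a.type == "bottom" or b.type == "bottom":
        return True
    # :atomic nodes unify only with :atomic nodes carrying the same value.
    if a.type == "atomic" and b.type == "atomic":
        return a.value == b.value
    # :atomic against :complex fails immediately.
    if a.type != b.type:
        return False
    # Both :complex: every arc label present in both graphs must unify.
    shared = [label for label in a.arcs if label in b.arcs]
    return all(unifiable(a.arcs[label], b.arcs[label]) for label in shared)


noun = Node("atomic", "Noun")
verb = Node("atomic", "Verb")
np = Node("complex", arcs={"cat": noun, "agr": Node("bottom")})
other_np = Node("complex", arcs={"cat": Node("atomic", "Noun")})
vp = Node("complex", arcs={"cat": verb})
results = (unifiable(np, other_np), unifiable(np, vp), unifiable(noun, np))
```

Arcs that appear in only one of the two complex nodes never cause failure; they are the complement arcs that the full algorithm carries over to the result.</Paragraph>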
    <Paragraph position="11"> The functions Complementarcs(dg1,dg2) and Intersectarcs(dg1,dg2) are similar to those in Wroblewski's algorithm and return the set-difference (the arcs with labels that exist in dg1 but not in dg2) and the intersection (the arcs with labels that exist in both dg1 and dg2), respectively. During the set-difference and set-intersection operations, the contents of comp-arc-lists are respected as parts of arc lists if the comp-arc-marks match the current value of the global timing counter. Dereference-dg(dg) recursively traverses the forwarding link to return the forwarded node. In doing so, it checks the forward-mark of the node, and if the forward-mark value is 9 (9 represents a permanent forwarding link) or its value matches the current value of *unify-global-counter*, then the function returns the forwarded node; otherwise it simply returns the input node. (Footnote 8: We do not have a separate field for temporary forwarding links; instead, we designate the integer value 9 to represent a permanent forwarding link. We start incrementing the global counter from 10, so whenever the forward-mark is not 9 the integer value must equal the global counter value to respect the forwarding link.)</Paragraph>
    <Paragraph position="12"> Forward(dg1, dg2, :forward-type) puts (the pointer to) dg2 in the forward field of dg1. If the keyword in the function call is :temporary, the current value of the *unify-global-counter* is written in the forward-mark field of dg1. If the keyword is :permanent, 9 is written in the forward-mark field of dg1. Our algorithm itself does not require any permanent forwarding; however, the functionality is added because the grammar reader module that reads the path equation specifications into dg feature-structures uses permanent forwarding to merge the additional grammatical specifications into a graph structure [footnote 10]. The temporary forwarding links are necessary to handle reentrancy and cycles. As soon as unification (at any level of recursion through shared arcs) succeeds, a temporary forwarding link is made from dg2 to dg1 (dg1 to dg2 if dg1 is of type :bottom). Thus, during unification, a node already unified by other recursive calls to unify1 within the same unify0 call has a temporary forwarding link from dg2 to dg1 (or dg1 to dg2). As a result, if this node becomes an input argument node, dereferencing the node causes dg1 and dg2 to become the same node and unification immediately succeeds.</Paragraph>
    <Paragraph position="13"> Thus a subgraph below an already unified node will not be checked more than once even if an argument graph has a cycle. Also, during the copying performed after a successful unification, two arcs converging into the same node will not cause over-copying, simply because if a node already has a copy then that copy is returned.</Paragraph>
    <Paragraph position="14"> For example, as a case that may cause over-copying of dg2 convergent arcs in other schemes, let us consider the case in which the destination node has a corresponding node in dg1 and only one of the convergent arcs has a corresponding arc in dg1. This destination node is already temporarily forwarded to the node in dg1 (since the unification check was successful prior to copying).</Paragraph>
    <Paragraph position="15"> Once a copy is created for the corresponding dg1 node and recorded in the copy field of dg1, every time a convergent arc in dg2 that needs to be copied points to its destination node, dereferencing the node returns the corresponding node in dg1, and since a copy of it already exists, this copy is returned. Thus no duplicate copy is created [footnote 11].</Paragraph>
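    <Paragraph> The copy-once behaviour described above can be sketched as follows. This is a simplified illustration rather than the algorithm of Figure 3: it copies permanent arcs only (comp-arc-lists are left out), follows temporary links only, and uses hypothetical names (copy_dg, a dictionary of arcs).

```python
class Node:
    def __init__(self):
        self.arcs = {}            # label to destination node (permanent arcs only)
        self.forward = None
        self.forward_mark = 0
        self.copy = None
        self.copy_mark = 0


def dereference(dg, counter):
    # Follow forwarding links made during the current generation.
    while dg.forward is not None and dg.forward_mark == counter:
        dg = dg.forward
    return dg


def copy_dg(dg, counter):
    dg = dereference(dg, counter)
    # A copy stamped with the current counter is reused instead of being rebuilt.
    if dg.copy is not None and dg.copy_mark == counter:
        return dg.copy
    new = Node()
    # Record the copy before descending so that cycles terminate.
    dg.copy, dg.copy_mark = new, counter
    for label, child in dg.arcs.items():
        new.arcs[label] = copy_dg(child, counter)
    return new


counter = 10
d1, d2 = Node(), Node()
d2.forward, d2.forward_mark = d1, counter    # d2 was unified with d1 in this generation
root = Node()
root.arcs["x"] = d2                          # two convergent arcs reach the same node
root.arcs["y"] = d2
root.arcs["self"] = root                     # a cycle back to the root
result = copy_dg(root, counter)
shared = result.arcs["x"] is result.arcs["y"]
cyclic = result.arcs["self"] is result
```

Stamping the copy field before visiting the children is what lets the same function handle both convergent arcs and cycles without a separate visited set.</Paragraph>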
    <Paragraph position="16"> Footnote 10: We have been using Wroblewski's algorithm for the unification part of the parser, and thus usage of (permanent) forwarding links is adopted by the grammar reader module to convert path equations to graphs. For example, permanent forwarding is done when a :bottom node is to be merged with other nodes.</Paragraph>
    <Paragraph position="17"> Footnote 11: Copying of dg2 arcs happens for arcs that exist in dg2 but not in dg1 (i.e., Complementarcs(dg2,dg1)). Such arcs are pushed to the comp-arc-list of dg1 during unify1 and are copied into the arc-list of the copy during subsequent copying. If there is a cycle or a convergence in arcs in dg1 or in arcs in dg2 that do not have corresponding arcs in dg1, then the mechanism is even simpler than the one discussed here. A copy is made once, and the same copy is simply returned</Paragraph>
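    <Paragraph> The set operations on arcs, and the way complement arcs from dg2 are pushed to the comp-arc-list of dg1, might look like the sketch below. It is written under assumptions: visible_arcs is a hypothetical helper, arc values are left as None for brevity, and returning pairs of arcs from intersect_arcs is a choice made for this example rather than something the text specifies.

```python
class Arc:
    def __init__(self, label, value=None):
        self.label = label
        self.value = value


class Node:
    def __init__(self, labels=()):
        self.arc_list = [Arc(label) for label in labels]   # permanent arcs
        self.comp_arc_list = []                            # arcs valid for one unification
        self.comp_arc_mark = 0


def visible_arcs(dg, counter):
    # comp-arc-list counts as part of the arc list only if its mark is current.
    arcs = list(dg.arc_list)
    if dg.comp_arc_mark == counter:
        arcs.extend(dg.comp_arc_list)
    return arcs


def complement_arcs(dg1, dg2, counter):
    # Arcs whose labels exist in dg1 but not in dg2.
    labels2 = {arc.label for arc in visible_arcs(dg2, counter)}
    return [arc for arc in visible_arcs(dg1, counter) if arc.label not in labels2]


def intersect_arcs(dg1, dg2, counter):
    # Pairs of arcs whose labels exist in both dg1 and dg2.
    by_label = {arc.label: arc for arc in visible_arcs(dg2, counter)}
    return [(arc, by_label[arc.label])
            for arc in visible_arcs(dg1, counter) if arc.label in by_label]


counter = 10
dg1, dg2 = Node(["a", "b"]), Node(["b", "c"])
new_arcs = complement_arcs(dg2, dg1, counter)        # arcs only in dg2
dg1.comp_arc_list, dg1.comp_arc_mark = new_arcs, counter
now = [arc.label for arc in visible_arcs(dg1, counter)]
later = [arc.label for arc in visible_arcs(dg1, counter + 1)]
common = [pair[0].label for pair in intersect_arcs(dg1, dg2, counter)]
```

After the counter is incremented, dg1 shows only its permanent arcs again, so a failed or finished unification leaves the original graph unchanged.</Paragraph>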
    <Paragraph position="19"/>
  </Section>
</Paper>