<?xml version="1.0" standalone="yes"?>
<Paper uid="P91-1041">
  <Title>Quasi-Destructive Graph Unification</Title>
  <Section position="5" start_page="319" end_page="321" type="evalu">
    <SectionTitle>
3. Experiments
</SectionTitle>
    <Paragraph position="0"> Table 1 shows the results of our experiments using an HPSG-based Japanese grammar developed at ATR for a conference registration telephone dialogue domain.</Paragraph>
    <Paragraph position="1"> 19h may be slightly slower becauseour unification recurses twice on a graph: once to unify and once to copy, whereas in incremental unification schemes copying is performed during the same recursion as unifying. Additional bookkeeping for incremental copying and an additional set-difference operation (i.e, complementarcs(dgl,dg2)) during unify2 may offset this, however.</Paragraph>
    <Paragraph position="2"> 'Unifs' represents the total number of unifications during a parse (the number of calls to the top-level 'unifydg', and not 'unifyl'). 'USrate' represents the ratio of successful unifications to the total number of unifications. We parsed each sentence three times on a Symbolics 3620 using both unification methods and took the shortest elapsed time for both methods ('T' represents our scheme, 'W' represents Wroblewski's algorithm with a modification to handle cycles and variables2deg). Data structures are the same for both unification algorithms (except for additional fields for a node in our algorithm, i.e., comp-arc-list, comp-arcmark, and forward-mark). Same functions are used to interface with Earley's parser and the same subfunctions are used wherever possible (such as creation and access of arcs) to minimize the differences that are not purely algorithmic. 'Number of copies' represents the number of nodes created during each parse (and does not include the number of arc structures that are created during a parse). 'Number of conses' represents the amount of structure words consed during a parse. This number represents the real comparison of the amount of space being consumed by each unification algorithm 0ncluding added fields for nodes in our algorithm and arcs that are created in both algorithms).</Paragraph>
    <Paragraph position="3"> We used Earley's parsing algorithm for the experiment. The Japanese grammar is based on HPSG analysis (\[Pollard and Sag, 1987\]) covering phenomena such as coordination, case adjunction, adjuncts, control, slash categories, zero-pronouns, interrogatives, WH constructs, and some pragmatics (speaker, hearer relations, politeness, etc.) (\[Yoshimoto and Kogure, 1989\]). The grammar covers many of the important linguistic phenomena in conversational Japanese. The grammar graphs which are converted from the path equations contain 2324 nodes. We used 16 sentences from a sample telephone conversation dialog which range from very short sentences (one word, i.e., iie 'no') to relatively long ones (such as soredehakochirakarasochiranitourokuyoushiwoookuriitashimasu &amp;quot; In that case, we \[speaker\] will send you \[hearer\] the registration form.'). Thus, the number of (top-level) unifications per sentence varied widely (from 6 to over 500).</Paragraph>
    <Paragraph position="4"> ~Cycles can be handled in Wroblewski's algorithm by checking whether an arc with the same label already exists when arcs are added to a node. And ff such an arc already exists, we destructively unify the node which is the destination of the existing arc with the node which is the destination of the arc being added. If such an arc does not exist, we simply add the arc. (\[Kogure, 1989\]). Thus, cycles can be handled very cheaply in Wroblewski's algorithm. Handling variables in Wroblewski's algorithm is basically the same as in our algorithm (i.e., Pereira's scheme), and the addition of this functionality can be ignored in terms of comparison to our algorithm. Our algorithm does not require any additional scheme to handle cycles in input dgs.</Paragraph>
    <Section position="1" start_page="320" end_page="321" type="sub_section">
      <SectionTitle>
Approaches
</SectionTitle>
      <Paragraph position="0"> The control structure of our algorithm is identical to that of \[Pereira, 1985\]. However, instead of storing changes to the argument (lags in the environment we store the changes in the (lags themselves nondestructively. Because we do not use the environment, the log(d) overhead (where d is the number of nodes in a dag) associated with Pereira's scheme that is required during node access (to assemble the whole dag from the skeleton and the updates in the environment) is avoided in our scheme. We share the principle of storing changes in a restorable way with \[Karttunen, 1986\]'s reversible unification and copy graphs only after a successful unification. Karttunen originally introduced this scheme in order to replace the less efficient structure-sharing implementations (\[Pereira, 1985\], \[Karttunen and Kay, 1985\]). In Karttunen's method 21, whenever a destructive change is about to be made, the attribute value pairs 22 stored in the body of the node are saved into an array. The dag node structure itself is also saved in another array. These values are restored after the top level unification is completed.</Paragraph>
      <Paragraph position="1"> (A copy is made prior to the restoration operation if the unification was a successful one.) The difference between Karttunen's method and ours is that in our algorithm, one increment to the global counter can invalidate all the changes made to nodes, while in Karttunen's algorithm each node in the entire argument graph that has been destructively modified must be restored separately by retrieving the attribute-values saved in an 21The discussion ofKartunnen's method is based on the D-PATR implementation on Xerox 1100 machines (\[Karttunen, 1986\]).</Paragraph>
      <Paragraph position="2"> ~'Le., arc structures: 'label' and 'value' pairs in our vocabulary.</Paragraph>
      <Paragraph position="3"> array and resetting the values into the dag structure skeletons saved in another array. In both Karttunen's and our algorithm, there will be a non-destructive (reversible, and quasi-destructive) saving of intersection arcs that may be wasted when a subgraph of a particular node successfully unifies but the final unification fails due to a failure in some other part of the argument graphs. This is not a problem in our method because the temporary change made to a node is performed as pushing pointers into already existing structures (nodes) and it does not require entirely new structures to be created and dynamically allocated memory (which was necessary for the copy (create-node) operation), z3 \[Godden, 1990\] presents a method of using lazy evaluation in unification which seems to be one SUCC~sful actualization of \[Karttunen and Kay, 1985\]'s lazy evaluation idea. One question about lazy evaluation is that the efficiency of lazy evaluation varies depending upon the particular hardware and programming language environment. For example, in CommonLisp, to attain a lazy evalaa_tion, as soon as a function is delayed, a closure (or a structure) needs to be created receiving a dynamic allocation of memory Oust as in creating a copy node). Thus, there is a shift of memory and associated computation consumed from making copies to making closures. In terms of memory cells saved, although the lazy scheme may reduce the total number of copies created, if we consider the memory consumed to create closures, the saving may be significantly canceled. In terms of speed, since delayed evaluation requires additional bookkeeping, how schemes such as the one introduced by \[Godden, 1990\] would compare with nonlazy incremental copying schemes is an open question.</Paragraph>
      <Paragraph position="4"> Unfortunately Godden offers a comparison of his algo-Z3Although, in Karttunen's method it may become rather expensive ff the arrays require resizing during the saving operation of the subgraphs.</Paragraph>
      <Paragraph position="5">  rithm with one that uses a full copying method (i.e. his Eager Copying) which is already significantly slower than Wroblewski's algorithm. However, no comparison is offered with prevailing unification schemes such as Wroblewski's. With the complexity for lazy evaluation and the memory consumed for delayed closures added, it is hard to estimate whether lazy unification runs considerably faster than Wroblewski's incremental copying scheme, ~</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>