<?xml version="1.0" standalone="yes"?> <Paper uid="A00-2022"> <Title>O~ Proand Retroactive Packing I \] o passive edges \</Title> <Section position="4" start_page="162" end_page="163" type="intro"> <SectionTitle> 2 Efficient Subsumption and </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="162" end_page="163" type="sub_section"> <SectionTitle> Equivalence Algorithms </SectionTitle> <Paragraph position="0"> Our feature structure subsumption algorithm (footnote 2) assumes totally well-typed structures (Carpenter, 1992) and employs similar machinery to the quasi-destructive unification algorithm described by Tomabechi (1991). In particular, it uses temporary pointers in dag nodes, each pointer tagged with a generation counter, to keep track of intermediate results in processing; incrementing the generation counter invalidates all temporary pointers in a single operation. But whereas quasi-destructive unification makes two passes (determining whether the unification will be successful and then copying out the intermediate representation), the subsumption algorithm makes only one pass, checking reentrancies and type-supertype relationships at the same time (footnote 3). The algorithm, shown in Figure 1, also simultaneously tests if both feature structures subsume each other (i.e. they are equivalent), if either subsumes the other, or if there is no subsumption relation between them in either direction.</Paragraph> <Paragraph position="1"> The top-level entry point dag-subsumes-p() and subsidiary function dag-subsumes-p0() each return two values, held in variables forwardp and backwardp, both initially true, recording whether it is possible that the first dag subsumes the second and/or vice versa, respectively. When one of these possibilities has been ruled out, the appropriate variable is set to false; in the statement of the algorithm the two returned values are notated as a pair, i.e. (forwardp, backwardp).</Paragraph> <Paragraph position="2"> 
If at any stage both variables have been set to false, the possibility of subsumption in both directions has been ruled out, so the algorithm exits.</Paragraph> <Paragraph position="3"> The (recursive) subsidiary function dag-subsumes-p0() does most of the work, traversing the two input dags in step. [Footnote: ... the type of the value of p in F is a supertype of or equal to the value in G, and (2) all paths that are reentrant in F are also reentrant in G.]</Paragraph> <Paragraph position="4"> First, it checks whether the current node in either dag is involved in a reentrancy that is not present in the other: for each node visited in one dag it adds a temporary pointer (held in the 'copy' slot) to the corresponding node in the other dag. If a node is reached that already has a pointer, then this is a point of reentrancy in the dag, and if the pointer is not identical to the other dag node, then this reentrancy is not present in the other dag.</Paragraph> <Paragraph position="5"> In this case, the possibility that the former dag subsumes the latter is ruled out. After the reentrancy check, the type-supertype relationship between the types at the current nodes in the two dags is determined, and if one type is not equal to or a supertype of the other, then subsumption cannot hold in that direction. Finally, after successfully checking the type-supertype relationships, the function recurses into the arcs outgoing from each node that have the same label. Since we are assuming totally well-typed feature structures, it must be the case that either the sets of arc labels in the two dags are the same, or one is a strict superset of the other. 
Only arcs with the same labels need be processed; extra arcs need not, since the type-supertype check at the two nodes will already have determined that the feature structure containing the extra arcs must be subsumed by the other, and they merely serve to further specify it and cannot affect the final result.</Paragraph> <Paragraph position="6"> Our implementation of the algorithm contains extra redundant but cheap optimizations which for reasons of clarity are not shown in Figure 1; these include tests that forwardp is true immediately before the first supertype check and that backwardp is true before the second (footnote 4). The use of temporary pointers means that the space complexity of the algorithm is linear in the sum of the sizes of the feature structures. However, in our implementation the 'copy' slot that the pointers occupy is already present in each dag node (it is required for the final phase of unification to store new nodes representing equivalence classes), so in practice the subsumption test does not allocate any new storage. All pointer references take constant time since there are no chains of 'forwarded' pointers (forwarding takes place only during the course of unification and no forwarded pointers are left afterwards). Assuming the supertype tests can be carried out in constant time (e.g. by table lookup), and that the grammar allows us to put a small constant upper bound on the intersection of outgoing arcs from each node, the processing in the body of dag-subsumes-p0() takes unit time. The body may be executed up to N times where N is the number of nodes in the smaller of the two feature structures. So overall the algorithm has linear time complexity. In practice, our implementation (in the environment described in Section 4) performs of the order of 34,000 top-level feature structure subsumption tests per second. 
[Footnote 4: There is scope for further optimisation of the algorithm in the case where dag1 and dag2 are identical: full processing inside the structure is not required (since all nodes inside it will be identical between the two dags and any strictly internal reentrancies will necessarily be the same), but we would still need to assign temporary pointers inside it so that any external reentrancies into the structure would be treated correctly. In our tests we have found that, as far as constituents that are candidates for local ambiguity packing are concerned, there is in fact little equality of structures between them, so special equality processing does not justify the extra complication.]</Paragraph> </Section> </Section> </Paper>