<?xml version="1.0" standalone="yes"?> <Paper uid="C92-4185"> <Title>UNIFYING DISJUNCTIVE FEATURE STRUCTURES</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Disjunction is an important extension to feature structure languages since it increases the compactness of the descriptions. The main problem with including disjunction in the structures is that the unification operation becomes NP-complete. Therefore there have been many proposals on how to unify disjunctive feature structures, the most important being Karttunen's (1984) unification with constraints, Kasper's (1987) unification by successive approximation, Eisele and Dörre's (1988) value unification and lately Eisele and Dörre's (1990a, b) unification with named disjunctions. Since Kasper's and Eisele and Dörre's algorithms seem to be more general and efficient than Karttunen's algorithm I will restrict my discussion to them.</Paragraph> <Paragraph position="1"> In Kasper's algorithm the structures to be unified are divided into two parts, one that does not contain any disjunctions and one that is a conjunction of all disjunctions in the structure. The idea is to unify the non-disjunctive parts first and then unify the result with the disjunctions, thus trying to exclude as many alternatives as possible. The last step is to compare all disjunctions with each other, making it possible to discard further alternatives. It is this comparison that is expensive. The algorithm is always expensive for disjunctions, regardless of whether they contain path equivalences or not and independent of whether they are affected by the unification or not. This is due to the representation, where all disjunctions are moved to the top level of the structure, which means that larger parts of the structures are moved into the disjunctions and must be compared by the algorithm. 
Carter (1990) has developed this algorithm further, improving its efficiency when used together with bottom-up parsing.</Paragraph> <Paragraph position="2"> Eisele and Dörre's (1988) approach is based on the fact that unification of path equivalences should return not only a local value, but also a global value that affects some other part of the structure. Their solution is to compute the local value and save the global value as a global result. The global results will be unified with the result of the first unification. This new unification can also generate a new global disjunction so that the unification with global results will be repeated until no new global result is generated. This solution generates at least one, but often more than one, extra unification for each path equivalence. Thus, the algorithm is always expensive for path equivalences, regardless of whether they are contained inside disjunctions or not.</Paragraph> <Paragraph position="3"> The approach taken by Eisele and Dörre (1990) is similar to the approach taken in this paper. They use 'named disjunction' (Kaplan and Maxwell 1989), and one of their central ideas, i.e. to use a disjunction as the value of a variable to decide when the value is dependent on the choice in some disjunction, is similar to the way of unifying variables in the present paper. However, they use feature terms for representing the structures and their algorithm is described by a set of rewrite rules for feature terms.</Paragraph> <Paragraph position="4"> This makes the algorithm different from algorithms described for graph unification.</Paragraph> <Paragraph position="5"> What is special with the algorithm in the present paper is that it is 1. As efficient as an algorithm not handling disjunction when the participating structures do not contain any disjunctions.</Paragraph> <Paragraph position="6"> 2. 
As efficient as an algorithm allowing only local disjunctions when the participating structures only contain such disjunctions.</Paragraph> <Paragraph position="7"> 3. Expensive only when non-local disjunction is involved.</Paragraph> <Paragraph position="8"> The description is given in a way that makes the algorithm easy to implement as an extension of a graph unification algorithm.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 The Formulas </SectionTitle> <Paragraph position="0"> Feature structures are represented by formulas. The syntax of the formulas, especially the way of constructing complex graphs, is chosen so as to get a close relation to feature structures. This also makes it easy to construct a unification procedure similar to graph unification and give the formulas a semantics based on graph models. [ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992, p. 1167; PROC. OF COLING-92, NANTES, AUG. 23-28, 1992] For disjunction a generalization of Kaplan and Maxwell's (1989) 'named disjunction' is used. Their idea is to give the disjunctions names so that it is possible to restrict the choices in them. Kaplan and Maxwell use only binary disjunctions, and if the left alternative in one disjunction is chosen the left alternative in all disjunctions with the same name has to be chosen. In this paper I do not restrict the algorithm to binary disjunctions. Instead of giving the disjunction a name I give each alternative a name. Alternatives with the same name are then connected so that if one of them is chosen we also have to choose all the others.</Paragraph> <Paragraph position="1"> We assume four basic sets $A$, $F$, $X$ and $\Sigma$ of atoms, feature attributes, variables and disjunction switches respectively. These sets contain symbols denoted by strings. They are all assumed to be enumerable and pairwise disjoint. From these basic sets we define the set $S$ of feature structures. 
$S$ contains, among other structures, disjunctions $\{\sigma_1:s_1 \ldots \sigma_n:s_n\}$ such that $\sigma_i \neq \sigma_j$ for $i \neq j$. A formula is defined to be a pair $\langle s, v \rangle$ where $s$ is a feature structure and $v: X \to S$ a valuation function that assigns structures to variables. We demand that the formulas are acyclic.</Paragraph> <Paragraph position="2"> An example of a formula is given in figure 1. Variables are denoted by using the symbol # and a number. The same formula is also given in matrix format which will be used to make the examples easier to read.</Paragraph> <Paragraph position="3"> $\langle [a:[e:\#1],\ b:3,\ c:\#1],\ \{(\#1, [d:4])\} \rangle$ Figure 1 We can observe that according to this definition formulas are not unambiguously determined. The same formula can for example be expressed with different variables. There is also nothing said about the value of the valuation function $v$ for variables not occurring in the formulas.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 Semantics </SectionTitle> <Paragraph position="0"> The semantics given for these formulas is similar to the one given by Kasper and Rounds (1986) for their logic of feature structures. This logic is modified in the same way as in Reape (1991) to allow for the use of variables instead of equational constraints as used by Kasper and Rounds. Like Kasper and Rounds, I will use a graph model for the formulas where each formula is satisfied by a set of graphs. I will use $\delta$ to denote the transition function between nodes in the graph. We also need to define a valuation to describe the semantics of variables. Given a graph, a valuation is a function $V: X \to N$. By this function every variable is assigned a node in the graph as its value.</Paragraph> <Paragraph position="1"> Satisfaction is defined by the following rules. The model $M = \langle G, V, L \rangle$, where $G$ is a graph, $V$ a valuation and $L$ a subset of the switches occurring in the formula, satisfies a formula at node $i$ iff it fulfils any of these cases. 
I will use the notation sat($i$) when node $i$ in the graph satisfies a formula.</Paragraph> <Paragraph position="3"> These rules correspond to the usual satisfaction definitions for feature structures. The subset of switches $L$ forces us to choose exactly one alternative in each disjunction and the model should satisfy this alternative.</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 Unification </SectionTitle> <Paragraph position="0"> In this section I will define a set of rewrite rules for computing the unification of two formulas. I will start by introducing the operator $\wedge$ into our formulas. The syntax and semantics is given by the following rules: * $M\ \mathrm{sat}(i)\ fs_1 \wedge fs_2$ iff $fs_1$ and $fs_2$ are formulas and $M\ \mathrm{sat}(i)\ fs_1$ and $M\ \mathrm{sat}(i)\ fs_2$</Paragraph> <Paragraph position="2"> The operator $\wedge$ can be viewed as the unification operator. By the definition we can see that it is interpreted as a conjunction or intersection of the two participating formulas, which is the normal interpretation of unification. The task of unifying two formulas is then the task of rewriting two formulas containing $\wedge$ into a formula not containing $\wedge$. Here we can note that since a formula is not unambiguously determined the unified formula is not unique. Actually there is a set of formulas that all have the same model as the unification of the formulas. The aim here is to compute one of these formulas as a representative for this set, and thus a representative for the unification of $fs_1$ and $fs_2$. The rewrite rules given below correspond to the unification algorithm for formulas not containing disjunction.</Paragraph> <Paragraph position="3"> 1. $\langle s_1, v_1 \rangle \wedge \langle s_2, v_2 \rangle \Rightarrow \langle s_1 \wedge s_2, v \rangle$ if $v_1$ and $v_2$ are disjoint and $v(x) = v_1(x)$ for all $x$ in $v_1$, $v(x) = v_2(x)$ for all $x$ in $v_2$.</Paragraph> <Paragraph position="4"> 2. $\langle s_1 \wedge s_2, v \rangle \Rightarrow \langle s_2 \wedge s_1, v \rangle$ 3. $\langle \top \wedge s, v \rangle \Rightarrow \langle s, v \rangle$ 4. $\langle a \wedge a, v \rangle \Rightarrow \langle a, v \rangle$ where $a \in A$ 5. $\langle a \wedge b, v \rangle \Rightarrow \langle \bot, v \rangle$ where $a \neq b$ and $a, b \in A$ 6. $\langle a \wedge [f_1:s_1 \ldots f_n:s_n], v \rangle \Rightarrow \langle \bot, v \rangle$ where $a \in A$ 7. $\langle \bot \wedge s, v \rangle \Rightarrow \langle \bot, v \rangle$ 8. $\langle x \wedge s, v \rangle \Rightarrow \langle x, v_2 \rangle$ where $\langle v(x) \wedge s, v \rangle \Rightarrow \langle s_1, v_1 \rangle$ and $x \in X$, $v_2(x) = s_1$ and $v_2 = v_1$ for all other variables 9. $\langle [f_{11}:s_{11} \ldots f_{1n}:s_{1n}] \wedge [f_{21}:s_{21} \ldots f_{2m}:s_{2m}], v \rangle \Rightarrow \langle s, v_p \rangle$ where $s$ is the complex feature structure containing: $f_{1j}:s_{1j}$ for any $j$ such that $f_{1j} \neq f_{2k}$ for all $k$; $f_{2j}:s_{2j}$ for any $j$ such that $f_{2j} \neq f_{1k}$ for all $k$; $f_{1j}:s_{3i}$ for any $j, k$ such that $f_{1j} = f_{2k}$, where $\langle s_{1j} \wedge s_{2k}, v_{i-1} \rangle \Rightarrow \langle s_{3i}, v_i \rangle$</Paragraph> <Paragraph position="6"> and $i$ describes some enumeration of the resulting formulas, $v_0 = v$ and $\langle s_{3p}, v_p \rangle$ is the last of the formulas.</Paragraph> <Paragraph position="7"> The first rule is a kind of entry rule and can be interpreted as saying that it is possible to unify two formulas if the variables occurring within them are disjoint. The second rule says that unification is commutative, and is used to avoid duplicating the other rules. The next rule says that $\top$ unifies with everything. Rules four to six say that an atom only unifies with itself and becomes failure when unified with some other atom or a complex structure. The seventh rule says that unifying failure always yields failure. The eighth rule deals with unification of variables. Here we have to start with unifying the value of the variable with the other structure. This unification gives a new pair of feature structure and valuation function as result, where the new valuation function contains the changes of variables that have been made during this unification. The result of the unification of a variable is the pair of the variable and the new valuation function where the value of the variable is replaced with the unified one. 
Rule nine deals with the unification of two complex feature structures and says that the result is the structure obtained by unifying the values of the common attributes of the two structures and then adding all attributes that occur in either of the structures to the result.</Paragraph> <Paragraph position="8"> Figure 2 gives an example that illustrates what modifications must be made to the rewrite rules to be able to handle unification of disjunction. Unifying a disjunction is basically unifying each of its alternatives. But the example also shows what must</Paragraph> <Paragraph position="10"> happen if a variable occurs within the disjunction.</Paragraph> <Paragraph position="11"> The value of the variable is global since it can affect parts of the structure outside the disjunction. Therefore this value must be dependent on which alternative is chosen in the disjunction. This is done by representing the value of the variable as a new disjunction where we only choose the unified value if the alternative $\sigma_1$ is chosen. To express this in the rewrite rules we index all rules by the list of switches that are traversed in the formula. This is expressed by replacing the $\Rightarrow$ with $\Rightarrow_X$ in all rules, where $X$ is a list of the switches passed to reach this point of the unification. We also need to split rule 8 into two rules depending on whether any disjunctions have been passed to reach the variable. The new rules are given below and we assume that the switches occurring in each formula are unique.</Paragraph> <Paragraph position="12"> 8.a $\langle x \wedge s, v \rangle \Rightarrow_{\emptyset} \langle x, v_2 \rangle$ where $\langle v(x) \wedge s, v \rangle \Rightarrow_{\emptyset} \langle s_1, v_1 \rangle$, $x \in X$, $v_2(x) = s_1$ and $v_2 = v_1$ for all other variables 8.b $\langle x \wedge s, v \rangle \Rightarrow_{\sigma_1 \ldots \sigma_m} \langle x, v_2 \rangle$ where $\langle v(x) \wedge s, v \rangle \Rightarrow_{\sigma_1 \ldots \sigma_m} \langle s_1, v_1 \rangle$, $x \in X$, $v_2(x) = \{\sigma_1: \{\sigma_2: \{\ldots \{\sigma_m: s_1 \;\; \sigma_{new_m}: v(x)\} \ldots\} \;\; \sigma_{new_2}: v(x)\} \;\; \sigma_{new_1}: v(x)\}$, $v_2 = v_1$ for all other variables and $\sigma_{new_i}$ is a switch name not used before.</Paragraph> <Paragraph position="13"> 10. $\langle \{\sigma_1:s_{11} \ldots \sigma_n:s_{1n}\} \wedge s, v \rangle \Rightarrow_X \langle \{\sigma_1:s_{21} \ldots \sigma_n:s_{2n}\}, v_n \rangle$ where $\langle s_{1i} \wedge s, v_{i-1} \rangle \Rightarrow_{X\sigma_i} \langle s_{2i}, v_i \rangle$ and $v_0 = v$. In Strömbäck (1991, 1992) these rewrite rules are proved to compute the unification of two formulas.</Paragraph> </Section> <Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> 5 Discussion </SectionTitle> <Paragraph position="0"> The syntax and semantics of the formulas are very similar to what is given in Reape (1991, pp. 35), which is a development of the semantics given in Kasper and Rounds (1986) that allows the use of variables to express equational constraints. The difference is that I use formulas of the form $[f_1:s_1 \ldots f_n:s_n]$ instead of an ordinary conjunction and that we use named disjunction. This restricts the syntax of the formulas somewhat and makes them closer to ordinary feature structures. The restricted syntax is also the reason why we need to include a valuation function in the formulas.</Paragraph> <Paragraph position="1"> It is easy to represent the formulas as ordinary directed acyclic graphs where variables are represented as references to the same substructure in the graphs. If we think of the formulas as graphs it is also easy to compare the rewrite rules 1-9 above with an ordinary graph unification algorithm. Doing this we can conclude that each of the rewrite rules three to nine corresponds to a case in the unification algorithm. The only difference is that when variables are represented as reentrant subgraphs we never have to look up the variable to find its value. 
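
Read as cases of a recursive procedure, rules 3 to 9 can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the Common Lisp implementation mentioned in this paper: the names TOP, BOTTOM, Var and unify are hypothetical, atoms are plain Python values, complex structures are dicts, the valuation is a dict from variable names to structures, an unbound variable is assumed to have the value TOP, and the shortcut for unifying a variable with itself is an addition of the sketch. As in rules 1 to 9, a failure stays inside the result and is not propagated upwards.

```python
from dataclasses import dataclass


class _Const:
    """Hypothetical sentinel type for the two special structures."""

    def __init__(self, label):
        self.label = label

    def __repr__(self):
        return self.label


TOP = _Const("TOP")        # no information
BOTTOM = _Const("BOTTOM")  # failure


@dataclass(frozen=True)
class Var:
    """A variable such as #1; its value lives in the valuation dict."""
    name: str


def unify(s1, s2, v):
    """Return (structure, valuation) for s1 unified with s2 under v."""
    # Rule 7 (with rule 2 folded in): failure absorbs everything.
    if s1 is BOTTOM or s2 is BOTTOM:
        return BOTTOM, v
    # Rule 3 (with rule 2 folded in): TOP unifies with everything.
    if s1 is TOP:
        return s2, v
    if s2 is TOP:
        return s1, v
    # Rule 8: unify the value of the variable, then rebind the variable.
    if isinstance(s1, Var):
        if s1 == s2:
            return s1, v  # assumption of this sketch: x with x is x
        value, v1 = unify(v.get(s1.name, TOP), s2, v)
        v2 = dict(v1)     # copy, so the caller's valuation is untouched
        v2[s1.name] = value
        return s1, v2
    if isinstance(s2, Var):
        return unify(s2, s1, v)  # rule 2: commutativity
    # Rule 9: unify shared attributes, copy the remaining ones.
    if isinstance(s1, dict) and isinstance(s2, dict):
        result = dict(s1)
        for attr, val2 in s2.items():
            if attr in s1:
                result[attr], v = unify(s1[attr], val2, v)
            else:
                result[attr] = val2
        return result, v
    # Rule 4: an atom unifies with itself.
    if not isinstance(s1, dict) and not isinstance(s2, dict) and s1 == s2:
        return s1, v
    # Rules 5 and 6: different atoms, or an atom and a complex structure.
    return BOTTOM, v


# The formula of figure 1, unified with [c: [g: 5]].
fig1 = {"a": {"e": Var("1")}, "b": 3, "c": Var("1")}
val = {"1": {"d": 4}}
structure, new_val = unify(fig1, {"c": {"g": 5}}, val)
print(structure, new_val)
```

In the usage example the new information under c ends up in the value of #1, so it is also visible under the path a e. This is the global effect of variables that rules 8.a and 8.b have to control once disjunctions are present.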
The main advantage of defining unification by a set of rewrite rules is that the procedure can be proved to be correct.</Paragraph> </Section> <Section position="8" start_page="0" end_page="0" type="metho"> <SectionTitle> 6 Detection of failure and improvements </SectionTitle> <Paragraph position="0"> The problem with the rewrite rules is that they sometimes produce formulas which have no model. Such formulas must be detected in order to know when the unification fails. As long as the formulas only contain local disjunction this is not a problem and it is easy to change the rewrite rules in order to propagate a failure to the top level in the formula. The ninth rule is, for example, changed to return $\langle \bot, v_p \rangle$ whenever any of the values of the attributes in the resulting formula is fail.</Paragraph> <Paragraph position="1"> When non-local disjunction is included we must find some way of keeping track of which choices of switches in the disjunctions represent a failure. This can be done by building a tree where the paths represent possible choices of switches and each leaf node contains a value that is false if this choice represents a subset of switches for which the formula has no model and true otherwise. Figure 3 shows an example of a formula and its corresponding choice tree. To reach the leaf b in the tree the switches $\sigma_1$, $\sigma_3$, and $\sigma_n$ have been chosen and $\sigma_2$, $\sigma_4$, and $\sigma_3$ have not. So $\sigma_3$ is both chosen and not chosen and the value of this leaf must be false. Continuing this reasoning for the other paths in the tree we could see that the leaves b, e, and f must have the value false and the other leaves must have the value true. If some value of an alternative is $\bot$ the corresponding leaves in the choice tree must be false. 
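
The bookkeeping described here can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the function names choice_leaves and removable are hypothetical, the tree is flattened into a table with one entry per path, every path is enumerated eagerly (which is exponential in the number of disjunctions), and the three disjunctions in the usage lines are made up to resemble the switch names used in the text, so they are not claimed to be the formula of figure 3.

```python
from itertools import product


def choice_leaves(disjunctions, failed=()):
    """Map every path (one switch picked per disjunction) to True or False.

    disjunctions: list of lists of switch names. A name may occur in
    several disjunctions, which is what makes them non-local.
    failed: switch names whose alternative has the value failure.
    """
    leaves = {}
    for path in product(*disjunctions):
        chosen = set(path)
        consistent = True
        for alternatives, pick in zip(disjunctions, path):
            # Alternatives that were not picked in this disjunction
            # count as "not chosen" on this path.
            rejected = set(alternatives) - {pick}
            if rejected.intersection(chosen):
                # Some switch is both chosen and not chosen.
                consistent = False
        leaves[path] = consistent and not chosen.intersection(failed)
    return leaves


def removable(switch, leaves):
    """True if no leaf with the value True lies on a path choosing switch."""
    return not any(ok for path, ok in leaves.items() if switch in path)


# Hypothetical disjunctions that share the switch s3.
disjunctions = [["s1", "s2"], ["s3", "s4"], ["s3", "sn"]]
leaves = choice_leaves(disjunctions)
print(leaves[("s1", "s3", "sn")])  # s3 is both chosen and not chosen

after_failure = choice_leaves(disjunctions, failed=("s4",))
print([s for s in ("s3", "s4", "sn") if removable(s, after_failure)])
```

A real choice tree would share common path prefixes and, as argued in this section, would only be built when a non-local alternative actually fails.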
If we, for example, assume that the value of $\sigma_4$ is fail we must assign false to the leaves c, f, and g.</Paragraph> <Paragraph position="2"> Choice trees can be built once for each formula and merged during the unification of formulas. A better solution is to only build the choice trees when they are needed, i.e. when a disjunction alternative fails and the disjunction shares some switch name with another disjunction.</Paragraph> <Paragraph position="4"> [Figure 3: a formula and its corresponding choice tree]</Paragraph> <Paragraph position="5"> If this is done we only have to do the expensive work when really needed, which is when we have failure in a non-local disjunction, and we achieve a better performance of the algorithm for all other cases.</Paragraph> <Paragraph position="6"> Strömbäck (1991, 1992) discusses how the choice tree is best used. The papers also discuss how the choice tree can be used to remove failed alternatives from a formula without destroying the interpretation of the formula. The main idea here is to see which switches must be chosen to reach each disjunction alternative in the formula. For this set of switches we find all leaves in the choice tree that can be reached if these switches are chosen. If all these leaves are false the alternative should be removed. For example, if we assume that the value of $\sigma_4$ in figure 3 is fail and that we have assigned false to the corresponding leaves in the choice tree, we can also see that there is no way of reaching a leaf with the value true if we have to choose $\sigma_n$. In this case we can as well remove both $\sigma_4$ and $\sigma_n$ from the feature structure.</Paragraph> <Paragraph position="7"> The two papers mentioned above also discuss various improvements that can be made in order to get a more efficient algorithm. 
Most important here is that we can build only parts of the choice tree and that the notion of switches for a disjunction can be extended to allow sets of switches in order to avoid creating too many new disjunctions.</Paragraph> </Section> <Section position="9" start_page="0" end_page="0" type="metho"> <SectionTitle> 7 Implementation </SectionTitle> <Paragraph position="0"> The algorithm has been implemented in Xerox Common Lisp and is running on the Sun Sparcstations.</Paragraph> </Section></Paper>