<?xml version="1.0" standalone="yes"?>
<Paper uid="J89-4001">
  <Title>A PARSING ALGORITHM FOR UNIFICATION GRAMMAR</Title>
  <Section position="4" start_page="0" end_page="0" type="intro">
    <SectionTitle>
3 OPERATIONS ON SETS OF RULES AND TERMS
</SectionTitle>
    <Paragraph position="0"> The parser must find the set of ground terms that derive the input string and check whether the start symbol is one of them. We have taken the rules of a unification grammar as an abbreviation for the set of all their ground instances. In the same way, the parser will use sets of terms and rules containing variables as a representation for sets of ground terms and ground rules. In this section we show how various functions needed for parsing can be computed using this representation.</Paragraph>
    <Paragraph position="1"> A grammatical expression, or g-expression, is either a term of L, the special symbol nil, or a pair of g-expressions. The letters u, v, w, x, y, and z denote g-expressions, and X, Y, and Z denote sets of g-expressions. We use the usual LISP functions and predicates to describe g-expressions. [x y] is another notation for cons(x,y). For any substitution s, s(cons(x,y)) = cons(s(x),s(y)) and s(nil) = nil. A selector is a function from g-expressions to g-expressions formed by composition from the functions car, cdr, and identity.</Paragraph>
    <Paragraph position="2"> Thus a selector picks out a subexpression from a g-expression. A constructor is a function that maps two g-expressions to a g-expression, formed by composition from the functions cons, car, cdr, nil, (λ x y. x), and (λ x y. y). A constructor builds a new g-expression from parts of two given g-expressions. A g-predicate is a function from g-expressions to Booleans formed by composition from the basic functions car, cdr, (λ x. x), consP, and null.</Paragraph>
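The definitions above can be made concrete with a minimal Python sketch. It is an illustration only, not the paper's implementation: the names Pair, NIL, compose, cadr, build, and cdr_is_nil are assumptions introduced here, and terms of L are stood in for by plain strings.

```python
from dataclasses import dataclass

NIL = None  # stands in for the special symbol nil (an assumption of this sketch)

@dataclass(frozen=True)
class Pair:
    """A pair of g-expressions, written [x y] or cons(x,y) in the text."""
    head: object
    tail: object

def cons(x, y):
    return Pair(x, y)

def car(x):
    return x.head  # raises AttributeError when x is not a pair, i.e. car is undefined

def cdr(x):
    return x.tail  # raises AttributeError when x is not a pair

def identity(x):
    return x

def consp(x):
    return isinstance(x, Pair)

def null(x):
    return x is NIL

def compose(*fs):
    """compose(f, g)(x) computes f(g(x)); the functions apply right to left."""
    def composed(x):
        for f in reversed(fs):
            x = f(x)
        return x
    return composed

# A selector: the car of the cdr.
cadr = compose(car, cdr)
# A constructor: combine the car of x with the cdr of y.
build = lambda x, y: cons(car(x), cdr(y))
# A g-predicate: is the cdr of the argument nil?
cdr_is_nil = compose(null, cdr)

expr = cons("np", cons("vp", NIL))
print(cadr(expr))                             # vp
print(build(cons("A", "B"), cons("B", "C")))  # Pair(head='A', tail='C')
print(cdr_is_nil(cons("A", NIL)))             # True
```

Selectors and g-predicates are partial: applying car or cdr to a non-pair raises an exception here, which plays the role of "undefined" in the text.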
    <Paragraph position="3"> Let ground(X) be the set of ground instances of g-expressions in X. If f is a selector function, let f(X) be the set of all f(x) such that x ∈ X. If p is a g-predicate, let separate(p,X) be the set of all x ∈ X such that p(x). The following lemmas are easily established from the definition of s(x) for a g-expression x.</Paragraph>
    <Paragraph position="4"> Lemma 2.2. If f is a selector function, f(ground(X)) = ground(f(X)).</Paragraph>
    <Paragraph position="5"> Lemma 2.3. If p is a g-predicate, separate(p,ground(X)) = ground(separate(p,X)).</Paragraph>
    <Paragraph position="6"> Lemma 2.4. ground(X ∪ Y) = ground(X) ∪ ground(Y).</Paragraph>
    <Paragraph position="7"> Lemma 2.5. If x is a ground term, x ∈ ground(X) iff x is an instance of some y ∈ X.</Paragraph>
    <Paragraph position="8"> Lemma 2.6. ground(X) is empty iff X is empty.</Paragraph>
    <Paragraph position="9"> Proof. A nonempty set of terms must have a nonempty set of ground instances, because every variable belongs to a sort and every sort includes at least one ground term.</Paragraph>
    <Paragraph position="10"> These lemmas tell us that if we use sets X and Y of terms to represent the sets ground(X) and ground(Y) of ground terms, we can easily construct representations for f(ground(X)), separate(p,ground(X)), and ground(X) ∪ ground(Y). Also we can decide whether a given ground term is contained in ground(X) and whether ground(X) is empty. All these operations will be needed in the parser.</Paragraph>
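As a rough check of these lemmas, the sketch below enumerates ground instances over one small finite sort and tests Lemmas 2.2, 2.4, 2.5, and 2.6 on a single example. The encoding is an assumption made for illustration: variables are capitalized strings, a term such as a(F) is the tuple ("a", "F"), a pair [x y] is the tuple ("cons", x, y), and every variable ranges over the hypothetical sort SORT.

```python
from itertools import product

SORT = ("m", "n")  # assumption: every variable ranges over this one finite sort

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def variables(t):
    """Collect the variables occurring in a term or pair."""
    if is_var(t):
        return {t}
    if isinstance(t, tuple):
        found = set()
        for arg in t[1:]:
            found |= variables(arg)
        return found
    return set()

def substitute(s, t):
    """Apply the substitution s (a dict from variables to terms) to t."""
    if is_var(t):
        return s.get(t, t)
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(s, arg) for arg in t[1:])
    return t

def ground(X):
    """All ground instances of the g-expressions in X, by brute-force enumeration."""
    out = set()
    for t in X:
        names = sorted(variables(t))
        for values in product(SORT, repeat=len(names)):
            out.add(substitute(dict(zip(names, values)), t))
    return out

def cdr(t):
    return t[2]  # assumes t is a pair ("cons", x, y)

X = {("cons", ("a", "F"), ("b", "F"))}
Y = {("cons", ("b", "G"), ("c", "G"))}

lemma_2_2 = {cdr(x) for x in ground(X)} == ground({cdr(x) for x in X})
lemma_2_4 = ground(X.union(Y)) == ground(X).union(ground(Y))
lemma_2_5 = (("cons", ("a", "m"), ("b", "m")) in ground(X)
             and ("cons", ("a", "m"), ("b", "n")) not in ground(X))
lemma_2_6 = (not ground(set())) and bool(ground(X))
print(lemma_2_2, lemma_2_4, lemma_2_5, lemma_2_6)  # True True True True
```

The parser itself never enumerates ground instances; the point of the lemmas is that the same answers can be computed on the sets with variables.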
    <Paragraph position="11"> The parser requires one more type of operation, defined as follows.</Paragraph>
    <Paragraph position="12"> Definition. Let f1 and f2 be selectors and g a constructor, and suppose g(x,y) is well defined whenever f1(x) and f2(y) are well defined. The symbolic product defined by f1, f2, and g is the function (λ X Y. { g(x,y) | x ∈ X ∧ y ∈ Y ∧ f1(x) = f2(y) }) where X and Y range over sets of ground g-expressions. Note that f1(x) = f2(y) is considered false if either side of the equation is undefined.</Paragraph>
    <Paragraph position="13"> The symbolic product matches every x in X against every y in Y. If f1(x) equals f2(y), it builds a new structure from x and y using the function g. As an example, suppose X and Y are sets of pairs of ground terms, and we need to find all pairs [A C] such that for some B, [A B] is in X and [B C] is in Y. We can do this by finding the symbolic product with f1 = cdr, f2 = car, and g = (λ x y. cons(car(x), cdr(y))). To see that this is correct, notice that if [A B] is in X and [B C] is in Y, then f1([A B]) = f2([B C]), so the pair g([A B],[B C]) = [A C] must be in the answer set.</Paragraph>
    <!-- Page 222. Computational Linguistics, Volume 15, Number 4, December 1989. Andrew Haas, A Parsing Algorithm for Unification Grammar. -->
    <Paragraph position="14"> A second example: we can find the intersection of two sets of terms by using a symbolic product with f1 = (λ x. x), f2 = (λ x. x), and g = (λ x y. x).</Paragraph>
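The definition and the two examples can be sketched directly on ground sets. This is a minimal illustration under stated assumptions, not the parser's code: pairs are Python 2-tuples, ground terms are strings, and the names symbolic_product, chain, and intersect are introduced here for the example.

```python
def car(x):
    if not isinstance(x, tuple):
        raise ValueError("car is undefined on a non-pair")
    return x[0]

def cdr(x):
    if not isinstance(x, tuple):
        raise ValueError("cdr is undefined on a non-pair")
    return x[1]

def symbolic_product(f1, f2, g):
    """Return h, where h(X, Y) is the set of g(x,y) with x in X, y in Y, f1(x) = f2(y)."""
    def h(X, Y):
        out = set()
        for x in X:
            for y in Y:
                try:
                    matched = f1(x) == f2(y)
                except ValueError:
                    matched = False  # an undefined side makes the equation false
                if matched:
                    out.add(g(x, y))
        return out
    return h

# First example: all [A C] such that [A B] is in X and [B C] is in Y.
chain = symbolic_product(cdr, car, lambda x, y: (car(x), cdr(y)))
X = {("A", "B"), ("D", "E")}
Y = {("B", "C"), ("B", "K")}
print(sorted(chain(X, Y)))  # [('A', 'C'), ('A', 'K')]

# Second example: intersection of two sets of terms.
intersect = symbolic_product(lambda x: x, lambda x: x, lambda x, y: x)
print(sorted(intersect({"p", "q"}, {"q", "r"})))  # ['q']
```

The nested loop is the literal reading of the definition; it only works when X and Y are finite sets of ground g-expressions, which is why the text goes on to compute the same product by unification.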
    <Paragraph position="15"> If X is a set of g-expressions and n an integer, rename(X,n) is an alphabetic variant of X. For all X, Y, m, and n, if m ≠ n then rename(X,n) and rename(Y,m) have no variables in common. The following theorem tells us that if we use sets of terms X and Y to represent the sets ground(X) and ground(Y) of ground terms, we can use unification to compute any symbolic product of ground(X) and ground(Y). We assume the basic facts about unification as in Robinson (1965).</Paragraph>
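    <Paragraph position="16"> Theorem 2.1. If h is the symbolic product defined by f1, f2, and g, and X and Y are sets of g-expressions, then h(ground(X),ground(Y)) = ground({s(g(u,v)) | u ∈ rename(X,1) ∧ v ∈ rename(Y,2) ∧ s is the m.g.u. of f1(u) and f2(v)}).</Paragraph>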
    <Paragraph position="18"> Proof. The first step is to show that if Z and W share no variables, then (1) {g(z,w) | z ∈ ground(Z) ∧ w ∈ ground(W) ∧ f1(z) = f2(w)} = ground({s(g(u,v)) | u ∈ Z ∧ v ∈ W ∧ s is the m.g.u. of f1(u) and f2(v)}). Consider any element of the right side of equation (1). It must be a ground instance of s(g(u,v)), where u ∈ Z, v ∈ W, and s is the m.g.u. of f1(u) and f2(v). Any ground instance of s(g(u,v)) can be written as s'(s(g(u,v))), where s' is chosen so that s'(s(u)) and s'(s(v)) are ground terms. Then s'(s(g(u,v))) = g(s'(s(u)),s'(s(v))) and f1(s'(s(u))) = s'(s(f1(u))) = s'(s(f2(v))) = f2(s'(s(v))).</Paragraph>
    <Paragraph position="20"> Therefore s'(s(g(u,v))) belongs to the set on the left side of equation (1).</Paragraph>
    <Paragraph position="21"> Next consider any element of the left side of (1). It must have the form g(z,w), where z ∈ ground(Z), w ∈ ground(W), and f1(z) = f2(w). Then for some u ∈ Z and v ∈ W, z is a ground instance of u and w is a ground instance of v. Since u and v share no variables, there is a substitution s' such that s'(u) = z and s'(v) = w. Then s'(f1(u)) = f1(s'(u)) = f2(s'(v)) = s'(f2(v)), so there exists a most general unifier s for f1(u) and f2(v), and s' is the composition of s and some substitution s''. Then g(z,w) = g(s'(u),s'(v)) = s'(g(u,v)) = s''(s(g(u,v))).</Paragraph>
    <Paragraph position="23"> Moreover, g(z,w) is a ground term because z and w are ground terms, so g(z,w) is a ground instance of s(g(u,v)) and therefore belongs to the set on the right side of equation (1).</Paragraph>
    <Paragraph position="24"> We have proved that if Z and W share no variables, (2) h(ground(Z),ground(W)) = ground({s(g(u,v)) | u ∈ Z ∧ v ∈ W ∧ s is the m.g.u. of f1(u) and f2(v)}). For any X and Y, rename(X,1) and rename(Y,2) share no variables. Then we can let Z = rename(X,1) and W = rename(Y,2) in formula (2). Since h(ground(X),ground(Y)) = h(ground(rename(X,1)),ground(rename(Y,2))), the theorem follows by transitivity of equality. This completes the proof. □ As an example, suppose X = {[a(F) b(F)]} and Y = {[b(G) c(G)]}. Suppose the variables F and G belong to a sort s that includes just two ground terms, m and n.</Paragraph>
    <Paragraph position="25"> We wish to compute the symbolic product of ground(X) and ground(Y), using f1 = cdr, f2 = car, and g = (λ x y.</Paragraph>
    <Paragraph position="26"> cons(car(x), cdr(y))) (as in our previous example).</Paragraph>
    <Paragraph position="27"> ground(X) equals {[a(m) b(m)],[a(n) b(n)]} and ground(Y) equals {[b(m) c(m)],[b(n) c(n)]}, so the symbolic product is {[a(m) c(m)],[a(n) c(n)]}. We will verify that the unification method gets the same result. Since X and Y share no variables, we can skip the renaming step. Let x = [a(F) b(F)] and y = [b(G) c(G)]. Then f1(x) = b(F), f2(y) = b(G), and the most general unifier is the substitution s that replaces F with G. Then g(x,y) = [a(F) c(G)] and s(g(x,y)) = [a(G) c(G)]. The set of ground instances of this g-expression is {[a(m) c(m)], [a(n) c(n)]}, as desired.</Paragraph>
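The worked example can be replayed with a small unifier. The sketch below is one possible reading of Theorem 2.1, not the paper's implementation. Its encoding is assumed for illustration: variables are capitalized strings, a(F) is the tuple ("a", "F"), and a pair [x y] is ("cons", x, y). The helper names walk, occurs, unify, apply, and product_by_unification are hypothetical. Because the example uses a single sort, the unifier does no sort checking, which a full treatment of sorted variables would need.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    """Follow variable bindings in s until t is unbound or a non-variable."""
    while is_var(t) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    t = walk(t, s)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, arg, s) for arg in t[1:])

def unify(a, b, s):
    """Return a most general unifier extending s, or None if a and b do not unify."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return None if occurs(a, b, s) else {**s, a: b}
    if is_var(b):
        return unify(b, a, s)
    if isinstance(a, tuple) and isinstance(b, tuple) and a[0] == b[0] and len(a) == len(b):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def apply(s, t):
    """Apply the substitution s to t, resolving chains of bindings."""
    t = walk(t, s)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply(s, arg) for arg in t[1:])
    return t

def rename(t, n):
    """Tag every variable with n so differently numbered copies share no variables."""
    if is_var(t):
        return f"{t}_{n}"
    if isinstance(t, tuple):
        return (t[0],) + tuple(rename(arg, n) for arg in t[1:])
    return t

def car(t):
    return t[1]  # assumes t is a pair ("cons", x, y)

def cdr(t):
    return t[2]  # assumes t is a pair ("cons", x, y)

def product_by_unification(f1, f2, g, X, Y):
    out = set()
    for u in {rename(x, 1) for x in X}:
        for v in {rename(y, 2) for y in Y}:
            s = unify(f1(u), f2(v), {})
            if s is not None:
                out.add(apply(s, g(u, v)))
    return out

X = {("cons", ("a", "F"), ("b", "F"))}
Y = {("cons", ("b", "G"), ("c", "G"))}
g = lambda x, y: ("cons", car(x), cdr(y))
print(product_by_unification(cdr, car, g, X, Y))
# {('cons', ('a', 'G_2'), ('c', 'G_2'))}
```

The result is the single g-expression [a(G) c(G)] up to the renaming of G to G_2, so its ground instances over the sort {m, n} are the two pairs computed by hand in the text.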
    <Paragraph position="28"> Definition. Let f be a function from sets of g-expressions to sets of g-expressions, and suppose that when X ⊆ X' and Y ⊆ Y', f(X,Y) ⊆ f(X',Y'). Then f is monotonic.</Paragraph>
    <Paragraph position="29"> All symbolic products are monotonic functions, as the reader can easily show from the definition of symbolic products. Indeed, every function in the parser that returns a set of g-expressions is monotonic.</Paragraph>
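The reason symbolic products are monotonic is short: any g(x,y) in the product of X and Y is witnessed by some x in X and y in Y with f1(x) = f2(y), and the same witnesses lie in any larger sets X' and Y'. The sketch below spot-checks this on one example. It is a check, not a proof, and it assumes pairs encoded as Python 2-tuples and selectors that are always defined; the names first, second, and chain are introduced here.

```python
def symbolic_product(f1, f2, g):
    """Ground-set symbolic product; assumes f1 and f2 are defined on every element."""
    def h(X, Y):
        return {g(x, y) for x in X for y in Y if f1(x) == f2(y)}
    return h

first = lambda p: p[0]
second = lambda p: p[1]
chain = symbolic_product(second, first, lambda x, y: (first(x), second(y)))

X = {("A", "B")}
Y = {("B", "C")}
X_big = X.union({("D", "B")})   # X is a subset of X_big
Y_big = Y.union({("B", "K")})   # Y is a subset of Y_big

small = chain(X, Y)
big = chain(X_big, Y_big)
print(small.issubset(big))   # True
print(len(small), len(big))  # 1 4
```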
  </Section>
</Paper>