<?xml version="1.0" standalone="yes"?>
<Paper uid="J94-2002">
  <Title>Tree-Adjoining Grammar Parsing and Boolean Matrix Multiplication</Title>
  <Section position="5" start_page="184" end_page="186" type="metho">
    <SectionTitle>
δ ∈ {i, k, j} and 1 ≤ h ≤ 3, denote integers f_h(δ), which indicate either positions within each
</SectionTitle>
    <Paragraph position="0"> single slice or components of nonterminal symbols labeling tree nodes.</Paragraph>
    <Paragraph position="1"> of matrix C. In this case also, the most significant digits associated with the retrieved indices are encoded within the nonterminal symbols of the auxiliary tree γ(u, v), while the two least significant digits are encoded by the positions of the yield boundaries of the string derived from γ(u, v), consistently with the input string w. To conclude our previous example, we see that if we apply the relations in Definition 4 to the derived tree at the bottom of Figure 7b, we get indices 2 and 7 of the only non-null element in C.</Paragraph>
    <Paragraph position="2"> The following result shows that any algorithm for the solution of a generic instance of TGP can be converted into an algorithm for the solution of the BMM problem, via the computation of maps 𝒯 and 𝒢. This concludes the present section.</Paragraph>
    <Paragraph position="3"> Lemma 1 Let (A, B) be an instance of BMM and let (G, w) = 𝒯((A, B)). Let also Rp be the parse relation that solves (G, w). Then we have A × B = 𝒢(Rp).</Paragraph>
    <Paragraph position="5"> Proof Assume that m is the order of the matrices A and B, n is the natural number associated with m as in Definition 3, and σ = n + 1. Let C = A × B and C' = 𝒢(Rp).</Paragraph>
    <Paragraph position="6"> To prove that c_ij = 1 implies c'_ij = 1, we go through a sentential derivation of w in G and then apply the definition of 𝒢. If c_ij = 1, then there exists k, 1 ≤ k ≤ m, such that a_ik = b_kj = 1. Let γ1 and γ2 be the unique auxiliary trees in G associated by map 𝒯 with entries a_ik and b_kj, respectively. Tree γ1 has root (and foot node) labeled by nonterminal (A, f1(i), f1(k)); furthermore, the terminal symbols in the yield of γ1 are (from left to right) d_{f3(i)}, d_{σ+f3(k)}, d_{4σ+f2(k)+1} and d_{5σ+f2(i)}. The only pair of substrings of w that γ1 can derive, by means of zero or more adjunctions of trees in Γ5^(n) and Γ6^(n), is (w[f3(i)..σ+f3(k)], w[4σ+f2(k)+1..5σ+f2(i)]). Call γ'1 a parse tree associated with such a derivation (see Figure 8). In a similar way, auxiliary tree γ2 has root labeled by nonterminal (B, f1(k), f1(j)) and derives the pair (w[σ+f3(k)+1..2σ+f3(j)], w[3σ+f2(j)..4σ+f2(k)]) of substrings of w. Call γ'2 a parse tree associated with the derivation (see again Figure 8).</Paragraph>
    <Paragraph position="7"> According to step (iii) in Definition 3, grammar G also includes auxiliary trees γ3 = γ(f1(i), f1(k), f1(j)) ∈ Γ3^(n) and γ4 = γ(f1(i), f1(j)) ∈ Γ4^(n). Note that the yields of trees γ'1 and γ'2 are exactly nested within w; moreover, the root (and the foot) nodes of γ1 and γ2 have been preserved in the derivation. Therefore γ'1 and γ'2 can be adjoined into γ3, and the resulting tree γ'3 can in turn be adjoined into γ4. In this way, γ4 derives the pair of substrings of w (w[f3(i)..2σ+f3(j)], w[3σ+f2(j)..5σ+f2(i)]). Call γ'4 the resulting derived tree (see Figure 9). Since derived tree γ'4 can be adjoined in G in such a way that a tree can eventually be derived for the input string w, we have Rp(γ4, f3(i), 2σ+f3(j), 3σ+f2(j), 5σ+f2(i)), and from the definition of 𝒢 we get c'_ij = 1.</Paragraph>
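The boundary arithmetic in this direction of the proof can be checked mechanically. Below is a minimal Python sketch, assuming σ = n + 1 and hypothetical sample values for f2 and f3 at indices i, j, k (the variable names are illustrative, not from Definition 3); spans are inclusive (start, end) pairs:

```python
# Check that the substring pairs derived by gamma_1 and gamma_2 tile exactly
# into the pair derived by gamma_4, as claimed in the proof of Lemma 1.
n = 4
sigma = n + 1  # slice width, sigma = n + 1 (assumption of this sketch)

f2 = {'i': 2, 'j': 3, 'k': 1}  # hypothetical values of f2 at i, j, k
f3 = {'i': 4, 'j': 1, 'k': 2}  # hypothetical values of f3 at i, j, k

# Substring pairs, written as inclusive (start, end) position spans in w.
g1_left  = (f3['i'], sigma + f3['k'])
g2_left  = (sigma + f3['k'] + 1, 2 * sigma + f3['j'])
g2_right = (3 * sigma + f2['j'], 4 * sigma + f2['k'])
g1_right = (4 * sigma + f2['k'] + 1, 5 * sigma + f2['i'])

# gamma_2's pair is exactly nested inside gamma_1's pair ...
assert g1_left[1] + 1 == g2_left[0]
assert g2_right[1] + 1 == g1_right[0]

# ... and the combined yield is precisely the pair derived by gamma_4.
g4_left  = (g1_left[0], g2_left[1])
g4_right = (g2_right[0], g1_right[1])
assert g4_left  == (f3['i'], 2 * sigma + f3['j'])
assert g4_right == (3 * sigma + f2['j'], 5 * sigma + f2['i'])
```

The same adjacency holds for any values of f2 and f3 in {1..n}, which is what makes the nesting argument independent of the particular matrix entries.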
    <Section position="1" start_page="186" end_page="186" type="sub_section">
      <SectionTitle>
Giorgio Satta Tree-Adjoining Grammar Parsing
</SectionTitle>
      <Paragraph position="0"> Assuming c'_ij = 1, we now prove c_ij = 1; this is done by arguing that the only sentential derivations for w that are allowed by G are those of the kind outlined above. From the definition of 𝒢 we have that Rp(γ4, f3(i), 2σ+f3(j), 3σ+f2(j), 5σ+f2(i)) holds for the auxiliary tree γ4 = γ(f1(i), f1(j)) ∈ Γ4^(n). Equivalently, there exists at least one derivation from γ4 of the pair of strings (x, y) = (w[f3(i)..2σ+f3(j)], w[3σ+f2(j)..5σ+f2(i)])</Paragraph>
      <Paragraph position="2"> that participates in a sentential derivation of w. Fix such a derivation.</Paragraph>
      <Paragraph position="3"> We first observe that, in order to derive any terminal symbol from γ4, auxiliary trees in Γ1^(n), Γ2^(n) and Γ3^(n) must be used. Any tree in Γ1^(n) can only derive symbols in slices w^(h), h ∈ {1, 2, 5, 6}, and any tree in Γ2^(n) can only derive symbols in slices w^(h), h ∈ {2, 3, 4, 5}. Therefore at least one tree in Γ1^(n) and at least one tree in Γ2^(n) must be used in the derivation of (x, y), since (x, y) includes terminal symbols from every slice of w. Furthermore, if more than one tree in Γ1^(n) is used in a derivation in G, the resulting string cannot match w. The same argument applies to trees in Γ2^(n).</Paragraph>
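The slice-membership claim above is pure arithmetic on positions. A small sketch, assuming slice h of w covers positions (h-1)σ+1 through hσ with σ = n + 1 (a layout consistent with the boundary indices used in the proof, not quoted from Definition 3):

```python
n = 4
sigma = n + 1  # assumed slice width

def slice_of(p):
    # slice h covers positions (h-1)*sigma + 1 .. h*sigma of w
    return (p - 1) // sigma + 1

# Boundary positions of the forms used in the proof, for every admissible
# value a in 1..n of f2(.) or f3(.): each lands in the expected slice.
for a in range(1, n + 1):
    assert slice_of(a) == 1                    # f3(i)            -> slice 1
    assert slice_of(sigma + a) == 2            # sigma + f3(k)    -> slice 2
    assert slice_of(2 * sigma + a) == 3        # 2sigma + f3(j)   -> slice 3
    assert slice_of(3 * sigma + a) == 4        # 3sigma + f2(j)   -> slice 4
    assert slice_of(4 * sigma + a) == 5        # 4sigma + f2(k)   -> slice 5
    assert slice_of(4 * sigma + a + 1) == 5    # 4sigma+f2(k)+1   -> slice 5
    assert slice_of(5 * sigma + a) == 6        # 5sigma + f2(i)   -> slice 6
```

So trees whose boundary indices have the Γ1^(n) form only touch slices {1, 2, 5, 6}, and those with the Γ2^(n) form only touch slices {2, 3, 4, 5}, as the paragraph states.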
      <Paragraph position="4"> We must then conclude that exactly one tree in Γ1^(n), one tree in Γ2^(n), and one tree in Γ3^(n) have been used in the derivation of (x, y) from γ4. Call the above trees γ1 = γ(p, k3, k2+1, s, u, k1) ∈ Γ1^(n), γ2 = γ(k'3+1, q, r, k'2, k'1, v) ∈ Γ2^(n), and γ3 = γ(u', t, v') ∈ Γ3^(n). As a second step, we observe that γ3 can be adjoined into γ4 only if u' = f1(i) and v' = f1(j), and that γ3 can host γ1 and γ2 just in case u' = u, v' = v, and k1 = k'1 = t. We also observe that, after these adjunctions take place, the leftmost terminal symbol in the yield of γ4 will be the leftmost terminal symbol in the yield of γ1, that is d_p. From relation (2) we then conclude that p = f3(i). Similarly, we can argue that q = 2σ+f3(j), r = 3σ+f2(j) and s = 5σ+f2(i). Finally, adjunction of γ1 and γ2 into γ3 can match w just in case k3 = k'3 and k2 = k'2.</Paragraph>
      <Paragraph position="5"> From the relations inferred above, we conclude that we can rewrite γ1 as γ(f3(i), k3, k2+1, 5σ+f2(i), f1(i), k1) ∈ Γ1^(n) and γ2 as γ(k3+1, 2σ+f3(j), 3σ+f2(j), k2, k1, f1(j)) ∈ Γ2^(n). Since f is one-to-one and k2, k3 ∈ {1..n}, there exists k such that f(k) = (k1, k2, k3). From steps (i) and (ii) in Definition 3, we then have that a_ik and b_kj are non-null, and then c_ij = 1. □</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="186" end_page="188" type="metho">
    <SectionTitle>
4. Computational Consequences
</SectionTitle>
    <Paragraph position="0"> The results presented in the previous section are developed here from a computational perspective. Some interesting computational consequences are then drawn for the tree-adjoining grammar parsing problem. The following analysis assumes the random-access machine as the model of computation.</Paragraph>
    <Section position="1" start_page="186" end_page="187" type="sub_section">
      <SectionTitle>
4.1 Transferring of Time Upper Bounds
</SectionTitle>
      <Paragraph position="0"> We show in the following how time upper bounds for the TGP problem can be transferred to time upper bounds for the BMM problem using the commutative diagram studied in the previous section.</Paragraph>
      <Paragraph position="1"> Let (A, B) be an instance of BMM and let (G, w) = 𝒯((A, B)); m and n are specified as in Definition 3. Observe that, since n⁶ ≥ m, function f^(n) maps set {1..m} into the product set {1..n⁴} × {1..n} × {1..n}; in other words, we have 1 ≤ f1(i) ≤ n⁴ and 1 ≤ f2(i), f3(i) ≤ n for 1 ≤ i ≤ m.
Computational Linguistics Volume 20, Number 2
From the definition of 𝒯, we see that G contains O(m²) auxiliary trees from each of the classes Γ1^(n), Γ2^(n) and Γ3^(n). This determines the size of G, and we have |(G, w)| = O(m²), since |w| = O(n). Each auxiliary tree introduced in G at steps (i) and (ii) of Definition 3 requires the computation of a constant number of instances of function f^(n) on some integer i, 1 ≤ i ≤ m. Such a computation can be carried out in time O(log²(m)) using standard algorithms for integer division. Summing up, the entire computation of 𝒯 on an instance (A, B) takes time O(m² log²(m)). Let Rp be the parse relation that solves (G, w) = 𝒯((A, B)). From Definition 1 and the above observations we have that |Rp| = O(m²n⁴), that is, |Rp| = O(m^(2+2/3)). We can compute C = 𝒢(Rp) in the following way. For every element c_ij we compute f^(n)(i) and f^(n)(j) and then check Rp according to Definition 4. (Recall also our assumption that an instance of Rp can be tested in constant time.) Again we find that the entire computation takes time O(m² log²(m)). We observe that the computation of 𝒯 and 𝒢 takes an amount of time (asymptotically) very close to the one needed to store (A, B) or C.</Paragraph>
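Definition 3 is not reproduced in this excerpt, but any one-to-one map with the stated ranges works. The following is a hypothetical instance (an assumption, not the paper's construction) that writes i - 1 in base n and needs only a constant number of integer divisions per call, as the paragraph above requires:

```python
def f(i, n):
    """One concrete one-to-one map {1..m} -> {1..n^4} x {1..n} x {1..n},
    assuming n^6 >= m. This is an illustrative stand-in for the map f^(n)
    of Definition 3, which this excerpt does not reproduce."""
    assert 1 <= i <= n ** 6
    t = i - 1
    f3 = t % n             # least significant base-n digit
    f2 = (t // n) % n      # next base-n digit
    f1 = t // (n * n)      # remaining four digits, hence < n^4
    return (f1 + 1, f2 + 1, f3 + 1)

# Sanity checks for a small n: ranges and injectivity over all of {1..n^6}.
n = 2
images = {f(i, n) for i in range(1, n ** 6 + 1)}
assert len(images) == n ** 6  # one-to-one
assert all(1 <= a <= n ** 4 and 1 <= b <= n and 1 <= c <= n
           for (a, b, c) in images)
```

Each call performs O(1) divisions on O(log m)-bit integers, matching the O(log²(m)) per-tree cost cited above for schoolbook integer division.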
      <Paragraph position="2"> As a consequence of the above discussion and of Lemma 1, we have that any time upper bound for the TGP problem can be transferred to an upper bound for the BMM problem, down to the time needed for the computation of transformations 𝒯 and 𝒢.</Paragraph>
      <Paragraph position="3"> The following statement gives an example.</Paragraph>
      <Paragraph position="4"> Theorem 1 Let Ap be an algorithm for the solution of the TGP problem having running time O(|G|^p |w|^q). Then any instance of BMM can be solved in time O(max{m^(2p+q/6), m² log²(m)}), where m is the order of the input matrices.</Paragraph>
      <Paragraph position="5"> Proof From Lemma 1 and from the previous discussion we have that two Boolean matrices of order m can be multiplied in time O(|G|^p |w|^q + m² log²(m)), where |G| = O(m²) and |w| = O(m^(1/6)). □ Observe that, according to our definition, the TGP problem has a trivial time lower bound O(|Rp|), since this is the amount of time needed in the worst case to store a representation for Rp that can be accessed in constant time. In practice this means that the upper bound transfer stated by the above result is effective down to O(m^(2+2/3)).</Paragraph>
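The exponent in Theorem 1 follows by substituting |G| = O(m²) and |w| = O(m^(1/6)) into O(|G|^p |w|^q). A quick arithmetic check of the instances discussed in Section 4.2, tracking exponents only (a sketch, with exact rational arithmetic to avoid rounding):

```python
from fractions import Fraction

def bmm_exponent(p, q):
    # |G|^p |w|^q with |G| = O(m^2) and |w| = O(m^(1/6))
    # gives running time m^(2p + q/6).
    return 2 * p + Fraction(q, 6)

assert bmm_exponent(1, 6) == 3                # O(|G||w|^6) -> O(m^3)
assert bmm_exponent(1, 5) == Fraction(17, 6)  # O(|G||w|^5) -> O(m^2.83...)
assert bmm_exponent(1, 4) == Fraction(8, 3)   # O(|G||w|^4) -> O(m^2.66...)
```

These are exactly the three transfer instances the next section appeals to.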
    </Section>
    <Section position="2" start_page="187" end_page="188" type="sub_section">
      <SectionTitle>
4.2 Time Upper Bounds for TGP
</SectionTitle>
      <Paragraph position="0"> In previous sections we have related the complexity of tree-adjoining grammar parsing to the complexity of Boolean matrix multiplication. Here we speculate on the consequences of the presented result.</Paragraph>
      <Paragraph position="1"> As a computational problem, Boolean matrix multiplication has been an object of investigation for many years. Researchers have tried to improve upon the well-known O(m³) time upper bound, m the order of the input matrices, and methods have been found that work asymptotically faster than the standard cubic-time algorithm. Strassen's divide-and-conquer algorithm, which runs in time O(m^(2.81)) (see for instance Cormen, Leiserson, and Rivest [1990]), was the first in the series, and the best time upper bound known to date is approximately O(m^(2.376)), as reported in Coppersmith and Winograd (1990). It is worth noting here that the closer researchers have come to the O(m²) trivial time lower bound, the more complex the computation involved in these methods has become. In fact, while Strassen's algorithm outperforms the O(m³) standard algorithm only for input matrices of order greater than 45 or so (see again Cormen, Leiserson, and Rivest [1990]), recently discovered methods that are asymptotically faster are definitely prohibitive, given current computer hardware. At present, no straightforward method is known for Boolean matrix multiplication that considerably improves the cubic upper bound and that can be used in practical cases. Also, there is enough evidence that, if such a method exists, its discovery would be a very difficult enterprise.</Paragraph>
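For reference, the standard cubic-time algorithm mentioned above is the direct computation c_ij = OR over k of (a_ik AND b_kj). A minimal sketch, with matrices as lists of 0/1 rows:

```python
def bmm_naive(A, B):
    """Standard O(m^3) Boolean matrix multiplication:
    c_ij = 1 iff a_ik = b_kj = 1 for some k."""
    m = len(A)
    return [[1 if any(A[i][k] and B[k][j] for k in range(m)) else 0
             for j in range(m)]
            for i in range(m)]

# Small usage example: multiplying by a permutation matrix swaps columns.
A = [[1, 0], [0, 1]]
B = [[0, 1], [1, 0]]
assert bmm_naive(A, B) == [[0, 1], [1, 0]]
```

This is the baseline that both the reduction of Lemma 1 and the sub-cubic methods above are measured against.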
      <Paragraph position="2"> Let us now turn to the TAG parsing problem. Many algorithms have been proposed for its solution, and an O(|G| |I ∪ A| |w|⁶) time upper bound has been given in the literature; see for instance Schabes (1990). We remark here that the dependency on the grammar size can be further improved using techniques similar to the one proposed in Graham, Harrison, and Ruzzo (1980) for the context-free grammar recognition/parsing problem: this results in an O(|G| |w|⁶) time upper bound for the general case. Theorem 1 can be used to transfer this upper bound to an upper bound for Boolean matrix multiplication, finding the already mentioned O(m³) result.</Paragraph>
      <Paragraph position="3"> More interestingly, Theorem 1 implies that any method for the solution of the tree-adjoining grammar parsing problem having running time O(|G| |w|⁵) will give us a method for Boolean matrix multiplication having running time O(m^(2.83)). Likewise, any O(|G| |w|⁴) time method for the former problem will result in an O(m^(2+2/3)) time method for Boolean matrix multiplication. Even if the constants hidden in the studied construction are large, the resulting methods would still be competitive with known methods for Boolean matrix multiplication that improve the cubic time upper bound. We conclude then that the TAG parsing problem should also be considered as having the status of a problem that is "difficult" to improve upon, and we have enough evidence to think that methods for TAG parsing that are asymptotically faster than O(|G| |w|⁶) are unlikely to be practical, i.e., will involve rather complex computations.</Paragraph>
    </Section>
  </Section>
</Paper>