
<?xml version="1.0" standalone="yes"?>
<Paper uid="J94-2002">
  <Title>Tree-Adjoining Grammar Parsing and Boolean Matrix Multiplication</Title>
  <Section position="7" start_page="188" end_page="189" type="concl">
    <SectionTitle>
5. Remarks and Conclusion
</SectionTitle>
    <Paragraph position="0"> Polynomial time reductions between decision/search problems are commonly used in providing hardness results for complexity classes not known to be included in P (P is the class of all languages decidable in deterministic polynomial time). We have studied here a polynomial time reduction between Boolean matrix multiplication and TAG parsing, two problems already known to be in P. However, the choice of the mapping allows one to transfer upper bounds from the first problem to the other. In this way TAG parsing inherits from Boolean matrix multiplication the reputation of being a problem tough to improve. We comment in the following on the significance of this result.</Paragraph>
    <Paragraph position="1"> As already discussed, the notion of the parse forest is an informal one, and there is no common agreement on which specifications such a structure should meet. The obtained results are based on the assumption that a parsing algorithm for TAG should be able to provide a representation for a parse forest such that instances of the parse relation Rp in Definition I can be retrieved in constant time. Whatever the specifications of the output parse forest structure will be, it seems quite reasonable to require that an explicit representation of relation Rp can be extracted from the output in linear time with respect to the size of the output itself, therefore without affecting the overall running time of the method. This requirement is satisfied by all TAG parsers that have been presented to date in the literature.</Paragraph>
    <Paragraph position="2"> As a second point, the studied construction provides an interesting insight into the structure of the TAG parsing problem. We see for instance that the major source of  Computational Linguistics Volume 20, Number 2 complexity derives from cases of properly nested adjunction operations. Such cases are responsible for a bounded amount of nondeterminism in the computation: to detect how a string divides into subparts according to the adjunction of a derived tree into another, we have to consider many possibilities in general, as much as we do to detect a non-null element within a product Boolean matrix. A closer look at the studied construction reveals also that the parsing problem for linear TAG does not seem easier than the general case, since ~ maps instances of BMM to instances of TGP restricted to such a class (a linear TAG is a TAG whose elementary trees allow adjunction only into nodes along a single spine). This contrasts with the related case of context-free grammar parsing, where the restriction of the problem to linear grammars can be solved in time O(I G II w 12) but no method is known for the general case working with this bound. As expected from our result, the techniques that are used for linear context-free grammar parsing cannot be easily generalized to improve the parsing problem for linear TAGs with respect to the general case.</Paragraph>
    <Paragraph position="3"> Finally, we want to discuss here an interesting extension of the studied result.</Paragraph>
    <Paragraph position="4"> The TAG parsing problem can be generalized to cases in which the input is a lattice representation of a string of terminal symbols along with a partially specified parse relation associated with it. This has many applications for ill-formed input and error-correcting parsing. The TAG lattice parsing problem can still be solved in O(I G\[I w I 6) time: the general parsing method provided in Lang (1992) can be used to this purpose, and already known tabular methods for TAG parsing can be easily adapted as well.</Paragraph>
    <Paragraph position="5"> Without giving the technical details of the argument, we sketch here how Boolean matrix multiplication can be related to TAG lattice parsing. For order m matrices, one can use an encoding function f/(n), where n = \[ml/2j + 1, mapping set {1..m} into product set {1..n} x {1..n}. This allows a direct encoding of any instance (A, B) of the BMM problem into a word lattice wl consisting of 6(n + 1) nodes and O(m 2) arcs, where some arcs involve four nodes and represent a derived tree corresponding to a non-null element in either A or B. Then we can use a grammar G in the target instance of the TAG lattice problem that is defined independently of (A, B) and therefore has constant size. (Such a grammar can be obtained from families F~ n) and F~ n) defined in Section 3 by deleting the integer components in each nonterminal symbol.) The construction obtained in this way relates therefore the BMM problem to the fixed grammar parsing problem, and provides a result even stronger than the one presented in Theorem 1. We have in fact that any algorithm for TAG lattice parsing having running time O(IG IP\[w I q) can be converted into an algorithm for Boolean matrix multiplication running in time O(max{m~, malog2(m)}), independently of p. As an example, O(I G IPl w 14) for TAG lattice parsing becomes O(m 2 log2(m)) for matrix multiplication, for any p. Since many tabular methods for TAG parsing can be easily extended to TAG lattice parsing, this means that the chances of getting an 0(I G IPlw 14) time upper bound for the TAG parsing problem itself by means of these techniques are really small.</Paragraph>
  </Section>
class="xml-element"></Paper>