<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-1514">
  <Title>Generating XTAG Parsers from Algebraic Specifications</Title>
  <Section position="4" start_page="106" end_page="106" type="metho">
    <SectionTitle>
3 Comparing several parsers for the
XTAG grammar
</SectionTitle>
    <Paragraph position="0"> In this section, we compare several TAG parsing algorithms -- the CYK-based algorithm described in (Vijay-Shanker and Joshi, 1985), Earley-based algorithms with (Alonso et al., 1999) and without (Schabes, 1994) the valid prefix property (VPP), and Nederhof's algorithm (Nederhof, 1999) -- on the XTAG English grammar (release 2.24.2001), using our system and the ideas explained in the previous sections. The schemata for these algorithms without unification support can be found in (Alonso et al., 1999).</Paragraph>
    <Paragraph position="1"> These schemata were extended as described in the previous sections and used as input to our system, which generated the corresponding parsers.</Paragraph>
    <Paragraph position="2"> These parsers were then run on the test sentences shown in table 2, yielding the performance measures (runtime and number of items generated) shown in table 3. Note that the sentences are ordered by minimal runtime.</Paragraph>
    <Paragraph position="3"> As we can see, the execution times are not as good as those we would obtain with Sarkar's XTAG distribution parser, written in C (Sarkar, 2000). This is not surprising, since our parsers have been generated by a generic tool without knowledge of the grammar, while the XTAG parser has been designed specifically for optimal performance on this grammar and uses additional information (such as tree usage frequency data from several corpora, see (XTAG, 2001)).</Paragraph>
    <Paragraph position="4"> However, our comparison allows us to draw conclusions about which parsing algorithms are better suited to the XTAG grammar. In terms of memory usage, CYK is the clear winner, since it generates fewer items than the other algorithms, and a CYK item does not take up more memory than an Earley item.</Paragraph>
    <Paragraph position="5"> On the other hand, if we compare execution times, there is no single best algorithm, since the performance results depend on the size and complexity of the sentences. The Earley-based algorithm with the VPP is the fastest for the first, &quot;easier&quot; sentences, but CYK gives the best results for the more complex sentences. Between these two groups, there are some sentences where the best performance is achieved by the Earley variant without the valid prefix property. Therefore, in practical cases, we should take into account the kind of sentences most likely to be passed to the parser in order to select the best algorithm.</Paragraph>
    <Paragraph position="6"> Nederhof's algorithm is always the one with the slowest execution time, in spite of being an improvement on the VPP Earley parser that reduces worst-case time complexity. This is probably because, when extending the Nederhof schema to support feature structure unification, we get a schema that needs more unification operations than Earley's and has to use items that store several feature structures. Nederhof's algorithm would probably perform better relative to the others if we had used the strategy of parsing without feature structures and then performing unification on the output parse forest.</Paragraph>
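    <Paragraph position="7"> To make the notion of "items generated" used as the memory measure above more concrete, the following is a minimal sketch and not part of the system described here. As a simplifying assumption, it runs a CYK-style deduction over a toy context-free grammar in Chomsky normal form rather than over a TAG, and without feature structures. The function name cyk_items, the item shape (A, i, j) and the toy grammar are all hypothetical and chosen only for illustration.

```python
def cyk_items(words, lexicon, binary_rules):
    """Return the set of items deduced by a CYK-style recognizer.

    An item (A, i, j) states that nonterminal A derives words[i:j].
    The grammar is assumed to be in Chomsky normal form.
    """
    n = len(words)
    items = set()
    # Initial items: one per lexical category of each input word.
    for i, word in enumerate(words):
        for category in lexicon.get(word, ()):
            items.add((category, i, i + 1))
    # Combine adjacent spans bottom-up, shortest spans first.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for parent, left, right in binary_rules:
                    if (left, i, k) in items and (right, k, j) in items:
                        items.add((parent, i, j))
    return items


# Toy grammar (hypothetical, for illustration only).
LEXICON = {"she": ["NP"], "eats": ["V"], "fish": ["NP", "V"]}
BINARY_RULES = [("S", "NP", "VP"), ("VP", "V", "NP")]

if __name__ == "__main__":
    chart = cyk_items(["she", "eats", "fish"], LEXICON, BINARY_RULES)
    # Number of distinct items, and whether the whole sentence is an S.
    print(len(chart), ("S", 0, 3) in chart)
```

    The size of the returned set plays the role of the item count in this toy setting. An Earley-style item would additionally carry a dotted rule position, which is one reason such algorithms can generate more items for the same input.</Paragraph>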
  </Section>
</Paper>