<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-1509">
  <Title>Head-Driven Generation and Indexing in ALE</Title>
  <Section position="8" start_page="67" end_page="68" type="evalu">
    <SectionTitle>
7 Results and Future Work
</SectionTitle>
    <Paragraph position="0"> Compilation of control code for head-driven generation, as outlined in Section 4, improves generation performance by a factor of about 5 on three feature-based grammars we have written and tested. The use of our indexing code independently improves generation speed by a factor of roughly 3. The combined compile-time cost for producing and compiling the control and indexing code is a factor of about 1.5. Taken as a function of maximum chain length (also declared by the user), generation is, of course, always slower with larger maxima; but performance degrades somewhat more rapidly with indexed generation than with non-indexed, and more rapidly still with compiled generation than with interpreted. In our experience, the factor of improvement decreases no worse than logarithmically with respect to maximum chain length in either case.</Paragraph>
    <Paragraph position="1"> There are several directions in which our approach could be improved. The most important is the use of a better decision-tree growing method such as impurity-based classification ((Qui83; Utg88;  Cho91)) or concept clustering over lexical entries ((CR92)). Our current approach only guarantees that semantics-related paths are favoured over unrelated ones, and reduces redundant unifications when compared with naive lookup in a table of feature structures. What is needed is a arrangement of nodes which minimizes the average length of traversal to a failed match, in order to prune search as soon as possible. For generation with fixed large-scale grammars, this could also involve a training phase over a corpus to refine the cost estimate based on a lexical entry's frequency. This direction is pursued further in (Pen97).</Paragraph>
    <Paragraph position="2"> One could also explore the use of memoization for generation, to avoid regeneration of substrings, such as the &amp;quot;chart-based&amp;quot; generator of (Shi88), which was originally designed for a bottom-up generator. The best kind of memoization for a semantically driven generator would be one in which a substring could be reused at any position of the final string, possibly by indexing semantics values which could be checked for subsumption against later goals.</Paragraph>
    <Paragraph position="3"> Another direction is the incorporation of this strategy into a typed feature-based abstract machine, such as the ones proposed in (Qu94; Win96).</Paragraph>
    <Paragraph position="4"> Abstract machines allow direct access to pointers and stack and heap structures, which can be used to make the processing outlined here even more efficient, at both compile-time and run-time. They can also be used to perform smarter incremental compilation, which is very important for large-scale grammar development. This direction is also considered in (Pen97).</Paragraph>
  </Section>
class="xml-element"></Paper>