File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/00/a00-2022_metho.xml

Size: 18,306 bytes

Last Modified: 2025-10-06 14:07:02

<?xml version="1.0" standalone="yes"?>
<Paper uid="A00-2022">
  <Title>O~ Proand Retroactive Packing I \] o passive edges \</Title>
  <Section position="5" start_page="163" end_page="165" type="metho">
    <SectionTitle>
3 Ambiguity Packing in the Parser
</SectionTitle>
    <Paragraph position="0"> Moore and Alshawi (1992) and Carroll (1993) have investigated local ambiguity packing for unification grammars with CF backbones, using CF category equality and feature structure subsumption to test if a newly derived constituent can be packed. If a new constituent is equivalent to or subsumed by an existing constituent, then it can be packed into the existing one and will take no further part in processing. However, if the new constituent subsumes an existing one, the situation is not so straightforward: either (a) no packing takes place and the new constituent forms a separate edge (Carroll, 1993), or (b) previous processing involving the old constituent is undone or invalidated, and it is packed into the new one (Moore &amp; Alshawi, 1992; however, it is unclear whether they achieve maximal compactness in practice: see Table 1). In the former case the parse forest produced will not be optimally compact; in the latter it will be, but maintaining chart consistency and parser correctness becomes a non-trivial problem. Packing of a new edge into an existing one we call proactive (or forward) packing; for the more complex situation involving a new edge subsuming an existing one we introduce the term retroactive (or backward) packing.</Paragraph>
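    <Paragraph> As an illustration of the two packing directions, the following Python sketch classifies a new constituent against an existing one. It assumes, purely for exposition, that categories are nested dictionaries without reentrancies and that subsumption is plain feature-wise inclusion; the names subsumes and packing_relation are hypothetical and do not belong to any of the systems discussed here.

```python
def subsumes(general, specific):
    """Return True if every constraint in `general` also holds in `specific`.

    Assumption: feature structures are nested dicts without reentrancies,
    so subsumption reduces to recursive feature-wise inclusion.
    """
    for feature, value in general.items():
        if feature not in specific:
            return False
        other = specific[feature]
        if isinstance(value, dict) and isinstance(other, dict):
            if not subsumes(value, other):
                return False
        elif value != other:
            return False
    return True


def packing_relation(new, old):
    """Classify how `new` relates to the existing constituent `old`."""
    if subsumes(old, new):
        # `old` is equivalent to or more general than `new`.
        return "proactive"
    if subsumes(new, old):
        # `new` is strictly more general than `old`.
        return "retroactive"
    return "none"
```

Under these assumptions, a new category {"CAT": "np"} stands in the retroactive relation to an existing {"CAT": "np", "NUM": "sg"}, whereas the reverse pairing is an ordinary proactive case.</Paragraph>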
    <Paragraph position="1"> Several issues arise when packing an old edge (old) into one that was newly derived (new) retroactively: (i) everything derived from old (called derivatives of old in the following) must be invalidated and excluded from further processing (as new is known to generate more general derivatives); and (ii) all pending computation involving old and its derivatives has to be blocked efficiently. Derivatives of old that are invalidated because of retroactive packing may already contain packed analyses, however, which still represent valid ambiguity. These need to be repacked into corresponding derivatives of new when those become available. In turn, derivatives of old may have been packed already, such that they need not be available in the chart for subsequent subsumption tests. Therefore, the parser cannot simply delete everything derived from old when it is packed; instead, derivatives must be preserved (but blocked) until the derivations have been recomputed on the basis of new (footnote 5).</Paragraph>
    <Paragraph position="3"> Figure 2 (pseudocode; [...] marks statements whose text is missing and of which only the annotation remains):
      procedure block(edge, mark)
        if (edge.frozen = false or mark = freeze) then edge.frozen ← mark; fi
        for each parent in edge.parents do block(parent, freeze); od
      procedure packed-edge-p(new)
        [...] {passive edges with same span}
        [...] {test category subsumption}
        [...] {equivalent or proactive packing}
        [...] {pack 'new' into 'old'}
        [...] {return to caller; signal success}
        [...] {retroactive packing}
        [...] {raise all packings into new host}
        if (old.frozen = false) then new.packed ← (old | new.packed); fi {pack 'old' into 'new'}
        block(old, frost); {frost 'old' and freeze derivatives}
        delete(old, chart); {remove 'old' from the chart}</Paragraph>
    <Paragraph position="5"> As new is equivalent to or more general than old, it is guaranteed to derive at least the same set of edges; furthermore, the derivatives of new will again be equivalent to or more general than the corresponding edges derived from old.</Paragraph>
    <Paragraph position="6"> The procedure packed-edge-p(), sketched in Figure 2, achieves pro- and retroactive packing without significant overhead in the parser; the algorithm can be integrated with arbitrary bottom-up (chart-based) parsing strategies. The interface assumes that the parser calls packed-edge-p() on each new edge new as it is derived; a return value of true indicates that new was packed proactively and requires no further processing. Conversely, a false return value from packed-edge-p() signals that new should subsequently undergo regular processing. The second part of the interface builds on notions we call frosting and freezing, meaning temporary and permanent invalidation of edges, respectively. As a side-effect of calls to packed-edge-p(), a new edge can cause retroactive packing, resulting in the deletion of one or more existing edges from the chart and blocking of derivatives. Whenever the parser accesses the chart (i.e. in trying to combine edges) or retrieves a task from the agenda, it is expected to ignore all edges and parser tasks involving such edges that have a non-null 'frozen' value. When an existing edge old is packed retroactively, it is frosted and ignored by the parser; as old now represents local ambiguity, it still has to be taken into account when the parse forest is unpacked. Derivatives of old, on the other hand, need to be invalidated in both further parsing and later unpacking, since they would otherwise give rise to spurious analyses; accordingly, such derivatives are frozen permanently.</Paragraph>
    <Paragraph position="7"> 5 The situation is simpler in the CLE parser (Moore &amp; Alshawi, 1992) because constituents and dominance relations are separated in the chart. The CLE encoding, in fact, does not record the actual daughters used in building a phrase (e.g. as unique references or pointers, as we do), but instead preserves the category information (i.e. a description) of those daughters. Hence, in extracting complete parses from the chart, the CLE has to perform (a limited) search with re-unification of categories; in this respect, the CLE parse forest still is an underspecified representation of the set of analyses, whereas our encoding (see below) facilitates unpacking without extra search.</Paragraph>
    <Paragraph position="8"> Frosting and freezing are done in the subsidiary procedure block(), which walks up the parent links recursively, storing a mark in the 'frozen' slot of each edge that distinguishes between temporary frosting (in the top-level call) and permanent freezing (in recursive calls).</Paragraph>
    <Paragraph position="9"> For a newly derived edge new, packed-edge-p() tests mutual subsumption against all passive edges that span the same portion of the input string.</Paragraph>
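    <Paragraph> The interplay of block() and packed-edge-p() described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the LKB implementation: the Edge class, the flat-dictionary categories, the list-based chart and the simplified subsumes() test are all hypothetical, and the step that empties the packing list of old after raising it is our reading of the prose, not a recovered line of Figure 2.

```python
from dataclasses import dataclass, field

FROST, FREEZE = "frost", "freeze"  # temporary vs. permanent invalidation


@dataclass(eq=False)  # edges are compared by identity, like chart pointers
class Edge:
    span: tuple                                   # (start, end) vertices
    category: dict                                # hypothetical flat category
    parents: list = field(default_factory=list)   # edges derived from this one
    packed: list = field(default_factory=list)    # analyses packed into it
    frozen: object = None                         # None, FROST or FREEZE


def subsumes(general, specific):
    """Simplified subsumption: `general` carries a subset of the constraints."""
    return all(f in specific and specific[f] == v for f, v in general.items())


def block(edge, mark):
    """Store `mark` on `edge` and permanently freeze everything derived from it."""
    if edge.frozen is None or mark == FREEZE:
        edge.frozen = mark
    for parent in edge.parents:
        block(parent, FREEZE)


def packed_edge_p(new, chart):
    """Return True if `new` was packed proactively, False otherwise."""
    for old in list(chart):              # iterate over a copy: chart is mutated
        if old.span != new.span:
            continue
        if subsumes(old.category, new.category) and old.frozen is None:
            old.packed.append(new)       # proactive packing
            return True
        if subsumes(new.category, old.category):
            new.packed.extend(old.packed)    # raise packings into the new host
            old.packed = []                  # assumption: avoid double counting
            if old.frozen is None:
                new.packed.insert(0, old)    # pack 'old' into 'new'
            block(old, FROST)            # frost 'old', freeze its derivatives
            chart.remove(old)            # remove 'old' from the chart
    return False
```

In such a setup the parser would call packed_edge_p(new, chart) for every new passive edge, add new to the chart only when the result is False, and skip any edge whose frozen slot is not None when accessing the chart or the agenda.</Paragraph>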
    <Paragraph position="10"> When forward subsumption (or equivalence) is detected and the existing edge old is not blocked, regular proactive packing is performed (adding new to the packing list for old) and the procedure returns immediately (footnote 6). In the case of backward subsumption, analyses packed into old are raised into new (using the append operator, because new can attract multiple existing edges in the loop); old itself is only packed into new when it is not blocked already.</Paragraph>
    <Paragraph position="11"> Finally, old is frosted, its derivatives are recursively frozen, and old is deleted from the chart. In contrast to proactive packing, the top-level loop in the procedure continues so that new can pick up additional edges retroactively. However, once a backward subsumption is detected, it follows that no proactive packing can be achieved for new, as the chart cannot contain an edge that is more general than old.</Paragraph>
  </Section>
  <Section position="6" start_page="165" end_page="166" type="metho">
    <SectionTitle>
4 Empirical Results
</SectionTitle>
    <Paragraph position="0"> We have carried out an evaluation of the algorithms presented above using the LinGO grammar (Flickinger &amp; Sag, 1998), a publicly-available, multipurpose, broad-coverage HPSG of English developed at CSLI Stanford. With roughly 8,000 types, an average feature structure size of around 300 nodes, and 64 lexical and grammar rules (fleshing out the interaction of HPSG ID schemata, well-formedness principles, and LP constraints), LinGO is among the largest HPSG grammars available. We used the LKB system (Copestake, 1992, 1999) as an experimentation platform since it provides a parameterisable bottom-up chart parser and precise, fine-grained profiling facilities (Oepen &amp; Flickinger, 1998) (footnote 7). All of our results were obtained in this environment, running on a 300 MHz UltraSparc, and using a balanced test set of 2,100 sentences extracted from VerbMobil corpora of transcribed speech: input lengths from 1 to 20 words are represented with 100 test items each; although sentences in the corpus range up to 36 words in length, there are relatively few longer than 20 words.</Paragraph>
    <Paragraph position="1"> 6 Packing an edge e1 into another edge e2 logically means that e2 will henceforth serve as a representative for e1 and the derivation(s) that it encodes. In practice, e1 is removed from the chart and ignored in subsequent parser action and subsumption tests. Only in unpacking the parse forest will the category of e1 and its decomposition(s) in daughter edges (and corresponding subtrees) be used again, to multiply out and project local ambiguity.</Paragraph>
    <Paragraph position="2"> 7 The LinGO grammar and LKB software are publicly available at 'http://lingo.stanford.edu/'.</Paragraph>
    <Paragraph position="3"> Figure 3: [...] on the total chart size (truncated above 25 words).</Paragraph>
    <Paragraph position="4"> Figure 3 compares total chart size (in all-paths mode) for the regular LKB parser and our variant with pro- and retroactive packing enabled. Factoring ambiguity reduces the number of passive edges by a factor of more than three on average, while for a number of cases the reduction is by a factor of 30 and more. Compared to regular parsing, the rate of increase of passive chart items with respect to sentence length is greatly diminished.</Paragraph>
    <Paragraph position="5"> To quantify the degree of packing we achieve in practice, we re-ran the experiment reported by Moore and Alshawi (1992): counting the number of nodes required to represent all readings for a simple declarative sentence containing zero to six prepositional phrase (PP) modifiers. The results reported by Moore and Alshawi (1992) (using the CLE grammar of English) and those obtained using pro- and retroactive packing with the LinGO grammar are presented in Table 1 (footnote 8). Although the comparison involves different grammars we believe it to be instructive, since (i) both grammars have comprehensive coverage, (ii) derive the same numbers of readings for all test sentences in this experiment, (iii) require (almost) the same number of nodes for the basic cases (zero and one PP), (iv) exhibit a similar size in nodes for one core PP (measured by the increment from n = 0 to n = 1), and (v) the syntactic simplicity of the test material hardly allows crosstalk with other grammatical phenomena. Comparing relative packing efficiency with increasing ambiguity (the columns labeled '-' in Table 1), our method appears to produce a more compact representation of ambiguity than the CLE, and at the same time builds a more specific representation of the parse forest that can be unpacked without search. To give an impression of parser throughput, Table 1 includes timings for our parsing and unpacking (validation) phases, contrasted with the plain, non-packing LKB parser: as would be expected, parse time increases linearly in the number of edges, while unpacking costs reflect the exponential increase in total numbers of analyses; the figures show that our packing scheme achieves a very significant speedup, even when unpacking time is included in the comparison.</Paragraph>
    <Paragraph position="6"> 8 Moore and Alshawi (1992) use the terms 'node' and 'record' interchangeably in their discussion of packing, where the CLE chart is comprised of separate con(stituent) and ana(lysis) entries for category and dominance information, respectively. It is unclear whether the counting of 'packed nodes' in Moore and Alshawi (1992) includes con records or not, since only ana records are required in parse tree recovery. In any case, both types of chart record need to be checked by subsumption as new entries are added to the chart. Conversely, in our setup each edge represents not only the node category, but also pointers to the daughter(s) that gave rise to this edge, and moreover, where applicable, a list of packed edges that are subsumed by the category (but not necessarily by the daughters). For the LKB, the column 'result edges' in Table 1 refers to the total number of edges in the chart that contribute to at least one complete analysis.</Paragraph>
    <Paragraph position="7"> Table 1: [...] labeled '+' show the relative increase of packed nodes (result edges) normalised to the n = 0 baseline.</Paragraph>
  </Section>
  <Section position="7" start_page="166" end_page="167" type="metho">
    <SectionTitle>
5 Choosing the Grammar Restrictor
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="166" end_page="167" type="sub_section">
      <SectionTitle>
and Parsing Strategy
</SectionTitle>
      <Paragraph position="0"> In order for the subsumption relation to apply meaningfully to HPSG signs, two conditions must be met.</Paragraph>
      <Paragraph position="1"> Firstly, parse tree construction must not be duplicated in the feature structures (by means of the HPSG DTRS feature) but be left to the parser (i.e. recorded in the chart); this is achieved in a standard way by feature structure restriction (Shieber, 1985) applied to all passive edges. Secondly, the processing of constraints that do not restrict the search space but build up new (often semantic) structure should be postponed, since they are likely to interfere with subsumption. For example, analyses that differ only with respect to PP attachment would have the same syntax, but differences in semantics may prevent them being packed. This problem can be overcome by using restriction to (temporarily) remove such (semantic) attributes from lexical entries and also from the rule set, before they are input to the parser in the initial parse forest construction phase. The second, unpacking phase of the parser reverts to the unrestricted constraint set, so we can allow overgeneration in the first phase and filter globally inconsistent analyses during unpacking. Thus, the right choice of grammar restrictor can be viewed as an empirical rather than analytical problem.</Paragraph>
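      <Paragraph> The effect of a restrictor on packing can be illustrated with a small Python sketch. It assumes that feature structures are nested dictionaries and that a restrictor is simply a list of attribute paths to delete; the function restrict() and the toy attribute layout (CAT, CONT and DTRS at the top level, with a made-up ATTACH value) are hypothetical simplifications and not the representation used in LinGO or the LKB.

```python
import copy


def restrict(fs, paths):
    """Return a copy of the nested-dict feature structure `fs` without `paths`.

    Each path is a tuple of attribute names leading to the value to remove;
    paths that do not exist in `fs` are silently ignored.
    """
    result = copy.deepcopy(fs)
    for path in paths:
        node = result
        for feature in path[:-1]:
            node = node.get(feature)
            if not isinstance(node, dict):
                break                      # path is absent: nothing to remove
        else:
            node.pop(path[-1], None)       # drop the restricted attribute
    return result


# Two toy analyses that differ only in semantics and recorded daughters.
verb_attachment = {"CAT": "vp", "CONT": {"ATTACH": "verb"}, "DTRS": ["v", "np", "pp"]}
noun_attachment = {"CAT": "vp", "CONT": {"ATTACH": "noun"}, "DTRS": ["v", "np"]}
restrictor = [("CONT",), ("DTRS",)]
```

With this restrictor both toy analyses reduce to {"CAT": "vp"} and would therefore count as equivalent for packing, while the unrestricted structures remain untouched for the unpacking phase.</Paragraph>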
      <Paragraph position="3"> Table 2 summarizes packing efficiency and parser performance for three different restrictors (labeled no, partial, and full semantics, respectively); to gauge effects of input complexity, the table is further subdivided by sentence length into two groups (of around 1,000 sentences each). Compared to regular parsing, packing with the full semantics in place is not effective: the chart size is reduced slightly, but the extra cost for testing subsumption increases total parse times by a factor of more than four. Eliminating all semantics (i.e. the entire HPSG CONT value), on the other hand, results in overgeneralisation: with less information in the feature structures we achieve the highest number of packings, but at the same time rules apply much more freely, resulting in a larger chart compared to parsing with a partial semantics; moreover, unpacking takes longer because the parse forest now contains inconsistent analyses.</Paragraph>
      <Paragraph position="4"> Restricting compositional semantics but preserving attributes that participate in selection and agreement results in minimal chart size and parsing time (shown in the partial semantics figures) for both divisions of the test corpus.</Paragraph>
      <Paragraph position="5"> The majority of packings involve equivalent feature structures, which suggests that unpacking could be greatly simplified if the grammar restrictor were guaranteed to preserve the generative capacity of the grammar (in the first parsing phase); then, only packings involving actual subsumption would have to be validated in the unpacking phase (footnote 9). Finally, we note that the number of retroactive packings is relatively small, and on average each such packing leads to only one previously derived edge being invalidated. This, of course, is a function of the order in which edges are derived, i.e. the parsing strategy.</Paragraph>
      <Paragraph position="6"> All the results in Table 2 were obtained with a 'right corner' strategy which aims to exhaust computation for any suffix of the input string before moving the input pointer to the left; this is achieved by means of a scoring function end - (start / n) (where start and end are the vertices of the derivation that would result from the computation, and n is the total input length) that orders parser tasks in the agenda. However, we have observed (Oepen &amp; Callmeier, 2000) that HPSG-type, highly lexicalized grammars benefit greatly from a bidirectional, 'key'-driven, active parsing regime, since they often employ rules with underspecified arguments that are only instantiated by coreference with other daughters (where the 'key' daughter is the linguistic head in many but not all constructions). This requirement, together with the general non-predictability of categories derived for any token substring (in particular with respect to unary rule applications), means that a particular parsing strategy may reduce retroactive packing but cannot avoid it in general. With pro- and retroactive packing and the minimal accounting overhead, we find overall parser throughput to be very robust against variation in the parsing strategy. Lavie and Rosé (2000) present heuristics for ordering parser actions to achieve maximally compact parse forests (though only with respect to a CF category backbone) in the absence of retroactive packing; however, the techniques we have presented here allow local ambiguity packing and parser tuning (possibly including priority-driven best-first search) to be carried out mostly independently of each other.</Paragraph>
      <Paragraph position="7"> 9 There is room for further investigation here: partly for theory-internal reasons, current development of the LinGO grammar is working towards a stricter separation of restrictive (selectional) and constructive (compositional) constraints in the grammar and underlying semantic theory. We expect that our approach to packing will benefit from these developments.</Paragraph>
      <Paragraph position="8"> Table 2: [...] ('-~') and retroactive ('r') packings, and the number of edges that were frozen ('+-').</Paragraph>
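      <Paragraph> The agenda ordering used by the 'right corner' strategy in this section can be sketched in Python as follows. The sketch assumes that the scoring function reads end - (start / n), which is our reading of the formula, and that tasks with higher scores are executed first; the function names and the representation of a task by the span it would derive are hypothetical.

```python
import heapq


def right_corner_score(start, end, n):
    """Score of a task that would derive an edge from `start` to `end`.

    Assumption: the score is end - (start / n), where n is the input length.
    """
    return end - start / n


def order_tasks(spans, n):
    """Return the (start, end) spans in the order the agenda would release them."""
    # heapq is a min-heap, so scores are negated to pop the highest score first.
    heap = [(-right_corner_score(start, end, n), (start, end)) for start, end in spans]
    heapq.heapify(heap)
    ordered = []
    while heap:
        ordered.append(heapq.heappop(heap)[1])
    return ordered
```

Under this reading, tasks whose result ends further to the right are preferred, and among tasks with the same right vertex the one reaching further to the left comes first. In an actual parser the agenda would only ever contain tasks whose daughters are already in the chart, so the ordering applies to feasible tasks only.</Paragraph>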
    </Section>
  </Section>
</Paper>