<?xml version="1.0" standalone="yes"?> <Paper uid="E95-1024"> <Title>Off-line Optimization for Earley-style HPSG Processing</Title> <Section position="3" start_page="173" end_page="174" type="metho"> <SectionTitle> 2 Advanced Earley Generation </SectionTitle> <Paragraph position="0"> As Shieber (1988) noted, the main shortcoming of Earley generation is a lack of goal-directedness that results in a proliferation of edges. Gerdemann (1991) tackled this shortcoming by modifying the restriction function to make top-down information available for the bottom-up completion step. Gerdemann's generator follows a head-driven strategy in order to avoid inefficient evaluation orders. More specifically, the head of the right-hand side of each grammar rule is distinguished, and distinguished categories are scanned or predicted upon first. The resulting evaluation strategy is similar to that of the head-corner approach (Shieber et al., 1990; Gerdemann and Hinrichs, in press): prediction follows the main flow of semantic information until a lexical pivot is reached, and only then are the head-dependent subparts of the construction built up in a bottom-up fashion. This mixture of top-down and bottom-up information flow is crucial since the top-down semantic information from the goal category must be integrated with the bottom-up subcategorization information from the lexicon. A strict top-down evaluation strategy suffers from what may be called head-recursion, i.e. the generation analog of left recursion in parsing. Shieber et al. (1990) show that a top-down evaluation strategy will fail for rules such as VP → VP X, irrespective of the order of evaluation of the right-hand side categories in the rule.</Paragraph> <Paragraph position="1"> By combining the off-line optimization process with a mixed bottom-up/top-down evaluation strategy, we can refrain from a complete reformulation of the grammar as, for example, in Minnen et al. 
(1995).</Paragraph> <Section position="1" start_page="173" end_page="173" type="sub_section"> <SectionTitle> 2.1 Optimizations </SectionTitle> <Paragraph position="0"> We further improved a typed extension of Gerdemann's Earley generator with a number of techniques that reduce the number of edges created during generation. Three optimizations were especially helpful. The first supplies each edge in the chart with two indices: a backward index pointing to the state in the chart that the edge is predicted from, and a forward index pointing to the states that are predicted from the edge. By matching forward and backward indices, the edges that must be combined for completion can be located faster. This indexing technique, as illustrated below, improves upon the more complex indices in Gerdemann (1991) and is closely related to OLDT-resolution (Tamaki and Sato, 1986).</Paragraph> <Paragraph position="2"> Active edge 2 resulted from active edge 1 through prediction. The backward index of edge 2 is therefore identified with the forward index of edge 1.</Paragraph> <Paragraph position="3"> Completion of an active edge results in an edge with an identical backward index. In the case of our example, this would be the steps from edge 2 to edge 3 and edge 3 to edge 4. As nothing gets predicted from a passive edge (4), it does not have a forward index. In order to use passive edge 4 for completion of an active edge, we only need to consider those edges which have a forward index identical to the backward index of 4.</Paragraph> <Paragraph position="4"> The second optimization creates a table of the categories from which predictions have been made. As discussed in Gerdemann (1991), such a table can be used to avoid redundant predictions without a full and expensive subsumption test. 
The third indexes lexical entries, which is necessary to obtain constant-time lexical access.</Paragraph> <Paragraph position="5"> The optimizations of our Earley generator lead to significant gains in efficiency. However, despite these heuristic improvements, the problem of goal-directedness is not solved.</Paragraph> </Section> <Section position="2" start_page="173" end_page="174" type="sub_section"> <SectionTitle> 2.2 Empty Heads </SectionTitle> <Paragraph position="0"> Empty or displaced heads present the principal goal-directedness problem for any head-driven generation approach (Shieber et al., 1990; König, 1994; Gerdemann and Hinrichs, in press), where empty head refers not just to a construction in which the head has an empty phonology, but to any construction in which the head is partially unspecified. Since phonology does not guide generation, the phonological realization of the head of a construction plays no part in the generation of that construction. To better illustrate the problem that underspecified heads pose, consider the sentence: Hat Karl Marie geküßt? Has Karl Marie kissed? &quot;Did Karl kiss Mary?&quot; for which we adopt the argument composition analysis presented in Hinrichs and Nakazawa (1989): the subcat list of the auxiliary verb is partially instantiated in the lexicon and only becomes fully instantiated upon its combination with its verbal complement, the main verb. The phrase structure rule that describes this construction is the following (footnote 1): [attribute-value matrix garbled in extraction; recoverable content, in order of appearance: a category with cat, subcat and cont; a category with cat v, fin +, aux +, a subcat list and cont; a category with cat v and subcat] Though a head-driven generator must generate the head of the rule first, nothing prescribes the order of generation of the complements of the head. If the generator generates the main verb second, then the subcat list of the main verb instantiates the subcat list of the head, and generation becomes a deterministic procedure in which complements are generated in sequence. 
However, if the generator generates some complement other than the main verb second, then the subcat list of the head contains no restricting information to guide deterministic generation, and generation becomes a generate-and-test procedure in which complements are generated at random, only to be eliminated by further unifications. Clearly then, the order of evaluation of the complements in a rule can profoundly influence the efficiency of generation, and an efficient head-driven generator must order the evaluation of the complements in a rule accordingly.</Paragraph> <Paragraph position="1"> between the subject and the other complements of a verb as in chapter 9 of Pollard and Sag (1994). The test-grammar does make this division and always guarantees the correct order of the complements on the comps list with respect to the obliqueness hierarchy. Furthermore, we use abbreviations of paths, such as cont for synsem|loc|cont, and assume that the semantics principle is encoded in the phrase structure rule.</Paragraph> <Paragraph position="2"> run time creates much overhead, and locally determining the optimal evaluation order is often impossible. Goal-freezing can also overcome the ordering problem, but is equally unappealing: goal-freezing is computationally expensive, it demands the procedural annotation of an otherwise declarative grammar specification, and it presupposes that a grammar writer possesses substantial computational processing expertise. We chose instead to deal with the ordering problem by using off-line compilation to automatically optimize a grammar such that it can be used for generation, without additional provision for dealing with the evaluation order, by our Earley generator. 
</Paragraph> </Section> </Section> <Section position="4" start_page="174" end_page="176" type="metho"> <SectionTitle> 3 Off-line Grammar Optimization </SectionTitle> <Paragraph position="0"> Our off-line grammar optimization is based on a generalization of the dataflow analysis employed in the DIA to a dataflow analysis for typed feature structure grammars. This dataflow analysis takes as input a specification of the paths of the start category that are considered fully instantiated. In the case of generation, this means that the user annotates the path specifying the logical form, i.e., the path cont (or some of its subpaths), as bound. We use the type hierarchy and an extension of the unification and generalization operations such that path annotations are preserved, to determine the flow of (semantic) information between the rules and the lexical entries in a grammar. Structure sharing determines the dataflow within the rules of the grammar.</Paragraph> <Paragraph position="1"> The dataflow analysis is used to determine the relative efficiency of a particular evaluation order of the right-hand side categories in a phrase structure rule by computing the maximal degree of nondeterminacy introduced by the evaluation of each of these categories. The maximal degree of nondeterminacy introduced by a right-hand side category equals the maximal number of rules and/or lexical entries with which this category unifies given its binding annotations. 
The optimal evaluation order of the right-hand side categories is found by comparing the maximal degree of nondeterminacy introduced by the evaluation of the individual categories with the degree of nondeterminacy the grammar is allowed to introduce: if the degree of nondeterminacy introduced by the evaluation of one of the right-hand side categories in a rule exceeds the admissible degree of nondeterminacy, the ordering at hand is rejected.</Paragraph> <Paragraph position="2"> The degree of nondeterminacy the grammar is allowed to introduce is initially set to one and successively incremented until the optimal evaluation order for all rules in the grammar is found.</Paragraph> <Section position="1" start_page="174" end_page="175" type="sub_section"> <SectionTitle> 3.1 Example </SectionTitle> <Paragraph position="0"> The compilation process is illustrated on the basis of the phrase structure rule for argument composition discussed in section 2.2. Space limitations force us to abstract over the recursive optimization of the rules defining the right-hand side categories by considering only the defining lexical entries.</Paragraph> <Paragraph position="1"> Unifying the user-annotated start category with the left-hand side of this phrase structure rule leads to the annotation of the path specifying the logical form of the construction as bound (see below). As a result of the structure-sharing between the left-hand side of the rule and the auxiliary verb category, the cont-value of the auxiliary verb can be treated as bound as well. In addition, the paths with a value of a maximally specific type for which there are no appropriate features specified, for example, the path cat, can be considered bound:</Paragraph> <Paragraph position="3"> On the basis of this annotated rule, we investigate the lexical entries defining its right-hand side categories. The auxiliary verb category is unified with its defining lexical entries (under preservation of the binding annotations). 
The following is an example of such a lexical entry. (Note that subpaths of a path marked as bound are considered bound too.)</Paragraph> <Paragraph position="5"> The binding annotations of the lexical entries defining the auxiliary verb are used to determine with how many lexical entries the right-hand side category of the rule maximally unifies, i.e., its maximal degree of nondeterminacy. In this case, the maximal degree of nondeterminacy that the evaluation of the auxiliary verb introduces is very low, as the logical form of the auxiliary verb is considered fully instantiated. Now we mark the paths of the defining lexical entries whose instantiation can be deduced from the type hierarchy. To mimic the evaluation of the auxiliary verb, we determine the information common to all defining lexical entries by taking their generalization, i.e., the most specific feature structure subsuming all of them, and unify the result with the original right-hand side category in the phrase structure rule. Because both the generalization and the unification operations preserve binding annotations, this leads (via structure-sharing) to the annotation that the logical form of the verbal complement can be considered instantiated. Note that the nonverbal complements do not become further instantiated.</Paragraph> <Paragraph position="6"> By subsequent investigation of the maximal degree of nondeterminacy introduced by the evaluation of the complements in various permutations, we find that the logical form of a sentence only restricts the evaluation of the nonverbal complements after the evaluation of the verbal complement. 
This can be verified on the basis of a sample lexical entry for a main verb.</Paragraph> <Paragraph position="7"> as the optimal evaluation order of our phrase structure rule for argument composition.</Paragraph> </Section> <Section position="2" start_page="175" end_page="176" type="sub_section"> <SectionTitle> 3.2 Processing Head </SectionTitle> <Paragraph position="0"> The optimal evaluation order for a phrase structure rule need not necessarily be head-first. Our dataflow analysis treats heads and complements alike, and includes the head in the calculation of the optimal evaluation order of a rule. If the evaluation of the head of a rule introduces much nondeterminacy or provides insufficient restricting information for the evaluation of its complements, our dataflow analysis might not select the head as the first category to be evaluated, and choose instead [attribute-value matrix of the reordered rule, garbled in extraction; recoverable features: subcat, cont, cat v, fin, aux +] as the optimal evaluation order. This clearly demonstrates an extremely important consequence of using our dataflow analysis to compile a declarative grammar into a grammar optimized for generation. Empty or displaced heads pose no problem for us, since the optimal evaluation order of the right-hand side of a rule is determined regardless of the head. Our dataflow analysis ignores the grammatical head, but identifies instead the 'processing head', and (no less importantly) the 'first processing complement', the 'second processing complement', and so on.</Paragraph> </Section> </Section> <Section position="5" start_page="176" end_page="176" type="metho"> <SectionTitle> 4 Constraints on Grammar </SectionTitle> <Paragraph position="0"> Our Earley generator and the described compiler for off-line grammar optimization have been extensively tested with a large HPSG grammar. 
This test-grammar is based on the implementation of an analysis of partial VP topicalization in German (Hinrichs et al., 1994) in the Troll system (Gerdemann and King, 1994). Testing the developed techniques uncovered important constraints that the compiler imposes on the form of the phrase structure rules in a grammar.</Paragraph> <Section position="1" start_page="176" end_page="176" type="sub_section"> <SectionTitle> 4.1 Complement Displacement </SectionTitle> <Paragraph position="0"> In particular cases of complement displacement, the compiler is not able to find an evaluation order such that the Earley generator has sufficient restricting information to generate all subparts of the construction efficiently. More specifically, this problem arises when a complement receives essential restricting information from the head of the construction from which it has been extracted, while, at the same time, it provides essential restricting information for the complements that stayed behind. Such a case is represented schematically in figure 1.</Paragraph> <Paragraph position="1"> The first processing complement (c1) of the head (H) has been displaced. This is problematic in case c1 provides essential bindings for the successful evaluation of the complement c2. c1 cannot be evaluated prior to the head, and once H is evaluated it is no longer possible to evaluate c1 prior to c2.</Paragraph> <Paragraph position="2"> An example of problematic complement displacement taken from our test-grammar is given in figure 2. The topicalized partial VP &quot;Anna lieben&quot; receives its restricting semantic information from the auxiliary verb and upon its evaluation provides essential bindings not only for the direct object, but also for the subject that stayed behind in the Mittelfeld together with the auxiliary verb. 
These mutual dependencies between the sub-constituents of two different local trees lead either to the unrestricted generation of the partial VP, or to the unrestricted generation of the subject in the Mittelfeld. We handled this problem by partial execution (Pereira and Shieber, 1987) of the filler-head rule. This allows the evaluation of the filler right after the evaluation of the auxiliary verb, but prior to the subject. A head-driven generator has to rely on a similar solution, as it will not be able to find a successful ordering for the local trees either, simply because none exists.</Paragraph> </Section> <Section position="2" start_page="176" end_page="176" type="sub_section"> <SectionTitle> 4.2 Generalization </SectionTitle> <Paragraph position="0"> A potential problem for our approach is the requirement that the phrase structure rules in the grammar have a particular degree of specificity for the generalization operation to be used successfully to mimic their evaluation. This is best illustrated on the basis of the following, more 'schematic', phrase structure rule: [attribute-value matrix garbled in extraction; recoverable features: cat, cat v, fin, subcat, cont] Underspecification of the head of the rule allows it to unify with both finite auxiliaries and finite ditransitive main verbs. In combination with the underspecification of the complements, this allows the rule not only to be used for argument composition constructions, as discussed above, but also for constructions in which a finite main verb becomes saturated. This means that the logical form of the two tagged nonverbal complements becomes available either upon the evaluation of a further tagged complement (in case of argument composition), or upon the evaluation of the finite verb (in case the head of the rule is a ditransitive main verb). As a result, the use of generalization does not suffice to mimic the evaluation of the respective right-hand side categories. 
Because both verbal categories have defining lexical entries which do not instantiate the logical form of the nonverbal arguments, the dataflow analysis leads to the conclusion that the logical form of the nonverbal complements never becomes instantiated. This causes the rejection of all possible evaluation orders for this rule, as the evaluation of an unrestricted nonverbal complement clearly exceeds the allowed maximal degree of nondeterminacy of the grammar. We are therefore forced to split this schematic phrase structure rule into two more specific rules, at least during the optimization process. It is important to note that this is a consequence of a general limitation of dataflow analysis (see also Mellish, 1981).</Paragraph> </Section> </Section> </Paper>