<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-1509">
  <Title>Head-Driven Generation and Indexing in ALE</Title>
  <Section position="4" start_page="62" end_page="63" type="metho">
    <SectionTitle>
3 Input Specification
</SectionTitle>
    <Paragraph position="0"> The reader is referred to (CP94) for a complete specification of ALE's syntax as it pertains to parsing.</Paragraph>
    <Paragraph position="1"> ALE allows the user to refer to feature structures by means of descriptions, taken from a language which allows reference to types (Prolog atoms), feature values (colon-separated paths), conjunction and disjunction (as in Prolog), and structure sharing (through Prolog variables).</Paragraph>
    <Paragraph position="2"> ALE grammar rules simply consist of a series of these descriptions, one for each daughter and one for the mother, interspersed with procedural attachments from ALE's Prolog-like language. The following is a typical S → NP VP rule taken from a simple ALE grammar: srule rule (sentence, phon:SPhon, sem:S) ===&gt; cat&gt; (Subj, np, phon:SubjPhon), sem_head&gt; (vp, phon:VpPhon, form:Form, subcat:[Subj], sem:S), goal&gt; append(SubjPhon, VpPhon, SPhon).</Paragraph>
    <Paragraph position="4"> The description of a sentence-typed feature structure before the ===&gt; is the description of the mother category. The operator cat&gt; identifies a daughter description, here used for the subject NP, and goal&gt; identifies a call to a procedural attachment, whose arguments are Prolog variables instantiated to their respective phonologies (the values of the feature phon). sem_head&gt; is a new operator which identifies the daughter description corresponding to the semantic head of a rule, according to (SNMP90)'s definition.</Paragraph>
    <Paragraph position="5"> Grammar rules can have at most one sem_head&gt; declaration; those which have one are identified as chain rules.</Paragraph>
    <Paragraph position="6"> The only other special information the user needs to provide is what constitutes the semantic component of a feature structure. ALE uses a distinguished predicate, sem_select(+,-), from its procedural attachment language in order to identify this material, e.g.: sem_select(sem:S,S) if true.</Paragraph>
    <Paragraph position="7"> In general, this material may be distributed over various substructures of a given feature structure, in which case the predicate may be more complex: sem_select((sign, synsem:cont:Cont, retrieved_quants:QR), (sem, c:Cont, q:QR)) if no_free_vars(QR).</Paragraph>
    <Paragraph position="8">  Notice that such grammars can still be compiled by ALE's parsing compiler: the sem_select/2 predicate can simply be ignored, and a sem_head&gt; operator can be interpreted exactly as cat&gt;. In the general case, however, a particular grammar rule will not compile into efficient, or even terminating, code in both modes, particularly when procedural attachments are used. Just as in the case of Prolog, the user is responsible for ordering the procedural attachments (subgoals) with respect to their daughter categories and with respect to each other to ensure proper termination for a particular mode of processing. As in Prolog, one could also modify ALE to assist, to an extent, by augmenting its procedural attachments with mode declarations which can be enforced by static analysis during compilation. At this point, one could also adapt techniques for automatic mode reversal from logic programming ((Str90; MGH93)) to grammar rules to obtain the minimum amount of manual modification necessary.</Paragraph>
  </Section>
  <Section position="5" start_page="63" end_page="65" type="metho">
    <SectionTitle>
4 Compilation
</SectionTitle>
    <Paragraph position="0"> All ALE compilation up to, and including, the level of descriptions applies to generation without change.</Paragraph>
    <Paragraph position="1"> This includes compiled type inferencing, feature value access functions, and the feature structure unification code itself. This level is a very important and convenient stage in compilation, because descriptions serve as the basic building blocks of all higher-level components in ALE. One of these components, ALE's procedural attachment language, can also be compiled as in the parsing case, since it uses the same SLD resolution strategy. The rest are described in the remainder of this section.</Paragraph>
    <Section position="1" start_page="63" end_page="64" type="sub_section">
      <SectionTitle>
4.1 Grammar Rules
</SectionTitle>
      <Paragraph position="0"> Chain rules and non-chain rules are compiled differently because (SNMP90)'s algorithm uses a different control strategy with each of them. Both of them are different from the strategy which ALE's bottom-up parser uses. All three, however, vary only slightly in their use of building blocks of code for enforcing descriptions on feature structures. These building blocks of code will be indicated by square brackets, e.g. [add Desc to FS].</Paragraph>
      <Paragraph position="1">  Non-chain rules have no semantic head, and are simply processed top-down, using the mother as a pivot. We also process the daughters from left to right ((CP96) provides complete details about this level of compilation). So the non-chain rule:</Paragraph>
      <Paragraph position="2"> DO ===&gt; DI, ..., DN.</Paragraph>
      <Paragraph position="3"> consisting of descriptions D0 through DN, is compiled to: non_chain_rule(+PivotFS, +RootFS, ?Ws, ?WsRest) :- [add D0 to PivotFS], exists_chain(PivotFS, RootFS), [add D1 to FS1], generate(FS1, SubWs, SubWs2), [add D2 to FS2], generate(FS2, SubWs2, SubWs3), ..., [add DN to FSN], generate(FSN, SubWsN, SubWsRest), connect(PivotFS, RootFS, SubWs, SubWsRest, Ws, WsRest).</Paragraph>
      <Paragraph position="4"> non_chain_rule/4 is called whenever a non-chain rule's mother is selected as the pivot (by successfully adding the mother's description, D0, to PivotFS), generating a string represented by the difference list, Ws-WsRest. The algorithm says one must recursively generate each daughter (generate/3), and then connect this pivot-rooted derivation tree to the root (connect/6). Before we spend the effort on recursive calls, we also want to know whether this pivot can in fact be connected to the root; this is accomplished by exists_chain/2. In general, the mother category and daughter categories may share substructures, through the co-instantiation of Prolog variables in their descriptions. After matching the mother's description, which will bind those variables, we add each daughter's description to a new structure FSi, initially a structure of type bot (the most general type in ALE), before making the respective recursive call. In this way, the appropriate information shared between descriptions in the user's grammar rule is passed between feature structures at run-time.</Paragraph>
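      <Paragraph> The difference-list threading just described can be pictured outside Prolog. The following Python sketch is ours, not ALE's (generate_daughters and the toy generate argument are hypothetical names); it shows how each daughter's words are spliced, left to right, into one result string, as generate(FSi, SubWsi, SubWsi+1) does with the open tail of a difference list.

```python
# Illustrative sketch (not ALE code): threading daughter output through
# one result list, as the chained SubWs variables thread a difference
# list through the recursive generate/3 calls.

def generate_daughters(daughters, generate):
    """Generate each daughter left to right; each call appends its words
    to the open tail of the accumulated string."""
    words = []                      # plays the role of SubWs ... SubWsRest
    for d in daughters:
        words.extend(generate(d))   # one daughter's contribution
    return words

# Toy "generator": a daughter is represented directly by its word list.
sentence = generate_daughters([["john"], ["calls"], ["up"]], lambda d: d)
```

In Prolog the same effect is achieved without copying: each recursive call binds the open tail left by the previous one, so no concatenation is ever performed.</Paragraph>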
      <Paragraph position="5"> To generate, we use the user's distinguished selection predicate to build a candidate pivot, and then try to match it to the mother of a non-chain rule (the base cases will be considered below):</Paragraph>
      <Paragraph position="7"> solve/1 is ALE's instruction for making calls to its procedural attachment language. Its clauses are compiled from the user's predicates, which have description arguments, into predicates with feature structure arguments as represented internally in ALE.</Paragraph>
      <Paragraph position="8">  Chain rules are used to connect pivots to goals. As a result, we use them bottom-up from semantic head to mother, and then recursively generate the non-head daughters top-down, left to right. So a chain rule: D0 ===&gt; D1, ..., DK, HD, D(K+1), ..., DN. is compiled to: chain_rule(+PivotFS, +RootFS, +SubWs, -SubWsRest, ?Ws, ?WsRest) :- [add HD to PivotFS], [add D0 to MotherFS], exists_chain(MotherFS, RootFS), [add D1 to FS1], generate(FS1, SubWs, SubWs2), ..., [add DK to FSK], generate(FSK, SubWsK, SubWsK+1), [add D(K+1) to FS(K+1)], generate(FS(K+1), SubWsK+1, SubWsK+2), ...</Paragraph>
      <Paragraph position="9"> [add DN to FSN], generate(FSN, SubWsN, SubWsRest), connect(MotherFS, RootFS, SubWs, SubWsRest, Ws, WsRest).</Paragraph>
      <Paragraph position="10"> chain_rule/6 is called whenever a chain rule is selected to connect a pivot (PivotFS) to a root goal (RootFS), yielding the string Ws-WsRest, which contains the substring SubWs-SubWsRest. In the case of both chain and non-chain rules, calls to a procedural attachment between daughters Di and D(i+1) are simply added between the code for Di and D(i+1). Procedures which attach to the semantic head, in the case of chain rules, must be distinguished as such, so that they can be called earlier.</Paragraph>
      <Paragraph position="11"> To connect a pivot to the root, we either unify them (the base case), or use a chain rule. Similarly, to discover whether a chain exists, we either unify, or attempt to use one or more chain rules. For each chain rule, we can thus compile a separate clause for exists_chain/2, for which that rule is the last step in the chain. In practice, a set of chain rules may admit chains of potentially unbounded length. For this reason, we bound the length with a constant declared by the user directive, max_chain_length/1.</Paragraph>
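      <Paragraph> The effect of the max_chain_length/1 bound can be pictured with a small search sketch. The Python below is our own illustration, not ALE's compiled code: exists_chain, chain_mothers, and the toy rule table are hypothetical names, and equality stands in for feature-structure unification.

```python
# Sketch of bounded chain search: a pivot connects to the root either
# directly (the base case, equality here instead of unification) or
# through the mother of some chain rule, using at most max_len steps.

def exists_chain(pivot, root, chain_mothers, max_len):
    if pivot == root:                  # base case: pivot "unifies" with root
        return True
    if max_len == 0:                   # chain budget exhausted
        return False
    return any(exists_chain(mother, root, chain_mothers, max_len - 1)
               for mother in chain_mothers(pivot))

# Toy chain rules: a particle VP chains up to VP, and VP up to S.
mothers = {"vp_particle": ["vp"], "vp": ["s"]}
link = lambda cat: mothers.get(cat, [])
```

Without the bound, a recursive set of chain rules would make this search loop; with it, failure is reported once the declared number of chain-rule steps is used up.</Paragraph>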
    </Section>
    <Section position="2" start_page="64" end_page="64" type="sub_section">
      <SectionTitle>
4.2 Lexical Entries
</SectionTitle>
      <Paragraph position="0"> Lexical entries are the base cases of the algorithm's top-down processing, and can be chosen as pivots instead of the mothers of non-chain rules. In fact, lexical entries can be compiled exactly as a non-chain rule with no daughters would be. So a lexical entry for W, with description D, can be compiled into the non_chain_rule/4 clause: non_chain_rule(PivotFS, RootFS, Ws, WsRest) :- [add D to PivotFS], connect(PivotFS, RootFS, [W|SubWs], SubWs, Ws, WsRest).</Paragraph>
      <Paragraph position="1"> For ALE's bottom-up parser, lexical entries were compiled into actual feature structures. Now they are being compiled into code which executes on an already existing feature structure, namely the most general satisfier of what is already known about the current pivot. Empty categories are compiled in the same way, only with no phonological contribution.</Paragraph>
      <Paragraph position="2"> This method of compilation is re-evaluated in Section 6.</Paragraph>
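      <Paragraph> The idea of compiling a lexical entry into code that runs against an already existing feature structure can be mimicked in a few lines of Python. This is an illustrative sketch, not ALE's implementation: compile_lexical_entry is a hypothetical name, feature structures are flat dicts, and a feature clash stands in for unification failure.

```python
# Sketch: a lexical entry W with description D becomes a clause that
# refines the current pivot ([add D to PivotFS]) and contributes its
# phonology ([W|SubWs]-SubWs), instead of being stored as a feature
# structure as in the bottom-up parser.

def compile_lexical_entry(word, desc):
    def clause(pivot):
        fs = dict(pivot)                 # copy the current pivot
        for feat, val in desc.items():   # [add D to PivotFS]
            if fs.get(feat, val) != val: # clash: "unification" fails
                return None
            fs[feat] = val
        return fs, [word]                # refined pivot, plus its word
    return clause

calls = compile_lexical_entry("calls", {"type": "word", "sem": "call"})
```

An empty category would be compiled the same way, returning an empty word list.</Paragraph>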
    </Section>
    <Section position="3" start_page="64" end_page="65" type="sub_section">
      <SectionTitle>
4.3 Lexical Rules
</SectionTitle>
      <Paragraph position="0"> a non-finite VP is mapped to a finite VP, provided the attachment, add.Jg3/2, succeeds in transforming the SUBCAT value to reflect agreement.</Paragraph>
      <Paragraph position="1"> For parsing, ALE unfolds the lexicon at compile-time under application of lexical rules, with an upper bound on the depth of rule application. This was possible because lexical items were feature structures to which the code for lexical rules could apply. In the generator, however, the lexical entries themselves are compiled into pieces of code. One solution is to treat lexical rules as special unary non-chain rules, whose daughters can only have pivots corresponding to lexical entries or other lexical rules, and with bounded depth. Because the application depth is bounded, one can also unfold these lexical rule applications into the lexical entries' non_chain_rule/4 predicates themselves. Given a lexical entry, W ---&gt; DescLex, and lexical rule, DescIn **&gt; DescOut morphs M, for example, we can create the clause: non_chain_rule(PivotFS, RootFS, Ws, WsRest) :- [add DescOut to PivotFS], [add DescIn to LexFS], [add DescLex to LexFS], connect(PivotFS, RootFS, [MorphW|SubWs], SubWs, Ws, WsRest).</Paragraph>
      <Paragraph position="2"> where MorphW is the result of applying M to W. For most grammars, this code can be heavily optimized by peephole filtering. At least part of all three descriptions needs to be enforced if there are shared structures in the input and output of the lexical rule, in order to link this to the information in the lexical entry.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="65" end_page="65" type="metho">
    <SectionTitle>
5 Example
</SectionTitle>
    <Paragraph position="0"> An example derivation is given in Figure 1 which uses these grammar rules: The rules s and vp are chain rules, as evidenced by their possession of a semantic head; sent is a non-chain rule. Processing proceeds in alphabetical order of the labels. Arrows show the direction of control-flow between the mother and daughters of a rule. Given the input feature structure shown in (a), we obtain its semantics with sem_select and unify it with that of sent's mother category to obtain the first pivot. sent's daughter, (b), must then be recursively generated. Its semantics matches that of the lexical entry for "calls," (c), which must then be linked to (b) by chain rules. The semantic head of chain rule vp matches (c), to produce a mother, (d), which must be further linked, and a non-head daughter, (e), which is recursively generated by using the lexical entry for "john." A second application of vp matches (d), again producing a mother, (f), and a non-head daughter, (g), which is recursively generated by using the lexical entry for "up." An application of chain rule s then produces a non-head daughter, (h), and a mother. This mother is linked to (b) directly by unification.</Paragraph>
  </Section>
  <Section position="7" start_page="65" end_page="67" type="metho">
    <SectionTitle>
6 Indexing
</SectionTitle>
    <Paragraph position="0"> In grammars with very large lexica, generation can be very expensive. In the case of ALE's bottom-up parser, our interaction with the lexicon was confined simply to looking up feature structures by their phonological strings; and no matter how large the lexicon was, Prolog first-argument indexing provided an adequate means of indexing by those strings. In the case of generation, we need to look up strings indexed by feature structures, which involves a much more expensive unification operation than matching strings. Given ALE's internal representation of feature structures, first-argument indexing can only help us by selecting structures of the right type, which, in the case of a theory like HPSG, is no help at all, because every lexical entry is of type word. (SNMP90) does not consider this problem, presumably because its data structures are much smaller.</Paragraph>
    <Paragraph position="1"> The same problem exists in feature-based chart parsing, too, since we need to find matching feature structure chart edges given a description in a grammar rule. In the case of HPSG, this is not quite as critical given the small number of rules the theory requires. In a grammar with a large number of rules, however, a better indexing technique must be applied to chart edges as well.</Paragraph>
    <Paragraph position="2"> The solution we adopt is to build a decision tree with features and types on the inner nodes and arcs, and code for lexical entries on the leaves. This structure can be built off-line for the entire lexicon and then traversed on-line, using a feature structure in order to avoid redundant, partially successful unification operations. Specifically, a node of the tree is labelled with a feature path in the feature structure; and the arcs emanating from a node, with the possible type values at that node's feature path.</Paragraph>
    <Paragraph position="3"> The chief concern in building this tree is deciding which feature paths should be checked, and in which order. Our method, an admittedly preliminary one, simply indexes by all feature paths which reach into the substructure(s) identified as semantics-related by sem_select/2, such that shorter paths are traversed earlier, and equally short paths are traversed alphabetically. An example tree is shown in Figure 2. After the tree is built, a number is assigned to each node and the tree is compiled into a series of Prolog predicates to be used for traversal at run-time, which are then compiled by Prolog. The INDEX:PER node in Figure 2 has the following compiled code: ... [add type 1st to V], node(8, SemFS, PivotFS, RootFS, Ws, WsRest).</Paragraph>
    <Paragraph position="4"> node(7, _, PivotFS, RootFS, Ws, WsRest) :- [add code for he to PivotFS], connect(PivotFS, RootFS, [he|SubWs], SubWs, Ws, WsRest).</Paragraph>
    <Paragraph position="5"> node(8, _, PivotFS, RootFS, Ws, WsRest) :- [add code for i to PivotFS], connect(PivotFS, RootFS, [i|SubWs], SubWs, Ws, WsRest).</Paragraph>
    <Paragraph position="6"> Each clause of a non-terminal node/6 finds the value of the current pivot at the current node's feature path, and then calls branch/3, which branches to a new node based on the type of that value. Leaf node clauses add the code for one of possibly many lexical entries. The non_chain_rule/4 clauses of Section 4.2 are then replaced by: non_chain_rule(PivotFS, RootFS, Ws, WsRest) :- solve(sem_select(PivotFS, SemFS)), node(0, SemFS, PivotFS, RootFS, Ws, WsRest).</Paragraph>
    <Paragraph position="7"> As the type check on branches is made by unification, traversal of a tree can, in general, be nondeterministic. Using ALE's internal data structure for feature structures, a check to avoid infinite loops through cyclic structures during compile-time can be made in linear time.</Paragraph>
  </Section>
</Paper>