<?xml version="1.0" standalone="yes"?> <Paper uid="P98-2227"> <Title>Head-Driven Generation with HPSG</Title> <Section position="4" start_page="0" end_page="1393" type="metho"> <SectionTitle> 2 Head-Driven Generation </SectionTitle> <Paragraph position="0"> We assume that generation starts from logical forms, which may be represented for HPSG as typed feature structures. Logical form is not a separate linguistic level in HPSG, but is equated with semantic content.</Paragraph> <Paragraph position="1"> In this section, we take the starting logical form for generation to be a semantic feature structure which will be identical to the CONTENT feature of the top-level HPSG sign to be generated.</Paragraph> <Section position="1" start_page="0" end_page="1393" type="sub_section"> <SectionTitle> 2.1 Semantic heads </SectionTitle> <Paragraph position="0"> Head-driven generation algorithms are based on the idea that most grammar rules have a semantic head daughter whose logical form is identical to the logical form of the mother. The bottom-up generation (BUG) algorithm of van Noord (1990) requires every rule to have such a head (except lexical entries). The semantic head-driven (SHD) algorithm of Shieber et al. (1990) relaxes this, dividing rules into chain rules with such a head (processed bottom-up), and non-chain rules (processed top-down). The chart-based semantic head-driven (CSHD) algorithm of Haruno et al. (1996) increases efficiency by using a chart to eliminate recomputation of partial results.</Paragraph> <Paragraph position="1"> Head-driven bottom-up generation is efficient as it is geared both to the input logical form (head-driven) and to lexical information (bottom-up). It is good for HPSG, which is highly lexicalist and has a clear definition of semantic head: in head-adjunct phrases, the adjunct daughter is the semantic head; in other headed phrases, the syntactic head daughter is the semantic head. 
In both cases, the Semantics Principle basically requires the content of the semantic head to be identical to the content of the mother. If we ignore coordinate structures, and if we equate logical form with semantic content for now, then all HPSG grammar rules are SHD chain rules, meeting the requirement of the BUG algorithm.</Paragraph> </Section> <Section position="2" start_page="1393" end_page="1393" type="sub_section"> <SectionTitle> 2.2 HPSG in ProFIT </SectionTitle> <Paragraph position="0"> ProFIT: Prolog with Features, Inheritance and Templates (Erbach, 1995) is an extension of Prolog which supports inheritance-based typed feature structures.</Paragraph> <Paragraph position="1"> The type hierarchy is declared in a signature, which defines the subtypes and appropriate features of every type. Terms with typed feature structures can then be used alongside normal terms. Using the signature declarations, the ProFIT system compiles the typed feature structures into normal Prolog terms, which can be compiled by the Prolog system.</Paragraph> <Paragraph position="2"> Figure 1 shows some implementation details. We use ProFIT templates (defined by ':=') for principles such as the Head Feature Principle ('HFP') and Semantics Principle ('SemP'). Templates are expanded where they are invoked (by @'HFP' or @'SemP'). The type hierarchy includes the phrase type hierarchy of Sag (1997). As ProFIT does not support dynamic constraints, we use templates to specify phrasal constraints. For example, for head-nexus phrases, the hd_nexus_ph template specifies the <hd_nexus_ph type, invokes general constraints on headed phrases (such as HFP) by @hd_ph, and invokes the Semantics Principle by @'SemP'.</Paragraph> <Paragraph position="3"> Immediate dominance schemata are implemented as PSG rules, using the schematic categories word and phrase, not traditional categories (NP, VP etc). To simplify the generator, the semantic head is first in the list of daughters. 
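To make the chain-rule machinery concrete, here is a toy Python sketch of bottom-up, semantic-head-driven generation in the spirit of the BUG algorithm. Everything here is invented for the illustration: the actual BUG1 is written in Prolog and works by unification over feature terms, whereas this sketch uses flat tuples for logical forms and a hand-written lexicon and rule table.

```python
# Toy sketch of bottom-up, semantic-head-driven (BUG-style) generation.
# Data formats (LEXICON, CHAIN_RULES, tuple logical forms) are invented
# for this illustration; BUG1 itself works on Prolog terms.

# Lexicon: word -> (category, logical form).
LEXICON = {
    "she": ("np", "she"),
    "Kim": ("np", "kim"),
    "saw": ("v", ("see", "she", "kim")),
}

# Chain rules: the mother shares the logical form of its semantic head
# daughter. Each rule names the complement's category, which argument of
# the head's logical form it realizes, and on which side it is placed.
CHAIN_RULES = [
    # (mother, head, comp_cat, lf_arg_index, comp_side)
    ("vp", "v", "np", 2, "right"),   # Head-Complements: VP -> V NP
    ("s", "vp", "np", 1, "left"),    # Head-Subject:     S  -> NP VP
]

def predict_word(lf):
    """Find a lexical entry whose logical form matches the goal
    (the pivot word, e.g. the main verb for a sentential goal)."""
    for word, (cat, word_lf) in LEXICON.items():
        if word_lf == lf:
            return cat, [word]
    raise ValueError("no lexical semantic head for %r" % (lf,))

def connect(cat, phon, goal_cat, lf):
    """Climb chain rules from the predicted head up to the goal,
    generating each complement from its argument slot of the LF."""
    if cat == goal_cat:
        return phon
    for mother, head, comp_cat, arg, side in CHAIN_RULES:
        if head == cat:
            comp_phon = generate(comp_cat, lf[arg])
            phon = comp_phon + phon if side == "left" else phon + comp_phon
            return connect(mother, phon, goal_cat, lf)
    raise ValueError("cannot connect %s up to %s" % (cat, goal_cat))

def generate(goal_cat, lf):
    """BUG-style generation: predict the head word bottom-up, connect up."""
    cat, phon = predict_word(lf)
    return connect(cat, phon, goal_cat, lf)

print(generate("s", ("see", "she", "kim")))   # ['she', 'saw', 'Kim']
```

Note how the goal logical form is threaded unchanged up the chain: every mother has the same LF as its head daughter, which is exactly the property the following sections put under pressure.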
Linear precedence is specified by the PHON strings, implemented as Prolog difference lists. Example rules for the Head-Subject and Head-Complements Schemata are shown in Figure 1.</Paragraph> </Section> <Section position="3" start_page="1393" end_page="1393" type="sub_section"> <SectionTitle> 2.3 HPSG Interface for BUG1 </SectionTitle> <Paragraph position="0"> van Noord (1990) implements the BUG algorithm as BUG1 in Prolog. For HPSG, we add the ProFIT interface in Figure 2. Templates identify the head features (HF) and logical form (LF), and keep the algorithm independent of HPSG-internal details.</Paragraph> <Paragraph position="1"> Note that link, used by van Noord (1990) to improve the efficiency of the algorithm, is replaced by the HPSG Head Feature Principle.</Paragraph> <Paragraph position="3"> (From the BUG1 code in Figure 2:) predict_word(Node, Small), connect(Small, Node). connect(Node, Node).</Paragraph> </Section> </Section> <Section position="5" start_page="1393" end_page="1395" type="metho"> <SectionTitle> 3 Quantifiers and Context </SectionTitle> <Paragraph position="0"> Head-driven generation as in Section 2 works fine if the semantics is strictly head-driven. All semantic information must be inside the CONTENT feature, and cannot be distributed in other features such as QSTORE or BACKGR. When an NP is assigned to the semantic role of a verb, the whole of the NP's CONTENT must be assigned, not only its INDEX. This differs significantly from HPSG theory.</Paragraph> <Section position="1" start_page="1393" end_page="1393" type="sub_section"> <SectionTitle> 3.1 Quantifier Storage and Retrieval </SectionTitle> <Paragraph position="0"> There is a complication in Pollard and Sag (1994) caused by the use of Cooper storage to handle scope ambiguities. While scoped quantifiers are included in the QUANTS list within CONTENT, unscoped quantifiers are stored in the QSTORE set outside CONTENT. 
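The storage/retrieval split can be pictured with a minimal sketch: scoped quantifiers sit on the QUANTS list inside CONTENT, unscoped ones in a QSTORE set outside it, and retrieval moves a quantifier from one to the other. The dict-based sign representation and the string "quantifiers" below are invented for this illustration, not ProFIT or HPSG notation.

```python
# Minimal sketch of Cooper storage as described in the text: scoped
# quantifiers live on the QUANTS list inside CONTENT, unscoped ones in
# a QSTORE set outside CONTENT. Representation invented for this sketch.

def make_sign(nucleus, qstore):
    """A sign whose quantifiers are all still in storage."""
    return {
        "content": {"quants": [], "nucleus": nucleus},
        "qstore": set(qstore),
    }

def retrieve(sign, quantifier):
    """Retrieve one stored quantifier at this node: it moves from the
    QSTORE set into the QUANTS list, taking scope here."""
    assert quantifier in sign["qstore"]
    return {
        "content": {
            "quants": [quantifier] + sign["content"]["quants"],
            "nucleus": sign["content"]["nucleus"],
        },
        "qstore": sign["qstore"] - {quantifier},
    }

s = make_sign(nucleus="see(x, y)", qstore={"every(x)", "a(y)"})
s = retrieve(s, "every(x)")
# After retrieval, CONTENT alone no longer determines the full logical
# form: "a(y)" is still outside CONTENT, in QSTORE.
print(s["content"]["quants"])   # ['every(x)']
print(s["qstore"])              # {'a(y)'}
```

The final state shows why a CONTENT-only logical form is incomplete as a generation goal: part of the meaning ("a(y)") lives outside CONTENT until it is retrieved.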
So logical form for generation needs to include QSTORE as well as CONTENT.</Paragraph> <Paragraph position="1"> In this approach, a quantifier may be retrieved at any suitable syntactic node. A quantifier retrieved at a particular node is a member of the QSTORE set (but not the QUANTS list) of some daughter of that node. Due to the retrieval it is a member of the QUANTS list (but not the QSTORE set) of the mother node. Pollard and Sag (1994) define a modified Semantics Principle to cater for this, but the effect of retrieval on QSTORE and QUANTS means that the mother and the semantic head daughter must have different logical forms. The daughter is the semantic head by the HPSG definition, but not as required by the generation algorithm.</Paragraph> </Section> <Section position="2" start_page="1393" end_page="1395" type="sub_section"> <SectionTitle> 3.2 Contextual Background </SectionTitle> <Paragraph position="0"> In addition to semantic content, natural language generation requires presuppositions and other pragmatic and discourse factors. In HPSG, such factors are part of CONTEXT. To specify these factors for generation, the usual approach is to include them in the logical form. So logical form needs to include CONTEXT as well as CONTENT and QSTORE.</Paragraph> <Paragraph position="1"> This extended logical form is defined for BUG1 by replacing the ProFIT template for 'lf(LF)' shown in Figure 2 with the new template in Figure 4.</Paragraph> <Paragraph position="3"> However, head-driven generation does not work with this inclusive logical form, given the theory of Pollard and Sag (1994). Even if we ignore quantifier retrieval and look at a very simple sentence, there is a fundamental difficulty with CONTEXT.</Paragraph> <Paragraph position="4"> Figure 3, from Wilcock (1997), shows the HPSG analysis of she saw Kim. 
Note that she has a non-empty BACKGR set (shown by tag [ ]), stating a pragmatic requirement that the referent is female.</Paragraph> <Paragraph position="5"> This background condition is part of CONTEXT, and is passed up from NP to S by the Principle of Contextual Consistency. Similarly, Kim has a background condition (shown by tag [ ]) that the referent bears this name. This is also passed from NP to VP, and from VP to S.</Paragraph> <Paragraph position="6"> S, VP and V share the same CONTENT (shown by tag [ ]). If logical form is restricted to semantic content as in Figure 2, then V is the semantic head of VP and VP is the semantic head of S, not only in terms of the HPSG definition but also in terms of the BUG algorithm. In this case, saw can be found immediately by predict_word in BUG1.</Paragraph> <Paragraph position="7"> But if we extend logical form as in Figure 4, to include the context factors required for adequate realization, it is clear from Figure 3 that S does not have the same logical form as VP, and VP does not have the same logical form as V, as their BACKGR sets differ. Therefore, although V is still the semantic head of VP according to the HPSG definition, it is not the semantic head according to the BUG algorithm. Similarly, VP is still the semantic head of S for HPSG, but it is not the semantic head for BUG. In this case, predict_word cannot find any semantic head word in the lexicon, and BUG1 cannot generate the sentence.</Paragraph> </Section> </Section> <Section position="6" start_page="1395" end_page="1396" type="metho"> <SectionTitle> 4 Revising the Grammar </SectionTitle> <Paragraph position="0"> If we include unscoped quantifiers and contextual background in logical form, we see that there are two different definitions of &quot;semantic head&quot;: the HPSG definition based on the adjunct daughter or syntactic head daughter, and the BUG algorithm definition based on identity of logical forms. 
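The divergence between the two definitions can be checked mechanically. The sketch below is a toy reconstruction of the she saw Kim configuration: the condition names ("female(x)", "named(y, Kim)") and the per-node BACKGR table are invented stand-ins for the sets in Figure 3.

```python
# Toy check of the mismatch described above: with logical form
# restricted to CONTENT, V, VP and S share one logical form, but once
# BACKGR conditions are included, all three logical forms differ.
# Condition names and the per-node table are invented for this sketch.

content = "see(she, kim)"            # CONTENT shared by V, VP and S

# BACKGR sets accumulated under the Principle of Contextual
# Consistency: 'saw' contributes none; the NPs add theirs on the way up.
backgr = {
    "v":  frozenset(),
    "vp": frozenset({"named(y, Kim)"}),
    "s":  frozenset({"named(y, Kim)", "female(x)"}),
}

def lf_content_only(node):
    """Logical form as in Figure 2: semantic content only."""
    return content

def lf_extended(node):
    """Logical form as in Figure 4: content plus context factors."""
    return (content, backgr[node])

# CONTENT-only: head identity holds, so BUG-style generation works.
assert lf_content_only("v") == lf_content_only("vp") == lf_content_only("s")

# Extended: no daughter's logical form equals its mother's, so the BUG
# algorithm finds no semantic head and generation fails.
assert lf_extended("v") != lf_extended("vp")
assert lf_extended("vp") != lf_extended("s")
print("head identity holds only for CONTENT-only logical forms")
```

This is exactly the conflict the revised grammar of Section 4 is designed to remove: make the extended logical forms identical again by routing the context factors through the head.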
However, recent proposals for changes in HPSG theory suggest that the two notions of semantic head can be brought back together.</Paragraph> <Section position="1" start_page="1395" end_page="1395" type="sub_section"> <SectionTitle> 4.1 Lexical amalgamation in HPSG </SectionTitle> <Paragraph position="0"> In Pollard and Sag (1994), QSTORE and BACKGR sets are phrasally amalgamated. The Quantifier Inheritance Principle requires a phrase's QSTORE to be the set union of the QSTOREs of all daughters, minus any quantifiers in the phrase's RETRIEVED list. The Principle of Contextual Consistency requires a phrase's BACKGR to be the set union of the BACKGR sets of all the daughters.</Paragraph> <Paragraph position="1"> It has recently been proposed that these sets should be lexically amalgamated. A syntactic head word's arguments are now lexically specified in its ARGUMENT-STRUCTURE list. The word's set-valued features can therefore be defined in terms of the amalgamation of the set-valued features of its arguments.</Paragraph> <Paragraph position="2"> Lexical amalgamation of quantifier storage was proposed by Pollard and Yoo (1995). They change QSTORE into a local feature which can be included in the features subcategorized for by a lexical head, and can therefore be lexically amalgamated in the head. A phrase no longer inherits unscoped quantifiers directly from all daughters; instead they are inherited indirectly via the semantic head daughter. Lexical amalgamation of CONTEXT, proposed by Wilcock (1997), follows the same approach. As CONTEXT is a local feature, it can be subcategorized for by a head word and lexically amalgamated in the head by means of a BACKGR amalgamation constraint. 
Instead of a phrase inheriting BACKGR conditions directly from all daughters by the Principle of Contextual Consistency, they are inherited indirectly via the &quot;contextual head&quot; daughter, which is the same as the semantic head daughter.</Paragraph> </Section> <Section position="2" start_page="1395" end_page="1396" type="sub_section"> <SectionTitle> 4.2 Lexical amalgamation in ProFIT </SectionTitle> <Paragraph position="0"> In the ProFIT implementation, QSTORE sets and BACKGR sets are Prolog difference lists. Lexical amalgamation of both sets is shown in Figure 5, the lexical entry for the verb &quot;saw&quot;. The subject's BACKGR set B0-B1 and the object's BACKGR set B1-BN are amalgamated in the verb's BACKGR set B0-BN. The subject and object QSTORE sets are amalgamated in the same way. The basic Semantics Principle, for semantic content only, was implemented by the ProFIT templates 'SemP' and 'SemP'(adjunct) as shown in Figure 1.</Paragraph> <Paragraph position="1"> In order to include unscoped quantifiers and background conditions in logical form, as in Figure 4, and still make it possible for the logical form of a phrase to be identical to the logical form of its semantic head, the Semantics Principle is replaced and extended. As proposed by Wilcock (1997), we need three principles: the Semantic Head Inheritance Principle (SHIP), the Quantifier Inheritance Principle (QUIP), and the Contextual Head Inheritance Principle (CHIP). These are implemented by templates as shown in Figure 6 (only the non-adjunct forms are shown). To include the three principles in the grammar, the template for hd_nexus_ph in Figure 1 is extended as shown in Figure 6.</Paragraph> <Paragraph position="3"> With these revisions, it is possible to include unscoped quantifiers and background conditions in the starting logical form, and perform head-driven generation successfully using the BUG1 generator. However, there remain various technical difficulties in this implementation. 
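The difference-list chaining of Figure 5 (B0-B1 threaded into B1-BN, yielding the verb's B0-BN) can be mimicked in Python by having each argument append its conditions to an accumulator that the verb's lexical entry owns. The condition names and the function-based "lexical entries" are invented for this sketch; the real entry is a ProFIT term, and difference lists amalgamate by variable sharing rather than mutation.

```python
# Sketch of the lexical amalgamation in Figure 5, transposed from Prolog
# difference lists into Python. In the ProFIT entry the subject's BACKGR
# B0-B1 and the object's B1-BN chain into the verb's B0-BN; here each
# argument appends its conditions to a shared accumulator instead.
# Condition names and data format are invented for this illustration.

def np_she(backgr):
    """Subject NP: contributes its background condition (B0-B1)."""
    backgr.append("female(x)")
    return {"index": "x"}

def np_kim(backgr):
    """Object NP: contributes its background condition (B1-BN)."""
    backgr.append("named(y, Kim)")
    return {"index": "y"}

def verb_saw(subj, obj):
    """Toy lexical entry for 'saw': amalgamates its arguments' BACKGR
    conditions itself, instead of leaving the union to a phrasal
    principle. The phrase then inherits the whole set via the head."""
    backgr = []                      # B0: the empty starting point
    s = subj(backgr)
    o = obj(backgr)
    return {
        "content": ("see", s["index"], o["index"]),
        "backgr": backgr,            # B0-BN: all conditions amalgamated
    }

sign = verb_saw(np_she, np_kim)
print(sign["backgr"])    # ['female(x)', 'named(y, Kim)']
print(sign["content"])   # ('see', 'x', 'y')
```

Because the head word now carries the full amalgamated set, mother and semantic head daughter can once again share one extended logical form, which is what SHIP, QUIP and CHIP enforce.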
The ProFIT system does not support either dynamic constraint checking or set-valued features. The methods shown (template expansion and difference lists) are only partial substitutes for the required facilities.</Paragraph> </Section> </Section> </Paper>