<?xml version="1.0" standalone="yes"?> <Paper uid="P87-1014"> <Title>Functional Unification Grammar Revisited</Title> <Section position="3" start_page="0" end_page="100" type="metho"> <SectionTitle> 1.2 Passive/Active Constraint </SectionTitle> <Paragraph position="0"> Focus of attention can determine whether the passive or active voice should be used in a sentence \[8\]. The constraint dictates that focused information should appear as surface subject in the sentence. In FUG, this can be represented by one pattern indicating that focus should occur first in the sentence as shown in Figure 1. This pattern would occur in the sentence category of the grammar, since focus is a sentence constituent. This constraint is represented as part of an alternative so that other syntactic constraints can override it (e.g., if the goal were in focus but the verb could not be passivized, the constraint would not apply and an active sentence would be generated). The structure of active or passive would be indicated in the verb group as shown in Figure 2. 1 The correct choice of active or passive is made through unification of the patterns: active voice is selected if the focus is on the protagonist (focus unifies with prot) and passive if focus is on the goal or beneficiary (focus unifies with goal or benef). This representation has two desirable properties: the constraint can be stated simply and the construction of the resulting choice is expressed separately from the constraint. (alt ( (pattern (focus ...) ) ) ) In the DCG, the unification of argument variables means a single rule can state that focus should occur first in the sentence. However, the rules specifying construction of the passive and active verb phrases must now depend on which role (protagonist, goal, or beneficiary) is in focus.</Paragraph> <Paragraph position="1"> This requires three separate rules, one of which will be chosen depending on which of the three other case roles is the same as the value for focus.
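The voice choice described above can be sketched procedurally. This is an illustrative sketch only, not the paper's FUG or DCG code; the dictionary keys follow the case-frame input formalism the paper describes (prot, goal, benef, verb, focus), but the function name and return convention are assumptions.

```python
# Hypothetical sketch: select active vs. passive voice by checking
# which case role the focus matches ("unifies with"), mirroring the
# FUG patterns of Figures 1 and 2.

def choose_voice(proposition):
    """Return (voice, constituent ordering) for a case-frame dict
    with keys 'prot', 'goal', 'benef', 'verb', and 'focus'."""
    focus = proposition["focus"]
    if focus == proposition.get("prot"):
        # focus unifies with prot -> active: prot verb goal (to benef)
        return ("active", ["prot", "verb", "goal", "benef"])
    if focus == proposition.get("goal"):
        # focus unifies with goal -> passive: goal verb (to benef) (by prot)
        return ("passive", ["goal", "verb", "benef", "prot"])
    if focus == proposition.get("benef"):
        # focus unifies with benef -> passive: benef verb goal (by prot)
        return ("passive", ["benef", "verb", "goal", "prot"])
    # Fallback when no role matches (e.g., verb cannot be passivized):
    return ("active", ["prot", "verb", "goal", "benef"])

voice, order = choose_voice(
    {"prot": "girl", "goal": "cat", "verb": "pet", "benef": None,
     "focus": "cat"})
```

Note that, unlike the FUG version, this procedural sketch mixes the focus constraint with the construction of the result, which is exactly the duplication problem the section attributes to the DCG.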
The DCG representation thus mixes information from the constraint, focus of attention, with the passive/active construction, duplicating it over three rules. 1This figure shows only the ordering of constituents for active and passive voice and does not include other details of the construction.</Paragraph> <Paragraph position="2"> The sentence rule is shown in Figure 3 and the three other rules are presented in Figure 4. The constituents of the proposition are represented as variables of a clause. In Figure 4, the arguments, in order, are verb (V), protagonist (PR), goal (G), beneficiary (B), and focus. The arguments with the same variable name must be equal. Hence, in the Figure, focus of the clause must be equal to the protagonist.</Paragraph> <Section position="1" start_page="97" end_page="99" type="sub_section"> <SectionTitle> 1.3 Focus Shift Constraint </SectionTitle> <Paragraph position="0"> This constraint, identified and formalized by Derr and McKeown \[3\], constrains simple and complex sentence generation. Any generation system that generates texts and not just sentences must determine when to generate a sequence of simple sentences and when to combine simple sentences to form a more complex sentence. Derr and McKeown noted that when a speaker wants to focus on a single concept over a sequence of sentences, additional information may need to be presented about some other concept. In such a case, the speaker will make a temporary digression to the other concept, but will immediately continue to focus on the first.
To signal that focus does not shift, the speaker can use subordinate sentence structure when presenting additional information.</Paragraph> <Paragraph position="1"> The focus constraint can be stated formally as follows: assume input of three propositions, P1, P2, and P3 with</Paragraph> <Paragraph position="3"> * verb_phrase (pred (V, NEG, T, AUX), PR, G, B, PR) -->verb (V, NEG, T, AUX, N, active), nplist (G), pp (to, B).</Paragraph> <Paragraph position="4"> verb_phrase (pred (V, NEG, T, AUX), PR, G, B, G) -->verb (V, NEG, T, AUX, N, passive), pp (to, B), pp (by, PR).</Paragraph> <Paragraph position="5"> verb_phrase (pred (V, NEG, T, AUX), PR, G, B, B) -->verb (V, NEG, T, AUX, N, passive), nplist (G), pp (by, PR).</Paragraph> <Paragraph position="6"> arguments indicating focus F1, F2, and F3. 2 The constraint states that if F1 = F3, F1 does not equal F2 and F2 is a constituent of P1, the generator should produce a complex sentence consisting of P1 as main sentence with P2 subordinated to it through P2's focus, followed by a second sentence consisting of P3. In FUG, this constraint can be stated in three parts, separately from other syntactic rules that will apply: 1. Test that focus remains the same from P1 to P3.</Paragraph> <Paragraph position="7"> 2. Test that focus changes from P1 to P2 and that the focus of P2 is some constituent of P1. 3. If focus does shift, form a new constituent, a complex sentence formed from P1 and P2, and order it to occur before P3 in the output (order is specified by patterns in FUG).</Paragraph> <Paragraph position="8"> Figure 5 presents the constraint, while Figure 6 shows the construction of the complex sentence from P1 and P2. Unification and paths simplify the representation of the constraint. Paths, indicated by angle brackets (<>), allow the grammar to point to the value of other constituents.
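The three-part test above can be restated as a short predicate. This is a minimal sketch under assumed names (the paper's actual statement is the FUG grammar of Figure 5 and the DCG rules of Figure 8); the example propositions echo the cat/girl example used later in the section.

```python
# Illustrative sketch of the focus shift constraint: subordinate P2
# to P1 exactly when focus returns to F1 in P3, shifts away in P2,
# and P2's focus is a constituent (prot, goal, or benef) of P1.

def should_subordinate(p1, p2, p3):
    f1, f2, f3 = p1["focus"], p2["focus"], p3["focus"]
    constituents_of_p1 = {p1.get("prot"), p1.get("goal"), p1.get("benef")}
    return f1 == f3 and f1 != f2 and f2 in constituents_of_p1

p1 = {"prot": "girl", "goal": "cat", "verb": "pet", "focus": "cat"}
p2 = {"prot": "girl", "goal": "cat", "verb": "bring", "focus": "girl"}
p3 = {"prot": "cat", "verb": "purr", "focus": "cat"}
# e.g. "the cat was petted by the girl that brought it. the cat purred"
```

When the predicate holds, the generator would build the complex sentence from P1 and P2 and order it before P3, as Part 3 of the constraint dictates.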
Paths and unification are used in conjunction in Part 1 of Figure 5 to state that the value of focus of P1 should unify with the 2In the systems we are describing, input is specified in a case frame formalism, with each proposition indicating protagonist (prot), goal, beneficiary (benef), verb, and focus. In these systems, lexical choice is made before entering the grammar, thus each of these arguments includes the word to be used in the sentence.</Paragraph> <Paragraph position="9"> (alt % Is focus the same in P1 and P3? 1.((P1 ((focus <^ P3 focus>))) % Does not apply if focus % stays the same 2. (alt (((P1 ((focus <^ P2 focus>)))) % Form new constituent from P1 % and P2 and order before P3.</Paragraph> <Paragraph position="10"> 3. (pattern (P1P2subord P3) ) (P3 (cat s) ) % New constituent is of category % subordinate.</Paragraph> <Paragraph position="11"> (P1P2subord % Place P2 focus into % subordinate as it will % be head of relative clause.</Paragraph> <Paragraph position="13"> value of focus of P3 (i.e., these two values should be equal). 3 Unification also allows for structure to be built in the grammar and added to the input. In Part 3, a new constituent P1P2subord is built. The full structure will result from unifying P1P2subord with the category subordinate, in which the syntactic structure is represented. The grammar for this category is shown in Figure 6. It constructs a relative clause 4 from P2 and attaches it to the constituent in P1 to which focus shifts in P2. Figure 7 shows the form of input required for this constraint and the output that would be produced.</Paragraph> <Paragraph position="14"> 3A path such as (focus <P3 focus>) determines the value for focus by searching for an attribute P3 in the list of attributes (or Functional Description (FD)) in which focus occurs. The value of P3's focus is then copied in as the value of focus. In order to refer to attributes at any level in the tree formed by the nested set of FDs, the formalism includes an up-arrow (^).
For example, given the attribute value pair (attr1 <^ attr2 attr3>), the up-arrow indicates that the system should look for attr2 in the FD containing the FD of attr1. Since P3 occurs in the FD containing P1, an up-arrow is used to specify that the system should look for the attribute P3 in the FD containing P1 (i.e., one level up). More up-arrows can be used if the first attribute in the path occurs in an even higher level FD.</Paragraph> <Paragraph position="15"> 4The entire grammar for relative clauses is not shown. In particular, it would have to add a relative pronoun to the input.</Paragraph> <Paragraph position="16"> In the DCG formalism, the constraint is divided between a rule and a test on the rule. The rule dictates that focus remain the same from P1 to P3 and that P2's focus be a constituent of P1, while the test states that P2's focus must not equal P1's. Second, because the DCG is essentially a context free formalism, a duplication of rules for three different cases of the construction is required, depending on whether focus in P2 shifts to protagonist, goal or beneficiary of P1. Figure 8 shows the three rules needed. Each rule takes as input three clauses (the first three clauses listed) and produces as output a clause (the last listed) that combines P1 and P2. The test for the equality of foci in P1 and P3 is done through PROLOG unification of variables. As in the previous DCG example, arguments with the same variable name must be equal. Hence, in the first rule, focus of the third clause (F1) must be equal to focus of the first clause (also F1). The shift in focus from P1 to P2 is specified as a condition (in curly brackets {}).
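The path-with-up-arrow mechanism described in the footnotes above can be sketched as a small resolver. All names here are assumptions for illustration; FDs are modeled as nested dicts and a path as a list of tokens, with each leading `^` moving one FD up before attribute names are followed downward.

```python
# Illustrative sketch of resolving a FUG path like (focus <^ P3 focus>):
# each '^' pops one level of FD nesting, then the remaining attribute
# names are followed downward to the value.

def resolve_path(stack, path):
    """Resolve `path` against `stack`, a list of nested FDs (dicts)
    from outermost FD down to the FD containing the path."""
    level = len(stack) - 1
    i = 0
    while i < len(path) and path[i] == "^":
        level -= 1          # one up-arrow: move to the containing FD
        i += 1
    fd = stack[level]
    for attr in path[i:]:   # then follow attribute names downward
        fd = fd[attr]
    return fd

# P1 and P3 are siblings, so from inside P1 one up-arrow reaches the
# FD in which P3 can be found (i.e., one level up):
outer = {"P1": {"focus": "cat"}, "P3": {"focus": "cat"}}
value = resolve_path([outer, outer["P1"]], ["^", "P3", "focus"])
```

Stating the constraint as path equality, as Figure 5 does, is what lets a single FUG rule cover all three DCG cases.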
The condition in the first rule of Figure 8 states that the focus of the second clause (PR1) must not be the same as the focus of the first clause (F1).</Paragraph> <Paragraph position="17"> Note that the rules shown in Figure 8 represent primarily the constraint (i.e., the equivalent of Figure 5).</Paragraph> <Paragraph position="18"> The building of structure, dictating how to construct the relative clause from P2, is not shown, although these rules do show where to attach the relative clause. Second, note that the constraint must be duplicated for each case where focus can shift (i.e., whether it shifts to prot, goal or beneficiary).</Paragraph> </Section> <Section position="2" start_page="99" end_page="100" type="sub_section"> <SectionTitle> 1.4 Comparisons With Other Generation System Grammars </SectionTitle> <Paragraph position="0"> The DCG's duplication of rules and constraints in the examples given above results from the mechanisms provided in DCG for representing constraints. Constraints on constituent ordering and structure are usually expressed in the context free portion of the grammar; that is, in the left and right hand sides of rules. Constraints on when the context free rules should apply are usually expressed as tests on the rules. For generation, such constraints include pragmatic constraints on free syntactic choice as well as any context sensitive constraints. When pragmatic constraints apply to more than one ordering constraint on constituents, this necessarily means that the constraints must be duplicated over the rules to which they apply. Since DCG allows for some constraints to be represented through the unification of variables, this can reduce the amount of duplication somewhat.</Paragraph> <Paragraph position="1"> FUG allows pragmatic constraints to be represented as meta-rules which are applied to syntactic rules expressing ordering constraints through the process of unification.
This is similar to Chomsky's \[2\] use of movement and focus rules to transform the output of context free rules in order to avoid rule duplication. It may be possible to factor out constraints and represent them as meta-rules in a DCG, but this would involve a non-standard implementation of the DCG (for example, compilation of the DCG to another grammar formalism which is capable of representing constraints as meta-rules).</Paragraph> <Paragraph position="2"> /* Focus of P2 is protagonist of P1 (PR1) Example: the cat was petted by the girl that brought it. the cat purred */ foc_shift (clause (V1, PR1, G1, B1, F1), clause (V2, PR2, G2, B2, PR1), clause (V3, PR3, G3, B3, F1), Other grammar formalisms that express constraints through tests on rules also have the same problem with rule duplication, sometimes even more severely. The use of a simple augmented context free grammar for generation, as implemented for example in a bottom-up parser or an augmented transition network, will require even more duplication of constraints because it is lacking the unification of variables that the DCG includes. For example, in a bottom-up generator implemented for word algebra problem generation by Ment \[10\], constraints on wording of the problem are expressed as tests on context free rules and natural language output is generated through actions on the rules. Since Ment controls the linguistic difficulty of the generated word algebra problem as well as the algebraic difficulty, his constraints determine when to generate particular syntactic constructions that increase wording difficulty. In the bottom-up generator, one such instructional constraint must be duplicated over six different syntactic rules, while in FUG it could be expressed as a single constraint.
Ment's work points to interesting ways instructional constraints interact as well, further complicating the problem of clearly representing constraints.</Paragraph> <Paragraph position="3"> In systemic grammars, such as NIGEL \[6\], each choice point in the grammar is represented as a system. The choice made by a single system often determines how choice is made by other systems, and this causes an interdependence among the systems. The grammar of English thus forms a hierarchy of systems where each branch point is a choice. For example, in the part of the grammar devoted to clauses, one of the first branch points in the grammar would determine the voice of the sentence to be generated.</Paragraph> <Paragraph position="4"> Depending on the choice for sentence voice, other choices for overall sentence structure would be made. Constraints on choice are expressed as LISP functions called choosers at each branch point in the grammar. Typically a different chooser is written for each system of the grammar. Choosers invoke functions called inquiry operators to make tests determining choice. Inquiry operators are the primitive functions representing constraints and are not duplicated in the grammar. Calls to inquiry operators from different choosers, however, may be duplicated. Since choosers are associated with individual syntactic choices, duplication of calls is in some ways similar to duplication in augmented context free grammars. On the other hand, since choice is given an explicit representation and is captured in a single type of rule called a system, representation of constraints is made clearer. This is in contrast to a DCG where constraints can be distributed over the grammar, sometimes represented in tests on rules and sometimes represented in the rule itself.
The systemic grammar's use of features and functional categories as opposed to purely syntactic categories is another way in which it, like FUG, avoids duplication of rules.</Paragraph> <Paragraph position="5"> It is unclear from published reports how constraints are represented in MUMBLE \[7\]. Rubinoff \[16\] states that constraints are local in MUMBLE, and thus we suspect that they would have to be duplicated, but this can only be verified by inspection of the actual grammar.</Paragraph> </Section> </Section> <Section position="4" start_page="100" end_page="101" type="metho"> <SectionTitle> 2 Improved Efficiency </SectionTitle> <Paragraph position="0"> Our implementation of FUG is a reworked version of the tactical component for TEXT \[9\] and is implemented in PSL on an IBM 4381 as the tactical component for the TAILOR system \[11; 12\]. TAILOR's FUG took 2 minutes and 10 seconds of real time to process the 57 sentences from the appendix of TEXT examples in \[9\] (or 117 seconds of CPU time). This is an average of 2.3 seconds real time per sentence, while TEXT's FUG took, in some cases, 5 minutes per sentence. 5 This compares quite favorably with Rubinoff's adaptation \[16\] of MUMBLE \[7\] for TEXT's strategic component. Rubinoff's MUMBLE could process all 57 sentences in the appendix of TEXT examples in 5 minutes, yielding an average of 5 seconds per sentence.</Paragraph> <Paragraph position="1"> 5We use real times for our comparisons in order to make an analogy with Rubinoff \[16\], who also used real times.</Paragraph> <Paragraph position="2"> Thus our new implementation results in yet a better speed-up (130 times faster) than Rubinoff's claimed 60 fold speed-up of the TEXT tactical component.</Paragraph> <Paragraph position="3"> Note, however, that Rubinoff's comparison is not at all a fair one.
First, Rubinoff's comparisons were done in real times, which are dependent on machine loads for time-sharing machines such as the VAX-780, while Symbolics real time is essentially the same as CPU time since it is a single user workstation. Average CPU time per sentence in TEXT is 125 seconds. 6 This makes Rubinoff's system only 25 times faster than TEXT. Second, his system runs on a Symbolics 3600 in Zetalisp, while the original TEXT tactical component ran in Franz Lisp on a VAX 780. Using Gabriel's benchmarks \[4\] for Boyer's theorem proving unification based program, which ran at 166.30 seconds in Franz Lisp on a VAX 780 and at 14.92 seconds in Symbolics 3600 Common Lisp, we see that switching machines alone yields an 11 fold speed-up. This means Rubinoff's system is actually only 2.3 times faster than TEXT.</Paragraph> <Paragraph position="4"> Of course, this means our computation of a 130 fold speed-up in the new implementation is also exaggerated since it was computed using real time on a faster machine too. Gabriel's benchmarks are not available for PSL on the IBM 4381, 7 but we are able to make a fair comparison of the two implementations since we have both the old and new versions of FUG running in PSL on the IBM. Using CPU times, the new version proves to be 3.5 times faster than the old tactical component. 8 Regardless of the actual amount of speed-up achieved, our new version of FUG is able to achieve similar speeds to MUMBLE on the same input, despite the fact that FUG uses a non-deterministic algorithm and MUMBLE uses a deterministic approach. Second, regardless of comparisons between systems, an average of 2.3 seconds real time per sentence is quite acceptable for a practical generation system.</Paragraph> <Paragraph position="5"> We were able to achieve the speed-up in our new version of FUG by making relatively simple changes in the unification algorithm.
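The timing arithmetic quoted in the paragraphs above can be restated as a check; all figures are the section's own, and small differences from the quoted ratios come from rounding in the text.

```python
# Spelling out the section's speed-up arithmetic with its own numbers.

machine_factor = 166.30 / 14.92        # Boyer benchmark: Franz Lisp on a
                                       # VAX 780 vs. Symbolics 3600 (~11x)
rubinoff_vs_text_cpu = 125 / 5         # CPU seconds/sentence: TEXT (125)
                                       # vs. Rubinoff's MUMBLE (~5) -> 25x
rubinoff_adjusted = rubinoff_vs_text_cpu / machine_factor  # ~2.3x
tailor_vs_old_cpu = 410 / 117          # old vs. new FUG, same machine,
                                       # CPU time -> ~3.5x
```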
The first change involved immediately selecting the correct category for unification from the grammar whenever possible. Since the grammar is represented as a list of possible syntactic categories, the first stage in unification involves selecting the correct category to unify with the input. On first invoking the unifier, this means selecting the sentence level category and on unifying each constituent of the input with the grammar, this means selecting the category of the constituent. In the old grammar, each category was unified successively until the correct one was found. In the current implementation, we retrieve the correct category immediately and begin 6This was computed using TEXT's appendix where CPU time is given in units corresponding to 1/60 second.</Paragraph> <Paragraph position="6"> 7Gabriel's benchmarks are available only for much larger IBM mainframes.</Paragraph> <Paragraph position="7"> 8The new version took 117 CPU seconds to process all sentences, or 2 CPU seconds per sentence, while the old version took 410 CPU seconds to process all sentences, or 7 CPU seconds per sentence.</Paragraph> <Paragraph position="8"> unification directly with the correct category. Although unification would fail immediately in the old version, directly retrieving the category saves a number of recursive calls.</Paragraph> <Paragraph position="9"> Unification with the lexicon uses the same technique in the new version. The correct lexical item is directly retrieved from the grammar for unification, rather than unifying with each entry in the lexicon successively.</Paragraph> <Paragraph position="10"> Another change involved the generation of only one sentence for a given input. Although the grammar is often capable of generating more than one possible sentence for its input 9, in practice, only one output sentence is desired. In the old version of the unifier, all possible output sentences were generated and one was selected.
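The two changes described above can be sketched as follows. This is an illustrative sketch with assumed names, not the PSL implementation: the first part replaces a successive scan over categories with a one-time index and direct retrieval, and the second stops at the first successful result instead of generating every possibility and picking one.

```python
# Sketch of the optimizations: direct category lookup and
# first-success generation.

grammar = [("s", "<s rules>"), ("np", "<np rules>"), ("vp", "<vp rules>")]

def lookup_linear(cat):
    # old style: try each category successively until one matches,
    # paying for every failed attempt along the way
    for name, rules in grammar:
        if name == cat:
            return rules
    return None

grammar_index = dict(grammar)  # new style: index once, retrieve directly

def first_success(candidates, try_one):
    # old style computed every successful result and then chose one;
    # returning at the first success avoids the wasted work
    for c in candidates:
        result = try_one(c)
        if result is not None:
            return result
    return None
```

The same direct-retrieval idea applies to the lexicon lookup mentioned above.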
In the new version, only one successful sentence is actually generated.</Paragraph> <Paragraph position="11"> Finally, other minor changes were made to avoid recursive calls that would result in failure. Our point in enumerating these changes is to show that they are extremely simple. Considerably more speed-up is likely possible if further implementation were done. In fact, we recently received from ISI a version of the FUG unifier which was completely rewritten from our original code by Jay Myers. It generates about 6 sentences per second on the average in Symbolics Common Lisp. Both of these implementations demonstrate that unification for FUG can be done efficiently.</Paragraph> </Section> </Paper>