<?xml version="1.0" standalone="yes"?> <Paper uid="P87-1034"> <Title>REVISED GENERALIZED PHRASE STRUCTURE GRAMMAR</Title> <Section position="4" start_page="0" end_page="244" type="metho"> <SectionTitle> 2 Eliminating Intractability in GPSG </SectionTitle> <Paragraph position="0"> Ristad (1986a) examines the computational complexity of two components of the GPSG formal system (metarules and the feature system) and shows how each of these systems can lead to computational intractability. Ristad also proves that the universal recognition problem for GPSGs is EXP-POLY hard, and hence intractable. 2 In other words, the fastest recognition algorithm for GPSGs can take more than exponential time.</Paragraph> <Paragraph position="1"> These results may appear surprising, given GPSG's weak context-free generative power. They also raise some important computational and linguistic questions: why GPSG-Recognition is so difficult, what aspects of the GPSG formalisms cause intractability, and whether they are linguistically necessary. I begin with an outline of the GPSG formal system, as presented in Gazdar, Klein, Pullum, and Sag (1985), GKPS hereafter. Subsequently, I identify and remove the excess computational power provided by each formal device.</Paragraph> <Section position="1" start_page="0" end_page="243" type="sub_section"> <SectionTitle> 2.1 Overview of GPSG Formalisms </SectionTitle> <Paragraph position="0"> From the perspective of classic formal language theory, a GPSG may be thought of as a grammar for generating a context-free grammar. The generation process begins with immediate dominance (ID) rules, which are context-free productions with unordered right-hand sides. An important feature of ID rules is that nonterminals in the rules are not atomic symbols (for example, NP). Rather, GPSG nonterminals are sets of [feature, feature-value] pairs. 
For example, [N +] is a [feature, feature-value] pair, and the set { [N +], [V -], [BAR 2] } is the GPSG representation of a noun phrase. Next, metarules apply to the ID rules, resulting in an enlarged set of ID rules. Metarules have fixed input and output patterns containing a distinguished multiset variable W in addition to constants. If an ID rule matches the input pattern under some specialization of the variable W, then the metarule generates an ID rule corresponding to the metarule's output pattern under the same specialization of W. For example, the passive metarule VP → W, NP ⇒ VP[PAS] → W, (PP[by])</Paragraph> <Paragraph position="2"> says that &quot;for every ID rule in the grammar which permits a VP to dominate an NP and some other material, there is also a rule 2 The universal recognition problem most accurately reflects the difficulty of processing a grammatical formalism because it incorporates the grammar in the problem statement, as explained in Barton, Berwick, and Ristad (1987).</Paragraph> <Paragraph position="3"> in the grammar which permits the passive category VP[PAS] to dominate just the other material from the original rule, together (optionally) with a PP[by]&quot; (GKPS:59). In Ristad (1986a), the finite closure problem is used to determine the cost of metarule application. Principles of universal feature instantiation (UFI) apply to the resulting enlarged set of ID rules, defining a set of phrase structure trees of depth one (local trees). One principle of UFI is the head feature convention, which ensures that phrases are projected from lexical heads. Informally, the head feature convention is GPSG's X-bar theory. Ristad (1986a) uses the category membership problem to determine, in part, the cost of mapping ID rules to local trees. Finally, linear precedence (LP) statements are applied to the instantiated local trees. LP statements order the unordered daughters in the instantiated local trees. 
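The ordering step just described can be sketched in a few lines of Python. This is only an illustration and not part of the original paper: the function name `lp_acceptable_orderings`, the use of atomic strings for categories, and the encoding of an LP statement as an (a, b) pair meaning "a precedes b" are all assumptions made for the example.

```python
from itertools import permutations


def lp_acceptable_orderings(daughters, lp_statements):
    """Return every ordering of `daughters` that violates no LP statement.

    daughters: list of category labels. Atomic strings are used here for
        simplicity (an assumption); real GPSG categories are sets of
        feature-value pairs.
    lp_statements: set of (a, b) pairs, each read as "a must precede b".
    """
    acceptable = []
    # set() drops duplicate orderings when the multiset repeats a label;
    # sorted() only makes the output order deterministic.
    for order in sorted(set(permutations(daughters))):
        positions = {}
        for index, category in enumerate(order):
            positions.setdefault(category, []).append(index)
        # An LP statement only constrains an ordering in which both of its
        # categories actually occur.
        ok = all(
            min(positions[b]) > max(positions[a])
            for a, b in lp_statements
            if a in positions and b in positions
        )
        if ok:
            acceptable.append(order)
    return acceptable


# Hypothetical example: a verb that must precede both of its sisters.
print(lp_acceptable_orderings(["V", "NP", "PP"], {("V", "NP"), ("V", "PP")}))
```

With an empty set of LP statements and b distinct daughters, the function returns all b! orderings, which is the unconstrained case.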
The ultimate result, therefore, is a set of ordered local trees, and these are equivalent to the context-free productions in a context-free grammar. The resulting context-free grammar derives the language of the GPSG.</Paragraph> <Paragraph position="4"> The process of assigning structural descriptions to utterances consists of two steps in GPSG: the projection of ID rules to local trees and the derivation of utterances from nonterminals, using the local trees. Accordingly, formal devices may supply resources to either process.</Paragraph> </Section> <Section position="2" start_page="243" end_page="243" type="sub_section"> <SectionTitle> 2.2 Theory of Syntactic Features </SectionTitle> <Paragraph position="0"> In current GPSG theory, syntactic categories (nonterminals) encode linguistic relations as feature-value pairs. If a relation is true of two categories in a phrase structure tree, then the relation will be encoded in every category on the unique path between the two categories. The primary computational resource provided by the theory of syntactic features is polynomial space, primarily due to the large number of possible syntactic categories arising from finite feature closure. Ristad (1986a) observes that finite feature closure admits a surprisingly large number of possible categories: $\Theta(3^{a \cdot b!})$, where a is the number of atomic-valued features and b the number of category-valued features. 
In fact, there are more than $10^{775}$ categories in the GKPS system.</Paragraph> <Paragraph position="1"> Fortunately, the full power of embedded categories does not appear to be linguistically necessary because no category-valued feature need ever contain another. 3 In GPSG, there are three category-valued features: SLASH, which marks the path between a gap and its filler with the category of the filler; AGR, which marks the path between an argument and the functor that syntactically agrees with it (between the subject and matrix verb, for example); and WH, which marks the path between a wh-word and the minimal clause that contains it with the morphological type of the wh-word. AGR will never contain SLASH because a functor (verb or predicate) will never select a gap or a constituent containing a gap as its argument. Conversely, SLASH will never be required to contain AGR because such a category corresponds to &quot;the following imaginary (and rather weird) case: Suppose we found a language in which finite verb phrases could be fronted over an unbounded domain provided that they were in the agreement form associated with third-person-singular NP controllers&quot; (Pullum, personal communication). Similarly, because the value of WH is the category of a wh-noun phrase, and because wh-nominals 3 Let f and g be any distinct category-valued features. I am arguing that although f may appear inside g in some language, f will never be required to appear inside g.</Paragraph> <Paragraph position="2"> never contain gaps, WH can never contain SLASH or AGR. In point of fact, no category embeddings appear in the GKPS grammar for English, and it is difficult to see how they would appear in a GPSG for any other natural language.</Paragraph> <Paragraph position="3"> The obvious revision, then, is unit feature closure: to limit category-valued features to containing only 0-level categories. (0-level categories do not contain any category-valued features). 
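Unit feature closure can be illustrated with a small depth check. The following Python is a sketch under assumed encodings, not the paper's own formulation: categories are modelled as dictionaries, the names `embedding_depth` and `satisfies_unit_closure` are hypothetical, and the value of a category-valued feature is assumed to always be another category dictionary.

```python
# The three category-valued features named in the text.
CATEGORY_VALUED = {"SLASH", "AGR", "WH"}


def embedding_depth(category):
    """Return 0 for a 0-level category, else 1 + the deepest embedded depth.

    category: dict mapping feature names to values (an assumed encoding).
    Values of category-valued features are assumed to be dicts themselves.
    """
    depths = [
        1 + embedding_depth(value)
        for feature, value in category.items()
        if feature in CATEGORY_VALUED
    ]
    return max(depths, default=0)


def satisfies_unit_closure(category):
    """True if every category-valued feature holds only a 0-level category."""
    return embedding_depth(category) in (0, 1)


# Hypothetical categories for illustration.
np = {"N": "+", "V": "-", "BAR": 2}                        # 0-level category
s_slash_np = {"N": "-", "V": "+", "BAR": 2, "SLASH": np}   # one embedding
nested = {"BAR": 2, "AGR": s_slash_np}                     # AGR contains SLASH

print(satisfies_unit_closure(s_slash_np), satisfies_unit_closure(nested))
```

Under this encoding a category such as S/NP passes the check, while a category whose AGR value itself carries SLASH is rejected.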
I adopt this strongly falsifiable constraint in RGPSG. The depth of category-embedding is purely an empirical issue, and hence unit closure is not ad hoc. The other revision is primarily notational: any RGPSG feature f may assume the distinguished values noBind or unbound in addition to those values determined by p(f). A noBind value indicates that the feature may not receive a value in an extension of the given category, while unbound indicates that the feature does not currently have a value, and may receive one in an extension.</Paragraph> </Section> <Section position="3" start_page="243" end_page="244" type="sub_section"> <SectionTitle> 2.3 Immediate Dominance/Linear Precedence </SectionTitle> <Paragraph position="0"> GPSG's ID/LP format models certain word order phenomena, such as the head parameter and some free word order facts. An ID rule is a context-free production $C_0 \to C_1, C_2, \ldots, C_k$ whose left-hand side (LHS) is the mother category and whose right-hand side (RHS) is an unordered multiset of daughter categories, some of which may be designated as head daughters. The LHS immediately dominates the unordered RHS in a tree of depth one (a local tree).</Paragraph> <Paragraph position="1"> ID rules significantly increase the time resources available to the GPSG derivation process in four related ways. First, a derivation step is nondeterministic because a category may immediately dominate more than one RHS. Second, the derivation process may alternate between a derivation step involving the ID rules $C \to C_1 \mid \ldots \mid C_k$ that corresponds to an OR-transition (only one of k possible successors must yield a terminal string) and a derivation step involving an ID rule $C \to C_1, C_2, \ldots, C_k$ that corresponds to an AND-transition (all k successors must yield terminal strings). These two devices introduce lexical and structural ambiguity. As is well-known, ambiguity is a central property of natural languages. 
Therefore, I consider this aspect of ID rules linguistically essential, and it will be retained in RGPSG. Third, unrestricted null transitions in ID rules are a source of intractability because they allow GPSGs to generate enormous phrase structure trees whose yield is the empty string (see Ristad, 1986a). Thus, a parser that uses such a grammar must nondeterministically postulate elaborate phrase structure in between its input tokens. The indisputable unnaturalness of this ability motivates me to greatly restrict null transitions in RGPSG.</Paragraph> <Paragraph position="2"> Fourth, the multiset RHS of an ID rule contributes to a large space of local phrase structure trees: an ID rule with a RHS of cardinality b can, if unconstrained by LP statements, correspond to b! ordered productions. In parsing practice, this can cause a combinatorial explosion in a context-free parser's state space (see Barton, 1985). In addition to causing nondeterminism in any GPSG-based parser, the multiset RHS confers on GPSG the ability to count nonterminals. The apparent artificiality of this device, as discussed in Barton, Berwick, and Ristad (1987:260-261), will motivate me to adopt a substantive constraint of short ID rules in RGPSG (binary branching, for example). 4 RGPSG ID rules have exactly one mother and at least one head daughter. The heads are separated notationally from the non-heads by a colon, and appear to the left of the colon. The mother and all head daughters are implicitly specified for [NULL -]. 
For example, the RGPSG headed ID rule 2 corresponds to the GPSG ID rule 3.</Paragraph> <Paragraph position="4"> There is only one lexical element for the null string, and it is universal across all grammars:</Paragraph> </Section> </Section> <Section position="5" start_page="244" end_page="247" type="metho"> <SectionTitle> X2 \[SLASH X,~I, NULL +\] l &quot;&quot;* </SectionTitle> <Paragraph position="0"> Co-subscripting indicates that the two X2 categories must be identical in any legal projection of the rule, with the exception of the [NULL +] and SLASH specifications. This restricted ID rule format, when coupled with a restriction on metarules that prevents them from affecting head daughters, prevents head daughters from ever being erased in an RGPSG derivation. Thus, null transitions are effectively eliminated from RGPSG.</Paragraph> <Paragraph position="1"> An ordered production is an ID rule whose daughters are completely linearly ordered, that is, a string of daughter categories rather than multisets of head and nonhead daughters. An ordered production is LP-acceptable if all LP statements in the RGPSG are true of it.</Paragraph> <Paragraph position="2"> The RGPSG ID/LP formalism does not contain formal constraints sufficient to guarantee polynomial-time recognition, although the linguistically justified use of short ID rules can render ID rules tractable, because ID/LP grammars with bounded rules can be parsed in time polynomial in the grammar size. 5</Paragraph> <Section position="1" start_page="244" end_page="245" type="sub_section"> <SectionTitle> 2.4 Metarules </SectionTitle> <Paragraph position="0"> Metarules are lexical redundancy rules. Formally, they are functions that take lexical ID rules--ID rules with a lexical head--to 4 The binary branching constraint is independently motivated by the linguistic arguments of Kayne (1981) and others. 
In that work, Kayne argues that the path from a governed category to its governor (for example, from an anaphor to its antecedent) must be unambiguous--informally put, &quot;an unambiguous path is a path such that, in tracing it out, one is never forced to make a choice between two (or more) unused branches, both pointing in the same direction&quot; (Kayne 1981:146). The unambiguous path requirement sharply constrains fan-out in phrase structure trees because n-ary branching, for n > 2, is only possible when none of the n sister nodes must govern any other nodes in the phrase structure tree.</Paragraph> <Paragraph position="1"> 5 If the length bound for natural language grammars is the constant b, then any ID/LP grammar G can be converted into a strongly-equivalent CFG G' of size $O(|G| \cdot b!) = O(|G|)$ by simply expanding out the constant number of linear precedence possibilities. In the GKPS and RGPSG grammars for English, b = 3 because double object constructions ([give NP NP], for example) are assigned a flat, ternary branching structure. (I ignore the iterating coordination schema, which licenses rules with unbounded right-hand sides.) It is important, however, that the short rules reflect a genuine constraint and that the grammar does not use some other mechanism to get the effect of longer rules (feature instantiation, for example).</Paragraph> <Paragraph position="2"> sets of lexical ID rules. See the GKPS passive metarule above.</Paragraph> <Paragraph position="3"> The GKPS grammar for English also includes metarules for subject-aux inversion, extraposition, and transitivity alternations. The complete set of ID rules in a GPSG is the maximal set that can be arrived at by taking each metarule and applying it to the set of rules that did not themselves arise from the application of that metarule. 
This maximal set is called the finite closure FC(M, R) of a set R of lexical ID rules under a set M of metarules.</Paragraph> <Paragraph position="4"> Metarules can increase the time and space resources available to the derivation process by introducing null transitions and ambiguity in ID rules and by increasing the space of ID rules more than exponentially. They can also increase the cost of the projection process itself: finite closure is nondeterministic (NP-hard, in fact) because metarules are applied to ID rules nondeterministically. Unrestricted null transitions are both linguistically and computationally undesirable. Moreover, the ability of metarules to affect lexical head daughters is in direct conflict with their linguistic purpose: &quot;to express generalizations about the subcategorization possibilities of lexical heads&quot; (GKPS:59). Unrestricted metarules can destroy the relation between a phrase and its lexical head, and thereby violate X-bar theory. The first step in revising metarules is to restrict them to only affect nonhead daughters in lexical ID rules. Because of this change, metarules cannot alter the implicit [NULL -] specification on the head daughters. Therefore, once a category is expanded in a derivation, it must be lexically realized in the derived string. This formal constraint ensures that the empty string does not have elaborate phrase structure in RGPSG.</Paragraph> <Paragraph position="5"> Metarule finite closure generates many linguistically incorrect ID rules that must be excluded by other GPSG devices (FCRs, for example). 
The GKPS grammar for English contains six metarules; out of approximately 1944 possible metarule interactions in principle, only two such interactions appear to be productive (passive followed by subject-aux inversion or slash termination metarule 1). 6 Therefore, the second metarule restriction adopted by RGPSG is biclosure, instead of finite closure. 7 6 Given a set of n metarules, the number of possible metarule interactions is the number of ways to pick n or fewer metarules from the set, where order matters and repetitions are not allowed. That number is given by the total number of possible k-selections from the n metarules, where k varies from 0 (no metarules apply) to n (any combination of all metarules apply). Thus, the number of possible interactions is $f(n) = \sum_{k=0}^{n} \frac{n!}{(n-k)!} \le n! \cdot e$. This is not the size of metarule finite closure, because it does not consider the possibility of a metarule matching an ID rule in more than one way.</Paragraph> <Paragraph position="6"> 7 Metarule biclosure does not overgenerate as badly as finite closure, and thereby promotes descriptive adequacy at the expense of some explanatory power. Biclosure has an edge in descriptive economy (explanatory power) over unit closure because simpler (and fewer) metarules are needed with biclosure. Thus, the length of metarule derivations is not totally ad hoc because it is subject to a scientific criterion.</Paragraph> </Section> <Section position="2" start_page="245" end_page="246" type="sub_section"> <SectionTitle> 2.5 Principles of Universal Feature Instantiation </SectionTitle> <Paragraph position="0"> The ID rules obtained by taking the finite closure of the metarules on the ID rules are projected to local phrase structure trees.</Paragraph> <Paragraph position="1"> Abstractly, this process establishes the connection between those relations encoded in ID rules (for example, domination, subcategorization, case, modification, and predication) and the nonlocal linguistic relations. 
Local trees are projected from ID rules by mapping the categories in a rule into legal extensions of those categories in the projected local tree.</Paragraph> <Paragraph position="2"> Principles of universal feature instantiation (UFI) constrain this projection by requiring categories in a local tree to agree in certain feature specifications when it is possible for them to do so. For example, the head feature convention (HFC) requires the mother to agree with all head features that the head daughters agree on, if agreement is possible. The HFC expresses X-bar theory in part, requiring a phrase to be the projection of its head. It also plays a central role in the GPSG account of coordination phenomena, requiring the conjuncts in a coordinate structure to all participate in the same linguistic relations with the rest of the sentence. The two other principles of UFI are the control agreement principle and the foot feature principle. The control agreement principle represents the GPSG theory of predicate-argument relations; informally, it requires predicates to agree with their arguments (for example, verb phrases must agree with their subject NPs in English). The foot feature principle provides a partial account of gap-filler relations in the GPSG system, including parasitic gaps and the binding facts of reflexive and reciprocal pronouns; it plays a role strikingly similar to that of Pesetsky's (1982) path theory and Chomsky's (1986) binding and chain theories. 8 Informally, the foot feature principle ensures that certain syntactic information is not lost. 
&quot;Exceptional&quot; feature specifications are those feature specifications in an ID rule that should agree by virtue of a principle of UFI, but are unable to without changing a feature specification inherited from the ID rule.</Paragraph> <Paragraph position="3"> The three principles of UFI all cause intractability because they provide the derivation process with reusable space resources.</Paragraph> <Paragraph position="4"> First, each principle of UFI can enforce nonlocal feature agreement in phrase structure. Ristad (1986b) shows how this causes NP-hardness, when coupled with lexical ambiguity or null transitions. A related source of intractability is that the projection of ID rules to local trees can create an astronomical space of local trees, which in turn increases parser search space. These two sources of intractability cannot be eliminated because they are essential to GPSG's account of linguistic agreement among 8 The possibility of expressing the control agreement and foot feature principles as local constraints on nonlocal relations falls out from the central role of c-command, or equivalently unambiguous paths, in binding theory. C-command is a local relation, in fact the primary source of locality in phrase structure (see Berwick and Wexler 1982). Similarly, the possibility of encoding multiple gap-filler relations in one feature specification of one category corresponds to the &quot;no crossing&quot; constraint of path theory. Pesetsky (1982:556) compares the predictions of path theory and principles of UFI when the two diverge in cases of double extraction (for example, a problem that_j I know who_i to [talk to e_i about e_j]) from coordinate structures. 
He concludes that &quot;the apparent simplicity of the slash category solution fades when more complex cases are considered.&quot; conjuncts and between predicates and their arguments, gaps and their fillers, and phrases and their lexical heads.</Paragraph> <Paragraph position="5"> The use of exceptional feature specifications in these principles allows a derivation to reuse the space resources provided by the ID rules and theory of syntactic features. In the reduction of Ristad (1986a), head features encode an alternating Turing machine (ATM) tape. The HFC is used to transfer the tape contents for an ATM configuration $C_0$ (represented by the mother) to its immediate successors $C_1, C_2, \ldots, C_k$ (the head daughters). The configurations $C_0, C_1, \ldots, C_k$ have identical tapes, with the critical exception of one tape square. If the HFC enforced absolute agreement between the head features of the mother and head daughters, the polynomial space ATM computation could not be simulated in this manner.</Paragraph> <Paragraph position="6"> Principles of universal feature instantiation in RGPSG all preserve a simple invariant across all ID rules. They are monotonic; that is, they never delete or alter existing feature specifications. The head feature convention, for example, ensures that the mother agrees exactly with all head feature specifications that the head daughters agree on, regardless of where the specifications come from.</Paragraph> <Paragraph position="7"> Principles of UFI are first applied to the ID rule output of metarule unit closure. After this initial application, each principle always applies, governing the well-formedness of the ID rule extension relation. The resulting ID rules derive utterances in the language generated by the RGPSG.</Paragraph> <Paragraph position="8"> Head feature convention. The head feature convention enforces the invariant that the mother is in absolute agreement with all head features on which the head daughters agree. 
It also requires the BAR value on a head daughter to be less than or equal to the BAR value on the mother. HEAD contains exactly those features that must be equivalent on the mother and head daughters of every ID rule. 9 HEAD = {AGR, ADV, AUX, INV, LOC, N, NFORM, PAS, PAST, PER, PFORM, PLU, PRD, V, VFORM} Control agreement principle. The control agreement principle (CAP) differs from the HFC in that it establishes equivalences (links) between the categories in an ID rule: when two categories are linked in an ID rule, the two categories must be identical in any legal extension of that rule. Links are calculated immediately after the HFC has applied to the ID rules for the first time; once a link is established in an ID rule, it cannot be changed or undone. 10 The first part of the CAP calculates control relations between categories, while the second part of the CAP establishes 9 In order to properly account for feature instantiation in the binary and iterating coordination schemata, the binary head (BHEAD) features BAR, SUBJ, SUBCAT, and SLASH are considered to be head features for the purposes of the HFC in all nonlexical, multiply-headed ID rules.</Paragraph> <Paragraph position="9"> 10 In GKPS, only head feature specifications and inherited foot feature specifications determine the semantic types relevant to the definition of control. RGPSG simplifies this by considering inherited feature specifications and only some head feature specifications. Alternatively, control relations could be calculated every time the HFC instantiates a feature specification. links using the control relations. In all cases, linking is indicated by co-subscripting.</Paragraph> <Paragraph position="10"> RGPSG control relations are calculated as follows. A predicate is a VP or an instantiation of XP[+PRD] such as a predicate nominal or adjective phrase. The control feature of a category C, where C(BAR) ≠ 0, is SLASH if C is specified for SLASH; otherwise, it is AGR. 
Control is calculated once and for all immediately after the HFC has applied to the ID rules resulting from metarule unit closure.</Paragraph> <Paragraph position="11"> Let f be the control feature of a category $C_1$. Then $C_1$ is controlled by $C_2$ in a rule if and only if $C_1(f) = C_2$, $C_2 \sqsupseteq \mathrm{X2}$, and either the rule is $C_0 \to C_1 : C_2$ (recall that $C_1$ is the head daughter), or the rule is $C_0 \to C_3 : C_1, C_2$, and $C_0, C_1 \sqsupseteq \mathrm{VP}$.</Paragraph> <Paragraph position="12"> The RGPSG control agreement principle states: In an ID rule</Paragraph> <Paragraph position="14"> * If there is a nonhead predicate C with no controller, then link C(f) and $C_0(f_0)$, where f and $f_0$ are the control features of C and $C_0$, respectively.</Paragraph> <Paragraph position="15"> In the theory of GKPS, the control agreement principle performs subject-verb agreement by enforcing a control relation between the two daughters of the rule</Paragraph> <Paragraph position="17"> In RGPSG, this rule must be stated as S → X2[-SUBJ, AGR X2] : X2 if we wish to enforce the control relation between the two daughters. Because control relations in RGPSG are static (never recalculated), this control relation exists even if X2 ≠ NP. Fortunately, no verb will ever be specified for [AGR AP] in the lexicon, and therefore any &quot;questionable&quot; control relations involving an X2 other than NP are ignored at the lexical insertion level.</Paragraph> <Paragraph position="18"> Foot feature principle. The foot feature principle (FFP) requires any foot feature specification instantiated on a daughter category to also be instantiated on the mother. The specification is identical to any instantiation of the same feature on other daughter categories. The FFP ensures that (1) the existence of inherited foot features on any category of an ID rule blocks instantiation of those foot features on any other component category of the rule, and (2) inherited foot features are equivalent across all component categories of the rule. 
This second condition may be too strong.</Paragraph> <Paragraph position="19"> Because the empty string can be dominated only by a category of the form α[NULL +, SLASH α] in RGPSG, the FFP tries to ensure that every gap will have a unique filler. Unfortunately, it is impossible to truly guarantee recoverability of deletions in RGPSG, because the FFP can only locally constrain the rule-to-tree projection, and not the ID rules themselves. This situation is unavoidable in the GPSG framework, simply because SLASH does not always mark the complete path between a gap and its filler in accepted GPSG analyses. The classic example is the GPSG analysis of subject dependencies, where an S/NP is reanalyzed as a VP, effectively deleting an NP gap in subject position. In GKPS, this operation is performed by slash termination metarule 2 (GKPS:160-2): [SLASH NP] only marks the path from the filler to the mother of the reanalyzed VP. Another example is the GKPS (pp. 150-152) analysis of missing-object constructions such as John is easy to please. In missing-object constructions, [SLASH NP] only marks the path from the NP gap to the VP[INF]/NP dominating to please, failing to continue through the AP easy to please to the filler John. Many sweeping changes would be necessary before the FFP would be able to strictly enforce recoverability of deletions in RGPSG.</Paragraph> </Section> <Section position="3" start_page="246" end_page="246" type="sub_section"> <SectionTitle> 2.6 Marking Conventions </SectionTitle> <Paragraph position="0"> Feature co-occurrence restrictions (FCRs) and feature specification defaults (FSDs) are explicit marking conventions used in the GPSG system both to express language-particular facts and to restrict the overgeneration of other formal devices (both metarule and feature closure). FCRs and FSDs are restrictive predicates on categories, constructed by Boolean combination of feature specifications. 
All legal categories must unconditionally satisfy all FCRs. All categories must also satisfy all FSDs, if it is possible to do so without violating an FCR or a principle of universal feature instantiation. For example, FCR 1: [INV +] ⊃ ([AUX +] ∧ [VFORM FIN]) requires any category that bears the [INV +] feature specification to also bear the specifications [AUX +] and [VFORM FIN].</Paragraph> </Section> <Section position="4" start_page="246" end_page="247" type="sub_section"> <SectionTitle> 2.6.1 Complexity of Marking Conventions </SectionTitle> <Paragraph position="0"> FCRs and FSDs both provide significant resources to the GPSG projection process. First, they allow the projection process to reuse the polynomial space provided by the theory of syntactic features, because they can establish equivalences between the features in a category C and the features in a category contained in C. This ability to apply across embedded categories vastly increases the complexity of the rule-to-tree projection. To see why it is linguistically unnecessary, consider the role of embedded categories. A category-valued feature f expresses a nonlocal linguistic relation between a category C and the one or more categories that bear the feature specification [f C]. Thus, in the linguistically relevant cases, every embedded category eventually &quot;surfaces&quot; in phrase structure, where the marking conventions are free to apply. The one exception to this argument is FCR 13 in the GKPS grammar for English, which applies 'across' an embedded category.</Paragraph> <Paragraph position="1"> FCR 13: [FIN, AGR NP] ⊃ [AGR NP[NOM]] In RGPSG, marking conventions may not apply to or across embedded categories. 
The effect of FCR 13 is achieved in RGPSG by a combination of the simple default SD 2 in section 3.2.2 below and carefully written ID rules.</Paragraph> <Paragraph position="2"> Second, FCRs and FSDs of the &quot;disjunctive consequence&quot; form [f v] ⊃ [f1 v1] ∨ ... ∨ [fn vn] compute the direct analog of the NP-complete satisfiability problem: when several such FCRs are used together, the GPSG must nondeterministically try all n feature-value combinations.</Paragraph> <Paragraph position="3"> Third, the process of applying FSDs to local trees is very complex, in part because it is not informationally encapsulated. Rather than simply considering the (existing) feature specifications in each target category separately, FSD application is affected by the other categories in the ID rule, all principles of universal feature instantiation, and even FCRs.</Paragraph> <Paragraph position="4"> There is no reason to believe that marking conventions need be so powerful and unconstrained. The approach RGPSG takes is to virtually eliminate marking conventions. Rather than stating the internal constraints on categories explicitly (and redundantly), as FCRs do, RGPSG eliminates FCRs altogether. Instead, the constraints FCRs express are implicitly stated in the rest of the grammar -- in the way ID rules and metarules are written, for example. The sole explicit marking convention in RGPSG is the simple default (SD). Unlike FCRs and FSDs, SDs are constructive, easy to understand, and computationally tractable. Each SD is applied to (and may be understood for) each category independently of all other categories and RGPSG formal devices, including other SDs. SDs are applied to ID rules immediately after the initial application of principles of UFI.</Paragraph> <Paragraph position="5"> An SD contains a predicate and a consequent. The consequent is a list of feature specifications. 
The predicate is a Boolean combination of truth-values and feature specifications such that if a category C bears or extends a given feature specification, that feature specification is true of C; otherwise it is false. If the predicate is true of a given category C in a rule and the consequent includes only unbound and unlinked features, then the feature specifications listed in the consequent are instantiated on C. Each SD is applied simultaneously to every top-level category in every rule exactly once, in the order specified by the grammar. Consider the following SD: SD 1: if \[SUBCAT\] then \[BAR 0\] If the target category C in an ID rule is specified for the SUBCAT feature, but unspecified for the BAR feature, then the SD will force the feature specification \[BAR 0\] on C.</Paragraph> </Section> </Section> <Section position="6" start_page="247" end_page="248" type="metho"> <SectionTitle> 3 The Revised Theory </SectionTitle> <Paragraph position="0"> In this section, I explain how the formal subsystems described above fit together. I begin by formally specifying the class of RGPSGs and the languages they generate. I conclude by translating the GKPS analysis of topicalization, expletive pronouns, and parasitic gaps to the RGPSG formal system.</Paragraph> <Paragraph position="1"> Figure 1 shows the internal organization of RGPSG. The set of ID rules R' defined by metarule unit closure, UFI, and SD application generates the language of the RGPSG as follows. If R' contains a rule A → γ with an extension A' → γ' that satisfies all principles of UFI and is an LP-acceptable ordered production, then for any strings α and β of terminals and nonterminals, we write αA'β ⇒ αγ'β. This is a derivation step. The language of an RGPSG contains all terminal strings that can be derived, using</Paragraph> <Paragraph position="3"> G with ID rules R, metarules M, and simple defaults S. 
The O-bounds show the effect of various formal devices on derived grammar symbol size.</Paragraph> <Paragraph position="4"> the ID rules, from any extension of the distinguished start category. Let ⇒* be the reflexive transitive closure of ⇒. Then the language L(G) generated by G is L(G) = { x | x ∈ V_T* and ∃C ∈ K \[(C ⊒ Start) ∧ C ⇒* x\] } Ristad (1986b) proves that the universal recognition problem for RGPSG is NP-complete, a significant decrease in complexity from the EXP-POLY time hardness of GPSG-Recognition. xl In fact, of the more than ten sources of intractability lurking in GPSG, only two remain in RGPSG -- lexical ambiguity and nonlocal feature agreement. Critically, these two sources of intractability in RGPSG appear to be linguistically essential.</Paragraph> <Section position="1" start_page="247" end_page="248" type="sub_section"> <SectionTitle> 3.1 Efficient RGPSG Parsing </SectionTitle> <Paragraph position="0"> Intractability in RGPSG arises from a particularly deadly combination of feature agreement and lexical ambiguity. Underspecification of categories in ID rules and metarules can be costly.</Paragraph> <Paragraph position="1"> This suggests that limiting the number of head features or the scope of their agreement will mitigate the intractability. An efficient recognition algorithm might approximate grammaticality by failing to transfer all head features through coordinate structures (for example, letting them assume default values instead), or by aborting a parse in the face of excessive lexical or structural ambiguity. Efficient parsing techniques based on partial enforcement of UFI are also possible. One such implementation, which propagates feature specifications bottom up using Earley's algorithm, is in progress at Thinking Machines Corporation.</Paragraph> <Paragraph position="2"> ~This decrease in complexity is significant from both theoretical and practical perspectives. 
First, NP-complete problems typically have good average time algorithms, while EXP-POLY problems do not. Next, the fastest recognizer known for GPSGs can require double-exponential time in the worst case, while RGPSG has a simple exponential time recognizer. Finally, NP-complete problems have efficient witnesses, while EXP-POLY hard problems do not. This means that RGPSG parses can always be verified efficiently, while GPSG parses cannot, in general. Barton (1986) proposes a constraint-based computational solution to intractability in the two-level Kimmo morphological analyzer. Intractability arises from unbounded agreement processes in that system, and similar techniques based on constraint propagation may be adapted to create an efficient approximate parsing algorithm for RGPSG. Tuples of features would correspond to constraint-propagation nodes, while tuples of sets of feature-values would correspond to node labels; features could receive multiple values in this implementation. Nodes would be connected by both RGPSG ID rules and principles of universal feature instantiation.</Paragraph> </Section> <Section position="2" start_page="248" end_page="248" type="sub_section"> <SectionTitle> 3.2 Linguistic Analysis of English </SectionTitle> <Paragraph position="0"> This section reproduces three of the more intricate linguistic analyses of GKPS in order to illustrate RGPSG's formalisms. To reproduce their comprehensive analysis of English in toto would be a disservice to that work and is beyond the scope of this paper. Instead, Ristad (1986b) provides an RGPSG roughly equivalent to their GPSG for English; the reader should consult GKPS for the accompanying linguistic exposition. 
In all cases, co-subscripting indicates linking.</Paragraph> <Paragraph position="1"> Rule 4a expands clauses and rule 4b introduces unbounded dependency constructions (UDCs) in English.</Paragraph> <Paragraph position="3"> In both cases the X2 nonhead daughter controls the head daughter, and the control agreement principle links the value of the head daughter's control feature with the X2 daughter, creating the ID rules in 5.</Paragraph> <Paragraph position="4"> In the following discussion, \[3s\] and \[3p\] abbreviate \[PER 3, -PLU\] and \[PER 3, +PLU\], respectively. Note that it is impossible to extract any constituent out of the X2 daughter in 5b because the foot feature principle has forced \[SLASH noBind\] on the X2 daughter and its mother. This explains the unacceptability of 6 in RGPSG, which is permissible in the theory of GKPS.</Paragraph> <Paragraph position="5"> Now I account for the distribution of the expletive pronouns it and there in infinitival constructions on the basis of postulated ID rules and principles of universal feature instantiation (see GKPS, pp. 115-121). The feature specification \[AGR NP\[NFORM α\]\] is abbreviated as +α below, where α is it, there, or NORM.</Paragraph> <Paragraph position="6"> The RGPSG for English includes the ID rules in 7 and the lexical entries in 10. All other nouns are specified for \[NFORM NORM\] by their lexical entries.</Paragraph> <Paragraph position="7"> (it, NP\[PRO, -PLU, NFORM it\]) (there, NP\[PRO, NFORM there\]) (10) From the ID rules in 7, RGPSG generates the following ID rules.</Paragraph> <Paragraph position="8"> a. VP \[AGRI\] → V0 \[13, AGRI\] : VP \[INF, AGRI\]</Paragraph> <Paragraph position="10"> The absence of a controlling category allows the CAP to link the AGR values of the mother and VP\[INF\] predicate daughter. The HFC then links the AGR values of the mother and lexical head daughter. 
SD 1 specifies the head daughter for \[BAR 0\], while The NP daughter controls its VP\[INF\] sister, and the CAP links the AGR value of the VP to its sister NP. SD 2 specifies the mother for \[+NORM\], and the HFC forces this specification on the head daughter.</Paragraph> <Paragraph position="11"> The rules in 13 introduce \[+it\] and \[+there\] specifications. Note that 13a is the result of the extraposition metarule on the ID rule 7e.</Paragraph> <Paragraph position="12"> a. VP\[+it\] → \[20\] : NP, S b. VP\[+it\] → \[21\] : (PP\[to\]), S\[FIN\] c. VP\[AGR NP\[there, PLU α\]\] → \[22\] : NP\[PLU α\] (13) The rules in 13 may only expand the VP daughters of the ID rules 11 and 12 in a derivation (compare their AGR values). Thus, the grammar claims that expletive pronouns only occur in utterances generated using the rules in 13, in combination with the &quot;extending&quot; rules 11 and 12. This describes the following facts from GKPS, p. 120. {It, *There} \[continues \[to bother \[Lou\] \[that Robin was chosen\]\]\] This analysis of parasitic gaps exactly follows the one presented in GKPS on matters of fact. These facts may be questionable, however. Some sentences considered acceptable in GKPS (for example, Kim wondered which models Sandy had sent pictures of to Bill and Kim wondered which authors reviewers of always detested) are marginal for some native English speakers. Note that both sentences are marked unacceptable in the GB framework because of subjacency violations.</Paragraph> <Paragraph position="13"> It would be instructive to identify and restrict the computational resources provided by the formal devices in other linguistic theories (for example, lexical-functional grammar, government-binding theory, or morphological theory). 
Barton, Berwick, and Ristad (1987) explore the utility of complexity analysis in other linguistic domains, although the research strategy reported here is not the focus of that work.</Paragraph> </Section> </Section> </Paper>