<?xml version="1.0" standalone="yes"?>
<Paper uid="C86-1050">
  <Title>A Simple Reconstruction of GPSG</Title>
  <Section position="4" start_page="0" end_page="212" type="metho">
    <SectionTitle>
2 The GPSG Axioms
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="211" type="sub_section">
      <SectionTitle>
2.1 A Summary of the Principles
</SectionTitle>
      <Paragraph position="0"> GPSG describes natural languages in terms of various types of constraints on local sets of nodes in trees. Pertinent to the ensuing discussion are the following: [...] feature structures as to which are permissible categories; feature specification defaults (FSDs), which provide values for features that are otherwise unspecified; and, most importantly, [Footnote 2: However, a caveat is in order: the detailed analysis from this perspective of the full range of GPSG devices (especially immediate dominance (ID) rules and feature cooccurrence restrictions) is not discussed fully here, nor do I completely understand them. (See Section 3.4.) And while in a confessional mood, I should add that the algorithm given here has not actually been implemented.]</Paragraph>
      <Paragraph position="1"> universal feature instantiation principles, which constrain the allowable local sets of nodes in trees; these feature instantiation principles include the head feature convention (HFC), the foot feature principle (FFP), and the control agreement principle (CAP).</Paragraph>
      <Paragraph position="2"> In GPSG all of these constraints are applied simultaneously. A local set of nodes in a tree is admissible under the constraints if and only if there is some base or derived ID rule (which we will call the licensing rule) for which the parent node's category is an extension of the left-hand-side category in the rule, and the children are respective extensions of right-hand-side categories in the rule, and, in addition, the set of nodes simultaneously satisfies all of the separate feature instantiation principles, ordering constraints, etc. By extension, we mean that the constituent has all the feature values of the corresponding category in the licensing rule, and possibly some additional feature values. The former type of values are called inherited, the latter instantiated.</Paragraph>
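      <Paragraph> The notion of extension can be illustrated with a minimal sketch. It assumes, purely for illustration, that a category is a flat Python dictionary from feature names to atomic values; the function names is_extension and split_values are hypothetical and do not come from the GPSG literature.

```python
def is_extension(category, rule_category):
    """True if `category` has every feature value of `rule_category`.

    Both arguments are flat dicts from feature names to atomic values
    (an assumed, simplified encoding of GPSG categories).
    """
    return all(
        feature in category and category[feature] == value
        for feature, value in rule_category.items()
    )


def split_values(category, rule_category):
    """Separate inherited values (present in the licensing rule) from
    instantiated values (additional ones found only on the node)."""
    inherited = {f: v for f, v in category.items() if f in rule_category}
    instantiated = {f: v for f, v in category.items() if f not in rule_category}
    return inherited, instantiated
```

Under this encoding, a node that adds a slash value to the category named in the licensing rule is an extension of it, and the slash value counts as instantiated rather than inherited.</Paragraph>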
      <Paragraph position="3"> The feature instantiation principles are typically of the following form: if a certain feature configuration holds of a local set of nodes, then some other configuration must also be present.</Paragraph>
      <Paragraph position="4"> For instance, the antecedent of the control agreement principle is stated in terms of the existence of a controller and controllee, which notions are themselves defined in terms of feature configurations. The consequent concerns identity of agreement features.</Paragraph>
    </Section>
    <Section position="2" start_page="211" end_page="212" type="sub_section">
      <SectionTitle>
2.2 Interaction of Principles
</SectionTitle>
      <Paragraph position="0"> Much care is taken in the definitions of the feature instantiation principles (and their ancillary notions such as controller, controllee, free features, privileged features, etc.) to control the complex interaction of the various constraints. For instance, the FFP admits local sets of nodes with slash feature values on parent and child where no such values occur in the licensing ID rule, i.e., it allows instantiation of slash features. But the CAP's above-mentioned definition of control is sensitive to the value of the slash feature associated with the various constituents. A simple definition of the CAP would ignore the source of the slash value, whether inherited, instantiated by the FFP, or instantiated in some other manner. However, the appropriate definition of control needed for the CAP must ignore instantiated slash features, but not inherited ones. Say Gazdar et al.: We must modify the definition of control in such a way that it ignores perturbations of semantic type occasioned by the presence of instantiated FOOT features.</Paragraph>
      <Paragraph position="1"> [2, p. 87] Thus, the CAP is in some sense blind to the work of the FFP.</Paragraph>
      <Paragraph position="2"> As Gazdar et al. note, this requirement makes stating the CAP a much more complex task.</Paragraph>
      <Paragraph position="3"> The increased complexity of the principles resulting from this need for tracking the origins of feature values is evident not only in the CAP, but in the other principles as well. The head feature convention requires identity of the head features of parent and head child. The features agr and slash (features that can be inherited from an ID rule or instantiated by the CAP or FFP, respectively) are head features and therefore potentially subject to this identity condition. However, great care is taken to remove such instantiated head features from obligatory manipulation by the HFC. This is accomplished by limiting the scope of the HFC to the so-called free head features.</Paragraph>
      <Paragraph position="4"> Intuitively, the free feature specifications on a category [the ones the HFC is to apply to] is the set of feature specifications which can legitimately appear on extensions of that category: feature specifications which conflict with what is already part of the category, either directly, or in virtue of the FCRs, FFP, or CAP, are not free on that category. [2, p. 95] That is, the FFP and CAP take precedence (intuitively viewed) over the HFC.</Paragraph>
      <Paragraph position="5"> Finally, all three principles are seen to take precedence over feature specification defaults in the following quotation.</Paragraph>
      <Paragraph position="6"> In general, a feature is exempt from assuming its default specification if it has been assigned a different value in virtue of some ID rule or some principle of feature instantiation. [2, p. 100] Gazdar et al. accomplish this by defining a class of privileged features and excluding such features from the requirement that they take on their default value. Of course, instantiated head features, slash features, and so forth are all considered privileged. However, a modification of these exemptions is necessary in the case of lexical defaults, i.e., default values instantiated on lexical constituents. We will not discuss here the rather idiosyncratic motivation for this distinction, but merely note that lexical constituent defaults are to be insensitive to changes engendered by the HFC, as revealed in this excerpt: However, this simpler formulation is inadequate since it entails that lexical heads will always be exempt from defaults that relate to their HEAD features .... Accordingly, the final clause needs to distinguish lexical categories, which become exempt from a default only if they covary with a sister, and nonlexical categories, which become exempt from a default if they covary (in relevant respects) with any other category in the tree. [2, p. 103] Thus the interaction of these principles is controlled through complex definitions of the various classes of features they are applicable to. These definitions conspire to engender the following implicit precedence ordering on the principles, principles earlier in the ordering being blind to the instantiations from later principles, which are themselves sensitive to (and exempt from applying to) features instantiated by the earlier principles (footnote 3): $\mathrm{CAP} \succ \mathrm{FFP} \succ \mathrm{FSD}_{lex} \succ \mathrm{HFC} \succ \mathrm{FSD}_{nonlex}$ Of course, all ID rules, both base and derived, are subject to all these principles; yet metarule application is not contingent on instantiations of the base ID rules.
Conversely, LP constraints are sensitive to the full range of instantiated features. The precedence ordering can thus be extended as follows: [...] [Footnote 3: Current efforts by at least certain GPSG practitioners are placing the GPSG type of analysis directly in a PATR-like formalism. This formalism, Pollard's head-driven phrase structure grammar (HPSG) variant of GPSG, uses a run-time algorithm similar to the one described in this paper [4]. Highly suggestive is the fact that the HPSG run-time algorithm also happens to order the principles in substantially the same way.]</Paragraph>
      <Paragraph position="8"> The existence of such an ordering on the priority of axioms is, of course, not a necessary condition for the coherence of such an axiomatic theory. Undoubtedly, this inherent ordering was not apparent to the developers of the theory, and may even be the source of some surprise to them. Yet, the fact that this ordering exists and is strict leads us to a substantial simplification of the system. Instead of applying all the constraints simultaneously, we might do so sequentially, so that the precedence ordering (the blindness of earlier principles in the ordering to the effects of later ones) emerges simply because the later principles have not yet applied.</Paragraph>
      <Paragraph position="9"> This solution harkens back to earlier versions of GPSG in which the semantics of the formalism was given in terms of compilation of the various principles and constraints into pure context-free rules. This compilation process can be combinatorially explosive, yielding vast numbers of context-free rules. Indeed, the whole point of the GPSG decomposition is to succinctly express generalizations about the possible phrasal combinations of natural languages. However, by carefully choosing a system for stating constraints on local sets of nodes (a formalism more compact in its representation than context-free grammars), we can compile out the various principles and constraints without risking this explosion in practice.</Paragraph>
      <Paragraph position="10"> The GPSG principles are stated in terms of identities of features. What we need to avoid the combinatorial problems of pure CF rules is a formalism in which such equalities can be stated directly, without generating all the ground instances that satisfy the equalities. What is needed, in fact, is a unification-based grammar formalism [6]. We will use a variant of PATR [5] as the formalism into which GPSG grammars are compiled. In particular, we assume a version of PATR that has been extended by the familiar decomposition into an immediate-dominance and linear-precedence component. This will allow us to ignore the LP portion of GPSG for the nonce.</Paragraph>
      <Paragraph position="11"> PATR is ideal for two reasons. First, it is the simplest of the unification-based grammar formalisms, possessing only the apparatus that is needed for this exercise. Second, a semantics for the formalism has been provided, so that, by displaying this compilation, we implicitly provide a semantics for GPSG grammars as well. In the remainder of the paper, we will assume the reader's familiarity with the rudiments of the PATR formalism.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="212" end_page="214" type="metho">
    <SectionTitle>
3 The Compilation Algorithm
</SectionTitle>
    <Paragraph position="0"> We postpone for the time being discussion of the metarules, LP constraints, and feature cooccurrence restrictions, concentrating instead on the central principles of GPSG, those relating to feature instantiation. The following nondeterministic algorithm generates well-formed PATR rules from GPSG ID rules. A GPSG grammar is compiled into the set of PATR rules generated by this algorithm.</Paragraph>
    <Paragraph position="1"> is written in unordered PATR as</Paragraph>
    <Paragraph position="3"> Note that abbreviations (like S for [-n, +v, bar 2, +subj]) have been made explicit.</Paragraph>
    <Paragraph position="4"> In fact, we will make one change in the structure of categories (to simplify our restatement of the HFC) by placing all head features under the single feature head in the corresponding PATR rule. We do not, however, add an analogous feature foot (footnote 5). Thus the preceding rule becomes</Paragraph>
    <Paragraph position="6"> We use an operation add_c (read "add conservatively") which adds an equation to a PATR rule conservatively, in the sense that the equation is added only if the equations are not thereby rendered unsolvable. If addition would yield unsolvability, then a weaker set of unifications is added (conservatively) instead, one for each feature in the domain of the value being equated. For instance, suppose that the operation add_c((X0 head) = (X1 head)) is called for, where the domain of the head feature values (i.e., the various head features) are a, b, and c. If the equations in the rule already specify that (X0 head a) ≠ (X1 head a), then this operation would add only the two equations (X0 head b) = (X1 head b) and (X0 head c) = (X1 head c), since the addition of the given equation itself would cause rule failure. Thus the earlier constraint of values for the a feature is given precedence over the constraint to be added.</Paragraph>
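    <Paragraph> The behavior of add_c on the a, b, c example can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: paths are Python tuples, values are atomic non-None constants, the domain of the equated value is a fixed finite list supplied by the caller, and solvability is tracked with a small union-find structure. The names Rule, try_set, try_equate, and value_of are hypothetical. Because the equation over a structured value is expanded into one equation per feature in the domain, adding the whole equation when possible and otherwise falling back to the weaker per-feature equations collapses into adding each per-feature equation conservatively.

```python
class Rule:
    """Equations over atomic paths, kept solvable with union-find."""

    def __init__(self):
        self.parent = {}  # path -> parent path in the union-find forest
        self.const = {}   # class representative -> atomic value, if known

    def find(self, path):
        self.parent.setdefault(path, path)
        while self.parent[path] != path:
            path = self.parent[path]
        return path

    def value_of(self, path):
        """Atomic value currently forced on `path`, or None."""
        return self.const.get(self.find(path))

    def try_set(self, path, value):
        """Add path = value unless it conflicts with a known value."""
        root = self.find(path)
        if self.const.get(root, value) != value:
            return False
        self.const[root] = value
        return True

    def try_equate(self, left, right):
        """Add left = right unless the two sides hold distinct values."""
        a, b = self.find(left), self.find(right)
        va, vb = self.const.get(a), self.const.get(b)
        if va is not None and vb is not None and va != vb:
            return False  # the equation would make the rule unsolvable
        if a != b:
            self.parent[a] = b
            if va is not None:
                self.const[b] = va
                del self.const[a]
        return True


def add_c(rule, left, right, domain):
    """Conservatively equate `left` and `right` feature by feature.

    Returns the list of per-feature equations actually added.
    """
    added = []
    for feature in domain:
        pair = (left + (feature,), right + (feature,))
        if rule.try_equate(*pair):
            added.append(pair)
    return added
```

With conflicting values already recorded for feature a on X0 and X1, a call such as add_c(rule, ("X0", "head"), ("X1", "head"), ["a", "b", "c"]) adds only the equations for b and c, so the earlier constraint on a takes precedence.</Paragraph>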
    <Paragraph position="7"> In the description of the algorithm, a nonempty path p is said to be defined for a feature structure X if and only if p is a unit path (f) and f ∈ dom(X), or p = (f p') and p' is defined for X(f). Our notion of a feature's being defined for a constituent corresponds to the GPSG concepts of being instantiated or of covarying with some other feature.</Paragraph>
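    <Paragraph> The recursive definition of a defined path translates directly into code. The sketch below assumes, for illustration only, that a feature structure is a nested Python dictionary and that a path is a nonempty tuple of feature names; is_defined is a hypothetical name.

```python
def is_defined(path, fs):
    """True if the nonempty `path` is defined for feature structure `fs`.

    `fs` is a nested dict (assumed encoding); anything that is not a dict
    is treated as an atomic value with an empty domain.
    """
    if not path:
        raise ValueError("path must be nonempty")
    first, rest = path[0], path[1:]
    if not isinstance(fs, dict) or first not in fs:
        return False
    if not rest:
        return True  # unit path (f) with f in dom(fs)
    return is_defined(rest, fs[first])  # p = (f p'): recurse into fs(f)
```

For example, with fs = {"head": {"slash": "NP"}}, the path ("head", "slash") is defined for fs while ("head", "inv") is not.</Paragraph>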
    <Paragraph position="8"> As in the previous definition, we will be quite lax with respect to our notation for paths, using ((a b) c) and (a (b c)) as synonymous with (a b c). Also, we will consistently blur the distinction between a set of equations and the feature structure it determines. (See Shieber [7] for details of the mapping that makes this possible.)</Paragraph>
    <Section position="1" start_page="212" end_page="212" type="sub_section">
      <SectionTitle>
3.2 The Algorithm Itself
</SectionTitle>
      <Paragraph position="0"> Now our algorithm for compiling a GPSG grammar into a PATR grammar follows:</Paragraph>
    </Section>
    <Section position="2" start_page="212" end_page="213" type="sub_section">
      <SectionTitle>
3.1 Preliminaries
</SectionTitle>
      <Paragraph position="0"> We first observe that a GPSG ID rule is only notationally distinct from an unordered PATR rule. Thus, the first step in the algorithm is trivial. For example, the ID rule S → X², H[-subj] (R1)
[Footnote 5: But recall that slash is a head feature and thus would fall under the path (head slash).]
For each ID rule of GPSG (basic or derived by metarule) X0 → X1, ..., Xn:
CAP: If Xi controls Xj (determined by Type(Xi) and Type(Xj)), then add_c((Xi con) = (Xj con)), where con = (head slash) if (head slash) is defined for Xi, and con = (head agr) otherwise.
FFP: For each foot feature path p (e.g., (head slash)), if p is not defined for X0, then add_c((Xi p) = (X0 p)) for zero or more i such that 0 &lt; i ≤ n and such that p is not defined for Xi (footnote 6).
FSD_lex: For all paths p with a default value, say, d, and for all i such that 0 &lt; i &lt; n, if (Xi bar) = 0 and p is not defined for Xi, then add_c((Xi p) = d).</Paragraph>
      <Paragraph position="1"> HFC: For Xi the head of X0, add_c((Xi head) = (X0 head)).</Paragraph>
      <Paragraph position="2"> FSD_nonlex: For all paths p with a default value, say, d, and for all i such that 0 &lt; i ≤ n, if (Xi bar) ≠ 0 and p is not defined for Xi, then add_c((Xi p) = d).</Paragraph>
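      <Paragraph> The two default steps differ only in which constituents they visit, which the following sketch tries to make concrete. It assumes, for illustration, that each constituent is a flat Python dictionary keyed by path tuples, that a path counts as defined when its key is present (even if its value is only a variable), and that a constituent with no bar value is treated as nonlexical; apply_defaults is a hypothetical name. Running the lexical pass before the HFC step and the nonlexical pass after it would mirror the ordering of the algorithm.

```python
def apply_defaults(constituents, defaults, lexical):
    """Fill default values on lexical (bar 0) or nonlexical constituents.

    constituents: list of dicts from path tuples to values (assumed flat
    encoding); defaults: dict from path tuples to default values;
    lexical: True for the FSD_lex pass, False for the FSD_nonlex pass.
    """
    for category in constituents:
        is_lexical = category.get(("bar",)) == 0
        if is_lexical != lexical:
            continue
        for path, value in defaults.items():
            # setdefault leaves any path that is already defined untouched.
            category.setdefault(path, value)
    return constituents
```

A constituent whose (head inv) path already holds a variable keeps that variable, while a path that is not yet defined receives its default.</Paragraph>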
    </Section>
    <Section position="3" start_page="213" end_page="214" type="sub_section">
      <SectionTitle>
3.3 An Example
</SectionTitle>
      <Paragraph position="0"> Let us apply this algorithm to the preceding rule R1. We start with its PATR equivalent. By checking the existing control relationships in this rule as currently instantiated, we conclude that the subject X1 controls the head X2. We conservatively add the unification (X2 head agr) = (X1). This can be safely added, and therefore is.</Paragraph>
      <Paragraph position="1"> Next, the FFP step in the algorithm can instantiate the rule further. Suppose we choose to instantiate a slash feature on X2. Then we add the equation (X0 head slash) = (X2 head slash).</Paragraph>
      <Paragraph position="2"> Lexical default values require no new equations, since no constituents in the rule are given as 0 bar at this point.</Paragraph>
      <Paragraph position="3"> The HFC conservatively adds the equation (X0 head) = (X2 head), as X2 is the head of X0. But this equation, as it stands, would lead to the entire set of equations being unsolvable, since we already have conflicting values for the head feature subj. Thus the following set of unifications is added instead:
(X0 head n) = (X2 head n)
(X0 head v) = (X2 head v)
(X0 head bar) = (X2 head bar)
(X0 head agr) = (X2 head agr)
(X0 head inv) = (X2 head inv)
[Footnote 6: Several comments are pertinent to this portion of the algorithm. First, it is the FFP portion that is responsible for its nondeterminism. Second, the operation add_c is actually superfluous here. The equation can simply be added directly, since we have already guaranteed that the pertinent features are not yet instantiated. By a similar argument, we can conclude that only the add_c operations in the CAP and HFC are actually necessary. We will use add_c, however, for uniformity. Finally, we assume that an FSD will place the value ~ on any remaining constituents unmarked for foot features.]</Paragraph>
      <Paragraph position="4">  Finally, nonlexical defaults are introduced for features not in the domains of constituents (footnote 9). Since the path (head inv) is defined for the constituents X0 and X2 (footnote 10), the default value (i.e., '-' according to FSD 1 of Gazdar et al.) is not instantiated on either constituent. Similarly, the case default value (acc, FSD 10) is not instantiated on the subject NP. But the conj feature default ('~', footnote 11) will be instantiated on all three constituents with the equations</Paragraph>
      <Paragraph position="6"/>
    </Section>
    <Section position="4" start_page="214" end_page="214" type="sub_section">
      <SectionTitle>
3.4 Problems and Extensions
</SectionTitle>
      <Paragraph position="0"> Several problems have been glossed over in the previous discussion. First, we have not mentioned the role of LP rules. Two possibilities are available for their interpretation: a "run-time" and a "compile-time" interpretation. We can augment the PATR formalism with LP rules in the same way as Gazdar et al., providing for local sets of nodes to satisfy an unordered PATR rule if and only if the nodes are extensions of elements in the ID rule such that the LP rules are all satisfied. Alternatively, we can generate at compile time all possible orderings of the unordered rules compatible with the LP statements, but this leads us into the problem of interpreting LP statements relative to partially instantiated categories, an issue beyond the scope of this paper.</Paragraph>
      <Paragraph position="1"> Second, feature cooccurrence restrictions were ignored in the previous discussion. Again, we will limit ourselves to a brief discussion of the possibilities. One alternative is to modify the lattice of categories relative to which unification is defined (footnote 12) in such a way that all categories violating the FCRs are simply removed.</Paragraph>
      <Paragraph position="2"> Footnote 9: We have made the simplifying assumption that feature specification defaults are stated in terms of simple default values for features, rather than the more complex boolean conditions used in the Gazdar et al. text. The modifications to allow the more complex FSDs may or may not be straightforward.</Paragraph>
      <Paragraph position="3"> Footnote 10: The value of the feature head on the constituent X0 has the feature inv in its domain because the unification (X0 head inv) = (X2 head inv) gives as value to (X0 head inv) a variable, the same variable as the value for (X2 head inv). Thus the path (head inv) is defined for X0 and, similarly, for X2.</Paragraph>
      <Paragraph position="4"> Footnote 11: We assume here, contra Gazdar et al., that '~' is a full-fledged value in its own right, at least as interpreted in this compilation. Since this value fails to unify with any other value, e.g., '+' or '-', it has exactly the behavior desired, namely, that the feature is prohibited from taking any of its standard values.</Paragraph>
      <Paragraph position="6"> Then unification over this revised lattice will be used instead of the simpler version and FCRs will automatically always be obeyed. Unfortunately, the possibility exists that unification over the revised lattice may not bear the same order-independence properties that characterize unification over the freely-generated lattice. Of course, if this turns out to be the case, it casts doubt on the well-foundedness of the original Gazdar et al. interpretation of FCRs as well, and thus is an interesting question to pursue.</Paragraph>
      <Paragraph position="7"> Another alternative involves checking the FCRs at every point in the algorithm, throwing out any rules which violate them at any point. In addition, FCRs would be required to be checked during run-time as well. This alternative, though more direct, violates the spirit of the enterprise of giving a compilation from the complex Gazdar et al. formulation to a simpler system.</Paragraph>
      <Paragraph position="8"> A final problem concerns the ordering of the HFC and the CAP. The definitions of controller and controllee necessary for stating the CAP depend on the assignment of semantic types to constituents, which in turn depend on the configuration of features in the categories. We have already noted that the features pertinent to the definition of semantic type (and hence control) do not include instantiated foot features.</Paragraph>
      <Paragraph position="9"> Indeed, Gazdar et al. claim that "it is just HEAD feature specifications (other than those which are also FOOT feature specifications) and inherited FOOT feature specifications that determine the semantic types relevant to the definition of control." [2, p. 87] Unfortunately, the ordering we have given precludes instantiated head features from participating in the definition of semantic type and hence the CAP (footnote 13). It seems that the HFC must apply before the CAP for the definition of semantic type, but after the CAP so that the CAP instantiations of head features take precedence. Thus, our earlier claim of strict ordering may be falsified by this case.</Paragraph>
      <Paragraph position="10"> Of course, the set of features necessary for type determination and the set instantiated by the CAP may be disjoint. In this case, we can merely split the application of the HFC in two, instantiating the former class before the CAP and the latter class after the FFP as originally described. Alternatively, it might be possible to notate head features on the head constituent rather than the parent as is conventionally done. In this case, the information needed by the CAP is inherited, not instantiated, head feature values, and thus not subject to the ordering problem.</Paragraph>
      <Paragraph position="11"> On the other hand, if the sets are nondisjoint, this presents a problem not only for our algorithmic analysis, but for the definition of GPSG given by Gazdar et al. Suppose that the HFC determines types in such a way that the CAP is required to apply and instantiates head features, thereby overriding the original values (since the CAP takes precedence) and changing the type determination so that the CAP does not apply. We would thus require the CAP to apply if and only if it does not apply. This paradox appears as an ordering cycle in our algorithm; in the declarative definition of Gazdar et al., it would be manifested in the inadmissibility of all local sets of nodes [1], an equally unattractive effect. We leave the resolution of this problem open for the time being, merely noting that it is a difficulty for GPSG in general, and not only for our characterization.</Paragraph>
      <Paragraph position="12"> Footnote 12: For the technical background of such a move, see the discussion of PATR semantics [3].</Paragraph>
      <Paragraph position="13"> Footnote 13: I am indebted to Roger Evans and William Keller for pointing this problem out to me and for helpful discussion of solution alternatives.</Paragraph>
    </Section>
  </Section>
</Paper>