<?xml version="1.0" standalone="yes"?>
<Paper uid="C67-1015">
  <Title>Methods for Obtaining Corresponding Phrase Structure and Dependency Grammars</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
METHODS FOR OBTAINING CORRESPONDING PHRASE
STRUCTURE AND DEPENDENCY GRAMMARS
</SectionTitle>
    <Paragraph position="0"> ABSTRACT Two methods are given for converting grammars belonging to different systems. One converts a simple (context-free) phrase structure grammar (SPG) into a corresponding dependency grammar (DG); the other converts a DG into a corresponding SPG.</Paragraph>
    <Paragraph position="1"> The structures assigned to a string by a source grammar will correspond systematically, though asymmetrically, to those assigned by the target grammar resulting from its conversion. Since both systems are weakly equivalent, generating exactly the CF languages, the methods facilitate experimentation with either notation in devising rules for any CF language or any CF set of strings designed to undergo subsequent transformation.</Paragraph>
    <Paragraph position="2"> A source SPG is assumed to be of finite degree with ordered rules in which only the initial symbol is recursive. Unless the source grammars obey additional constraints, the target grammars may exhibit a peculiar property, defined as "structure sensitivity". The linguistic implications of the property are discussed, and the linguistic motivation for imposing the constraints necessary to avoid its appearance is suggested.</Paragraph>
    <Paragraph position="3"> The author owes an especial debt to Jesse Wright of the Automata Theory and Computability Group in the Mathematical Science Department of IBM Research for many helpful discussions of theoretical problems arising in the course of this investigation.</Paragraph>
    <Paragraph position="4"> In an article on dependency theory¹ appearing in 1964 [5], Hays remarked, "Casual examination suggests there would be little difference between transformation of dependency trees and transformation of IC [immediate constituent] structures, but no definite investigation has been undertaken." Since then, a dependency grammar with transformational rules has been designed for a subset of English sentences, and preliminary results indicate that Hays' observation is correct [9]. In either case, transformations are specified in terms of labeled trees, and the number of branches and the denotations of the labels do not affect the essential operation. Since the base grammar is generally limited to the generation of context-free pre-terminal strings of "deep structure", and since context-free languages are characterizable in either dependency or phrase structure (i.e., immediate constituent) systems, neither system is clearly preferable as a base. A linguist may find that the notation afforded by one or the other is simpler for characterizing some language, or for defining structures to be transformed, or for adapting a grammar to computer applications. A linguist may also wish to experiment with grammars of both types, or redesign transformations defined on the structures of one base grammar in order to incorporate them into a transformational grammar using a different base.</Paragraph>
    <Paragraph position="5"> Such considerations as these motivate the present treatment of the problem of obtaining paired grammars by converting a grammar of one kind into a systematically corresponding grammar of the other which generates the same sentences and assigns comparable structures. In addition, conversion draws attention to some linguistically significant relationships that may exist unnoticed among the categories and rules of the source grammar and which may induce in the derived grammar a peculiar property of structure sensitivity, roughly analogous to context sensitivity.</Paragraph>
    <Paragraph position="6"> This property will be exhibited and discussed in the course of illustrating the method.</Paragraph>
    <Paragraph position="7"> We begin concretely by inspecting (Fig. 1) a pair of grammars: SPG1, a simple or context-free phrase structure grammar of the kind formalized by Chomsky [2], Bar-Hillel [1], and others, and DG1, a dependency grammar of the kind formalized by Gaifman [4] and Hays [5]. Two structural diagrams, a P-tree and a D-tree drawn beneath the grammars, illustrate the structure each assigns.</Paragraph>
    <Paragraph position="8"> The rewriting rules of SPG1 are of two kinds, those in which only categories appear, and those in which a category is rewritten as a terminal (lexical formative or word). The latter may be separated from the former and made into assignment rules, [Footnote 1: For additional material on dependency theory, see Ref. 6.]</Paragraph>
  </Section>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
SPG1 DG1
</SectionTitle>
    <Paragraph position="0"> SPG1. Axiom: # S #. Rewriting Rules: 1. S → NP VP; 2. VP → V NP; 3. NP → D N; 4. D → the; 5. D → some; 6. N → boys; 7. N → girls; 8. V → like; 9. V → admire. (1a)
DG1. Axiom: * (V). Dependency Rules: 1. V (N * N); 2. N (D *); 3. D (*). Assignment Rules: 1. D: the, some; 2. N: boys, girls; 3. V: like, admire.
thereby increasing the resemblance of SPG1 to DG1 in an obvious way. This is possible because SPG1 does not contain any mixed rewriting rules in which both categories and lexical formatives appear on the right. It can be shown that any SPG which has mixed rules, in this sense, can be converted into one which does not merely by introducing new categories, without affecting generative power. Thus there is no reason for not separating the two types of rules, and Chomsky devotes a good part of Aspects of the Theory of Syntax [3] to giving linguistic reasons for just such a separation. Hereafter, "rewriting rules" will refer only to those like rules 1-3 in SPG1, and those like rules 4-9 will be called "assignment rules". Categories which do not appear on the left of any rewriting rules are terminal categories. With each category of SPG1, we associate a number called its degree. (To say that a category is of degree n means that n is the fixed upper limit to the number of nodes of the shortest path leading from it to a terminal category in any structure derived from it by successive rule applications. A category may be of infinite degree. For example, if X → XX, then X is of infinite degree and so is the grammar in which it occurs even though another rule rewrites X as a string containing only terminal categories.)¹
In SPG1 the terminal categories D, N, V are of zero degree, since they are not rewritten; the categories NP and VP are of degree 1, since at least one category in their replacement is of zero degree; and the degree of S is 2, since the least degree assigned to any category on the right of the rule for rewriting S is 1. The maximum number assigned to any category in a grammar is also the degree of the grammar; therefore, the degree of SPG1 is 2. An essential difference between SPG1 and DG1 now emerges more clearly. DG1 uses only terminal categories, while SPG1 uses categories of higher degree. The effects of the differences are reflected in the structure of the P-tree and D-tree. The latter has fewer branches, and this will always be the case for structures assigned to the same string by grammars belonging to the two different systems.</Paragraph>
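The degree computation described above can be sketched in Python. This is only an illustration under stated assumptions: the dictionary layout, the function name `degrees`, and the fixed-point iteration are choices made for this example and do not come from the paper; only the rewriting rules of SPG1 and the definition of degree are taken from the text.

```python
import math

# Rewriting rules of SPG1 (assignment rules omitted).
# Layout is an assumption: category mapped to a list of alternative right-hand sides.
SPG1 = {
    "S": [["NP", "VP"]],
    "VP": [["V", "NP"]],
    "NP": [["D", "N"]],
}

def degrees(rules):
    """Return a dict mapping each category to its degree (math.inf if unbounded)."""
    cats = set(rules)
    for alternatives in rules.values():
        for rhs in alternatives:
            cats.update(rhs)
    # Terminal categories are never rewritten, so their degree is 0.
    # Rewritten categories start at infinity and are lowered until stable.
    deg = {c: (math.inf if c in rules else 0) for c in cats}
    changed = True
    while changed:
        changed = False
        for cat, alternatives in rules.items():
            # Shortest path within a rule, upper limit across the rules for cat.
            new = max(1 + min(deg[sym] for sym in rhs) for rhs in alternatives)
            if new != deg[cat]:
                deg[cat] = new
                changed = True
    return deg

spg1_degrees = degrees(SPG1)
grammar_degree = max(spg1_degrees.values())
```

For SPG1 this assigns 0 to D, N and V, 1 to NP and VP, and 2 to S, so the grammar is of degree 2. A grammar containing X → X X keeps X at infinite degree even when another rule rewrites X as terminal categories only, as the text requires.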
    <Paragraph position="1"> Even so, there is a systematic correspondence between the two trees and their labels of a kind defined by Hays [5]. Every complete subtree of the D-tree, that is, every node taken together with all of its descendants, covers a substring of the sentence that is covered by a complete subtree of the P-tree. The converse does not hold, but every complete subtree of the P-tree covers a substring that is covered by a connected subtree of the D-tree. [Footnote 1: For a more precise characterization of "degree", see Gaifman [4].] In</Paragraph>
  </Section>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
</SectionTitle>
    <Paragraph position="0"> the example, both complete subtrees dominated by N in the D-tree correspond to the two complete subtrees of the P-tree dominated by NP, and the complete (sub)tree dominated by V corresponds to the complete (sub)tree dominated by S. However, the complete subtree dominated by VP in the P-tree corresponds to an incomplete subtree of the D-tree, which is dominated by V but includes only V and the branches to its right; so that the relationship of correspondence is asymmetrical.</Paragraph>
    <Paragraph position="1"> While such systematic correspondences exist between the structures assigned to all strings generated by SPG1 and DG1, this is not the general case. In general:¹ 1. Any context-free language can be generated by grammars of either simple phrase structure or dependency systems.</Paragraph>
    <Paragraph position="2">  2. For any given SPG, there exists one or more DG over the same language all of whose structures correspond systematically to structures assigned to the same strings by the SPG if and only if the SPG is of finite degree.</Paragraph>
    <Paragraph position="3"> 3. For any given DG there exists one or more corresponding SPG. 4. For any given DG there exists a unique SPG of degree 1 that is  strongly equivalent to it.</Paragraph>
    <Paragraph position="4"> Gaifman [4] gives a very general method for constructing corresponding DG from any SPG of finite degree, and also a method for constructing a unique SPG of degree 1 from any DG. Here we give two methods for constructing corresponding SPG and DG that differ from Gaifman's. The first applies only to a more restricted set of SPG and leads to reduced DG, whereas Gaifman's method tends to produce DG with overlapping categories and superfluous rules that may never be used to generate any string. The second allows construction of SPG of degree greater than 1 from certain DG.² A simplified sketch of the two methods follows. Each rewriting rule is "augmented" by starring a category on the right. Each dependency rule is augmented by assigning a numerical coefficient to each dependent. [Footnote 1: Proofs are given by Gaifman [4]. It is assumed that the SPG is non-erasing and reduced; that is, no category is rewritten as null, and there are no superfluous categories or rules.] Figure 2 shows a possible augmented</Paragraph>
    <Paragraph position="5"> Footnote 2: The second method was suggested by Kay's procedure for constructing P-trees from D-trees [7]. More precisely, an SPG of degree n &gt; 1 may be derived if, in the DG, some categories govern two or more dependents, and the left- or rightmost dependent itself governs dependents.</Paragraph>
    <Paragraph position="6"> form of SPG1 on the left and a possible augmented form of DG1 on the right. (Detailed consideration of the problem of augmentation will follow the sketch of the operations.) These augmentations were deliberately chosen so that conversion of either grammar uniquely produces the other. Different augmentations produce different results, although structures of the original grammar and of the grammar derived from it still correspond.</Paragraph>
    <Paragraph position="7">  The columns TS and IRL in Figure 2 represent a Table of Substitutes and an Intermediate Rule List. In the conversion of SPG1, the TS is constructed, equating each category on the left of a rule with a superscripted terminal category. The numerical superscript (hereafter called the exponent) equals the number of rules traced through when tracing by starred categories before a starred terminal category is reached, and expresses the distance between the terminal category and the category for which it is a substitute. In the IRL, the categories occurring in each rule of the SPG are replaced by their substitutes from TS. Taking the first IRL rule, construction of a dependency rule is begun by replacing the arrow with parentheses enclosing the substitute categories on the right. Thus from the first IRL rule in Figure 2 we obtain V^2 (N^1 V^1*). The construction of a dependency rule is continued so long as the exponent of the starred category X^s* is greater than zero. The next step is to search the IRL for a rule with X^s on the left. The categories on the right of the new rule are inserted in the parentheses of the D rule under construction, in the position of X^s*, but no new parentheses are added. When the starred occurrence is X^0*, it is replaced by *, all exponents are erased, and the dependency rule is complete. The process is repeated for new dependency rules until all IRL rules are exhausted.</Paragraph>
    <Paragraph position="8"> We must add rules to the constructed DG for any unstarred terminal category of the IRL. The added rules assign no dependents and are of the form D(*). The assignment rules are simply transferred, and since V is the substitute for the axiom category S, it is taken as the axiom for the DG.</Paragraph>
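The sketch of the first method can be illustrated in Python for the simple case of SPG1. This is a hedged sketch, not the full Method 1 given later: it assumes one rule per category, exactly one starred category per rule, and rules listed in order from the axiom down. The starring shown (VP in rule 1, V in rule 2, N in rule 3) is the one implied by the intermediate rule V^2 (N^1 V^1*); the data layout and the names `AUGMENTED_SPG1`, `spg_to_dg`, `head` and `expand` are hypothetical.

```python
# Augmented SPG1: each right-hand side is a list of (category, starred) pairs.
AUGMENTED_SPG1 = [
    ("S", [("NP", False), ("VP", True)]),
    ("VP", [("V", True), ("NP", False)]),
    ("NP", [("D", False), ("N", True)]),
]

def spg_to_dg(rules, axiom="S"):
    table = dict(rules)  # assumption: one rewriting rule per category

    def head(cat):
        # Table of Substitutes: follow starred categories down to a terminal category.
        while cat in table:
            cat = next(sym for sym, starred in table[cat] if starred)
        return cat

    def expand(cat, done):
        # Splice the rules reached through starred categories into one dependent list.
        done.add(cat)
        out = []
        for sym, starred in table[cat]:
            if not starred:
                out.append(head(sym))
            elif sym in table:
                out.extend(expand(sym, done))
            else:
                out.append("*")
        return out

    done, dg = set(), {}
    for cat, _ in rules:
        if cat not in done:
            dg[head(cat)] = expand(cat, done)
    # Terminal categories that never received a rule govern nothing: X(*).
    for _, rhs in rules:
        for sym, _ in rhs:
            if sym not in table and sym not in dg:
                dg[sym] = ["*"]
    return head(axiom), dg

dg_axiom, dg_rules = spg_to_dg(AUGMENTED_SPG1)
```

Under these assumptions the result is the axiom V and the rules V (N * N), N (D *), D (*), which are the dependency rules of DG1. The assignment rules would simply be carried over unchanged.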
    <Paragraph position="9"> Going in the other direction from an augmented DG1 to SPG1, we first assign an exponent s to each category, where s equals the largest coefficient of any dependent of that category. If a category is not assigned dependents, its exponent is zero. The first rule of augmented DG1 now appears as V^2 (2N^1 * 1N^1).</Paragraph>
    <Paragraph position="10"> From this rule two rules are constructed for the IRL; that is, the number of IRL rules constructed from each rule of DG1 will be equal to the exponent of the governor. The first rule so obtained writes the governor with its exponent on the left of the arrow. All of its dependents whose coefficients equal that of the exponent are written in order on the right, and the governing category with its exponent decreased by 1 is written with the *. Thus we obtain V^2 → N^1 V^1*. The second rule is obtained in a similar fashion, with the exponent of the governing category diminished by 1, yielding V^1 → V^0* N^1. The process is repeated until an IRL rule rewriting some category with exponent equal to 1 is used, after which the next DG rule is processed, and so on until all DG rules are exhausted.</Paragraph>
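The construction of the Intermediate Rule List from an augmented DG can be sketched as follows. The sketch assumes the augmentation of DG1 implied by the rule V^2 (2N^1 * 1N^1), with the single dependent of N given coefficient 1; the tuple layout and the names `AUGMENTED_DG1` and `dg_to_irl` are hypothetical and serve only this illustration.

```python
# Augmented DG1: each dependent is a (coefficient, category) pair; "*" marks the governor.
AUGMENTED_DG1 = {
    "V": [(2, "N"), "*", (1, "N")],
    "N": [(1, "D"), "*"],
    "D": ["*"],
}

def dg_to_irl(dg):
    # Exponent of a category: largest coefficient among its dependents, 0 if it has none.
    exponent = {
        gov: max([item[0] for item in deps if item != "*"], default=0)
        for gov, deps in dg.items()
    }
    irl = []
    for gov, deps in dg.items():
        # One IRL rule per exponent value, from the highest down to 1.
        for level in range(exponent[gov], 0, -1):
            rhs = []
            for item in deps:
                if item == "*":
                    # The governor with its exponent lowered by 1 carries the star.
                    rhs.append((gov, level - 1, True))
                elif item[0] == level:
                    # Dependents whose coefficient equals the current exponent.
                    rhs.append((item[1], exponent[item[1]], False))
            irl.append(((gov, level), rhs))
    return irl

irl_rules = dg_to_irl(AUGMENTED_DG1)
```

Each right-hand item is a triple (category, exponent, starred). For DG1 the list contains V^2 → N^1 V^1*, then V^1 → V^0* N^1, then N^1 → D^0 N^0*; relabeling by the Table of Substitutes would then recover the rewriting rules of SPG1.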
    <Paragraph position="11"> At this point, the *'s may be erased and, except for category labels, the IRL is exactly equivalent to the rewriting rules of the original unaugmented SPG. The only function the TS serves is to reassign labels. Assignment rules are transferred and the substitute for the axiom of DG1 is added.¹ SPG1 and DG1 are very simple, with no embeddings and no optional rules. More complicated grammars give rise to problems of augmentation, especially for SPG. Even SPG1 poses a problem.</Paragraph>
    <Paragraph position="12"> Assume it had been augmented by starring NP in rule 1 and in rule 2. In that case, the same substitute, N^2, is assigned to both S and VP, and the procedure produces a DG that is not even weakly equivalent to the original. Clearly some constraints must be imposed on augmentation and provision made for grammars in which it may not be possible to avoid starring a category more than once. Similarly, assume that some DG has a rule of the form</Paragraph>
    <Paragraph position="14"> But now the rules generate the sequence B A C D, which was not generated by the original rule, and do not generate A B C D, which was. This is remedied by requiring that if a coefficient n has been assigned to a dependent, no higher number is assigned to any dependent which intervenes between it and the *. [Footnote 1: Although grammars with only one axiom are illustrated, grammars with more than one axiom can obviously be handled as well.] We will also require that at least one dependent be preceded by 1, and if any dependent is preceded by n &gt; 1, there must be at least one dependent</Paragraph>
    <Paragraph position="15"> preceded by n - 1. This is not crucial, but it avoids setting up unnecessary, single-branching categories in the derived SPG. Note that if all dependents in any rule are preceded by 1, which is the same as not augmenting the DG at all, the resulting SPG will be of degree 1; that is, each rule of the grammar will contain at least one terminal category on the right. This is equivalent to Gaifman's procedure [4].</Paragraph>
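The two requirements on coefficients can be checked mechanically. The following Python sketch is an illustration only; the rule layout (a list of coefficient and category pairs around a "*") and the name `valid_coefficients` are assumptions made for the example.

```python
def valid_coefficients(deps):
    """Check one augmented dependency rule body, e.g. [(2, "N"), "*", (1, "N")]."""
    star = deps.index("*")
    left = [coef for coef, _ in deps[:star]]
    right = [coef for coef, _ in deps[star + 1:]]
    # No higher coefficient may stand between a dependent and the star,
    # so coefficients fall toward the star on the left and rise away from it on the right.
    ordered = left == sorted(left, reverse=True) and right == sorted(right)
    # Coefficients must run 1, 2, ..., n with no gaps.
    used = set(left + right)
    gapless = used == set(range(1, len(used) + 1))
    return ordered and gapless

ok_rule = valid_coefficients([(2, "N"), "*", (1, "N")])
bad_order = valid_coefficients([(1, "A"), (2, "B"), "*"])
bad_gap = valid_coefficients([(3, "A"), "*", (1, "B")])
```

The first rule is accepted. The second is rejected because a dependent with coefficient 2 stands between the dependent with coefficient 1 and the star, and the third because coefficient 2 is skipped. A rule with no dependents is accepted trivially.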
    <Paragraph position="16"> Augmentation of SPG is the more difficult case. Primarily it is the problem of constructing the TS in a finite number of steps. For example, if for S → NP VP S the S is starred on the right, an infinite loop is created immediately. This is easy to avoid when considering a single rule or a small set of rules, but we do not know in general whether some series of rule augmentations may not lead to the same situation. Gaifman's solution is to require that the marked category be of lesser degree than the category on the left, but this not only leads to a proliferation of categories in the derived DG whenever the SPG has more than one rewriting rule for any of its categories, it prevents us from deriving the simplest corresponding DG in some cases.¹ On the other hand, the method employed here will not work unless some restrictions are imposed on augmentation which also imply restrictions on the form of the SPG in addition to the requirement of finite degree. It is not clear what restrictions are minimally necessary, but it is sufficient to require, in addition to finiteness of degree, that the rules of the SPG be ordered, so that in a developing derivation, if rule n has been applied, no rule m, m &lt; n, need be applied thereafter. This is too restrictive, and does not allow for full recursion. We may, however, allow a dummy symbol, S', to appear in any rule rewriting some X as a string containing some Y, S' ≠ Y ≠ X, where S' is replaced after one application of the set of rules by # S #, the axiom, so that the rules then reapply in linear order.</Paragraph>
    <Paragraph position="17"> Any SPG of finite degree with ordered rules providing for recursive application in the manner specified can be converted to a corresponding DG by the method given here. The base component proposed by Chomsky² [3] for transformational grammar is an SPG of this form.</Paragraph>
    <Paragraph position="18"> Footnote 1: "Simplest" with respect to process, number of categories, and freedom from the property of structure sensitivity. See p. 14 ff.</Paragraph>
    <Paragraph position="19"> Footnote 2: Chomsky does not explicitly require that if S' appears in rewriting X, some Y (S' ≠ Y ≠ X) appear in the rule also, but implicitly observes the restriction. A new grammar for English by Rosenbaum [10] contains a rule NP → NP S' which does not observe it. This is the only case known to me of an actual grammar for some "real" language that violates this restriction, and Rosenbaum does not claim any strong theoretical motivation for the rule in question. Cf. also Lees [8], and the MITRE grammar [11].</Paragraph>
    <Paragraph position="20"> We turn now to more detailed consideration of augmenting and converting SPG subject to the above constraints. While the constraints ensure that the SPG can be workably converted, some linguistically significant problems arise if we consider how to derive a simple DG, and it will be shown that under some conditions the derived DG will exhibit a feature not heretofore considered in the literature, a feature of "structure sensitivity".</Paragraph>
    <Paragraph position="21"> Method 1. Conversion of SPG to DG. Step 1. Augmentation. All rules of the SPG which rewrite a given category are conflated and written as one schematized rule, with square brackets enclosing optionally omitted categories and braces enclosing lists of categories from which a single choice is made.¹ Thus Z → [W] {X, Y} conflates four rules, and will, in any given application, rewrite Z as either WX or WY or X or Y.</Paragraph>
    <Paragraph position="22"> Beginning with the first rule rewriting some X and proceeding in rule order, star occurrences of categories, excluding the dummy symbol S', on the right in such a way that one and only one starred category Y*, where Y ≠ X, will occur in any application of the rule.² It follows that no bracketed (omissible) occurrence may be starred. If more than one category is starred, the schematized rule must be separated into as many rules as there are starred categories, with one starred category appearing in each. For example,</Paragraph>
    <Paragraph position="24"> linguistic reasons all ways of rewriting a category will usually have some element in common and some conflation will usually be preserved.</Paragraph>
    <Paragraph position="25"> Footnote 1: Square brackets are used here rather than the customary parentheses to avoid confusion with the parentheses used in the derived DG. More exactly, [ ] and { } enclose lists of strings of categories. In the case of [ ] the list may consist of one member, and in both cases the strings may consist of one item.</Paragraph>
    <Paragraph position="26"> Footnote 2: Since the SPG must be finite, no rule will rewrite X as a string of X's in any application, and some other category of degree less than X will be available for starring in all cases.</Paragraph>
    <Paragraph position="27"> If the simplest DG corresponding to the SPG is desired, then it is preferable to augment the SPG in such a way that a) no category occurs both starred and non-starred on the right, and b) no category is starred on the right of two or more rules which rewrite different categories.</Paragraph>
    <Paragraph position="28"> But it is not always possible to observe these policies, and in that case, additional augmentation is necessary in order to distinguish among occurrences of X as the marked constituent in the rule rewriting Y, X as the marked constituent in the rule rewriting Z, Z ≠ Y, and X as an unmarked constituent of any category.</Paragraph>
    <Paragraph position="29"> Assume some category X is starred in two or more rules which rewrite different categories,¹ say Y and Z, and also occurs unstarred on the right in some rules. The two X*'s are assigned different subscripts to avoid assigning the same substitutes for Y and Z with the consequent loss of essential information. We now have three varieties of X, namely: X, X_1, and X_2, where X is the unstarred variety, X_1 is the marked constituent of Y, and X_2 of Z. (If X does not occur unstarred, only X_1 and X_2 are needed.) If X is not a terminal category, there is a rule rewriting X. Then we must add, beneath that rule, rules for rewriting X_1 and X_2. If</Paragraph>
    <Paragraph position="31"> writing U_1 and U_2 beneath the rule rewriting U. Thus the process of adding rules is iterative, but it will eventually end when, in some lower rule, a terminal category is starred.</Paragraph>
    <Paragraph position="32"> Note that in some cases sub-subscripts are needed. For example: 1. Z → ... P_1* ... 2. X → ... R_1* ...</Paragraph>
    <Paragraph position="33"> 3. Y → ... R_2* ...</Paragraph>
    <Paragraph position="34"> 4. R → ... P_2* ... 5. P → ... A* ...</Paragraph>
    <Paragraph position="36"> more than once in rules rewriting the same cate-</Paragraph>
    <Paragraph position="38"> , which produces three rules for Y, in two of which X* occurs.</Paragraph>
    <Paragraph position="39"> Proceeding down the rules, we see first that P_1* occurs and, scanning down the left, that P's are not terminal categories and that there is no rule rewriting P_1. Therefore, beneath P → ... A* ...</Paragraph>
    <Paragraph position="40"> we add P_1 → ... A_1* ...</Paragraph>
    <Paragraph position="41"> In the second rule, R_1* occurs, is non-terminal, and requires  1. Z → ... P_1* ... 2. X → ... R_1* ...</Paragraph>
    <Paragraph position="42"> 3. Y → ... R_2* ...</Paragraph>
    <Paragraph position="43"> 4. R → ... P_2* ... 5. R_1 → ... P_21* ... 6. R_2 → ... P_22* ... 7. P → ... A* ...</Paragraph>
    <Paragraph position="44"> 8. P_1 → ... A_1* ... 9. P_2 → ... A_2* ...</Paragraph>
    <Paragraph position="45">  But in rule 7, A* requires no additions, because A is not subscripted, and the subscripted A's in rules 8-11 require no additions because A's are terminal categories. As a result, no X_i* occurs on the right of two or more rules which rewrite different categories. This process and the remaining steps will be illustrated using SPG2. In order to show the effects of different choices</Paragraph>
    <Paragraph position="47"> in augmentation, SPG2 will be augmented in a way that deliberately violates the policies advocated for starring (but not the restrictions).</Paragraph>
    <Paragraph position="48">  We will assume assignments are the same as for SPG1 with the additional assignments of "send" and "give" to V, and "books" and "flowers" to N.¹ Note that, in this augmentation, NP is starred twice and also occurs non-starred on the right. Subscripting and duplication are required, resulting in  1. S → NP_1* VP 2. VP → V NP_2* [NP] 3. NP → D N* 4. NP_1 → D N_1* 5. NP_2 → D N_2*  Step 2. Establish a table of substitutes (TS) of 2 columns and n rows, where n equals the number of rules in the SPG (after step 1). List categories on the left in column 1, and starred categories on the right in column 2, in order. Eliminate any duplicate rows. (Cf. p. 9, footnote 1.) At the end of step 2, applied to SPG2, the result is  complex form than that given here would presumably block the generation of un-English sentences.</Paragraph>
    <Paragraph position="49"> Step 3. Starting with k = 1, try to match the category in row k, column 2 (k, 2) with a category occurring in column 1 of a lower row. If a match is found on row m, check to see if a match also occurs on m + 1. (This will be the case if the SPG contains more than one rewriting rule for the category in (k, 2).) If it does, mark m as a branching point, insert a duplicate of row k beneath row k (the duplicate will be row k + 1) and follow separate branches for substitutes for (k, 1) and (k + 1, 1). Replace the category in (k, 2) with the category  in (m, 2) and repeat the search on the remaining lower rows. When the search is exhausted, assign an exponent s to the last category obtained (the final substitute) in column 2, where s equals the number of matches plus 1. Increment k and repeat until every category in column 1 has a substitute that does not appear in column 1. At the end of step 3, we obtain a unique substitute for every rewritten category of SPG2.¹  each category in the augmented SPG rules with its substitutes from TS. If a category has no substitute, superscript it with a zero. If X_i has more than one substitute, include them in braces wherever X_i appears on the right. If a category X occurring on the left has more than one substitute, provide a separate rewriting rule for each substitute Y^s such that Y_n^s → ... Y_n^(s-1)* ...² [Footnote 1: In cases, not illustrated here, where several substitutes are found because several rules rewrite some X_i, each substitute will be uniquely assigned to X_i.] If a non-starred category on the right</Paragraph>
    <Paragraph position="51"> has more than one substitute, include all substitutes as braced options. At the end of step 4, we obtain</Paragraph>
    <Paragraph position="53"> Step 5. Take the first unmarked IRL rule, which rewrites some X_n^s, set s as a counter, and write X_n followed by a pair of parentheses enclosing the string of categories on the right of the IRL rule. Note that the string will contain an X_n^(s-1)*. Step 6. Mark the previously processed IRL rule, decrease s by 1, and test for s = 0. If so, go to step 7. Otherwise, find the rule which rewrites the new X_n^s. Replace the starred X_n^s* in the current D rule with the categories on the right of  the IRL rule. Repeat step 6.</Paragraph>
    <Paragraph position="54"> Step 7. Test to see if any unmarked rules remain in the IRL. If they do, return to step 5. If not, a. Erase starred categories, leaving only the star.</Paragraph>
    <Paragraph position="55"> b. Add rules of the form X(*) for any non-starred terminal category of the IRL.</Paragraph>
    <Paragraph position="56"> c. Add as axiom(s) of the DG the substitute(s) of the axiom(s) of the SPG.</Paragraph>
    <Paragraph position="57"> d. Add the assignment rules of the SPG. If a category is subscripted in the DG, the assignments are duplicated for each subscripted variant.</Paragraph>
    <Paragraph position="58"> e. Erase exponents.</Paragraph>
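Steps 5 to 7 can be sketched in Python. The IRL used below is not taken from the source, whose table for SPG2 is not reproduced here; it is a reconstruction assumed from the augmented rules and the prose description, with N1 and N2 standing for the subscripted categories. The tuple layout and the name `build_dependency_rules` are likewise hypothetical. Axioms and assignment rules (steps 7c and 7d) are left out.

```python
# Assumed IRL for the augmented SPG2 (a reconstruction, not the source's table).
# Each right-hand item is (category, exponent, starred, optional).
IRL = {
    ("N1", 2): [("N1", 1, True, False), ("N2", 2, False, False)],
    ("N2", 2): [("V", 0, False, False), ("N2", 1, True, False), ("N", 1, False, True)],
    ("N", 1): [("D", 0, False, False), ("N", 0, True, False)],
    ("N1", 1): [("D", 0, False, False), ("N1", 0, True, False)],
    ("N2", 1): [("D", 0, False, False), ("N2", 0, True, False)],
}

def build_dependency_rules(irl):
    marked, dg = set(), {}
    for left in irl:
        if left in marked:
            continue
        # Step 5: take the first unmarked IRL rule and copy its right-hand side.
        cat, s = left
        body = list(irl[left])
        # Step 6: mark the rule, lower s, and splice in rules until s reaches 0.
        marked.add(left)
        s -= 1
        while s != 0:
            i = next(k for k, item in enumerate(body) if item[2])
            marked.add((cat, s))
            body[i:i + 1] = irl[(cat, s)]
            s -= 1
        # Steps 7a and 7e: keep only the star, drop exponents, bracket optional items.
        dg[cat] = ["*" if star else ("[" + c + "]" if opt else c)
                   for c, _, star, opt in body]
    # Step 7b: non-starred terminal categories without a rule govern nothing.
    for body in irl.values():
        for c, e, star, _ in body:
            if e == 0 and not star and c not in dg:
                dg[c] = ["*"]
    return dg

dg2 = build_dependency_rules(IRL)
```

Under these assumptions the derived rules are N1 (D * N2), N2 (V D * [N]), N (D *), V (*) and D (*), in which the three varieties of N govern different sets of dependents.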
    <Paragraph position="59">  If we interpret each distinct X_n as a separate category, then the same list of words is assigned to the three categories, N, N_1, and N_2, by the assignment rules. If we interpret them as the same category, then the subscripts distinguish the different substructures of a sentence in which the N's occur. Each N governs a D directly on the left, but if it is the axiom it is required to govern another N on the right also. If N is not the axiom but is governed directly by the axiom, it may govern another N on the right, and is required to govern a V on the left. If it is neither the axiom nor a direct dependent of the axiom, it governs nothing but the D. We may not erase the subscripts and write a single conflated rule N ([V] D * [N]), for then strings not generated by SPG2 would be generated. (For example, the rule would generate an infinite set of strings, (D N)^n.) In such circumstances we say that the DG is structure sensitive.</Paragraph>
    <Paragraph position="60"> Definition 1. A DG is structure sensitive if a. the set of terminals assigned to one category is identical to the set assigned to any other category, and/or b. any rule restricts the choice of dependents a category may govern to a subset of the ordered dependents it is permitted to govern in some other rule.</Paragraph>
    <Paragraph position="61"> Note that a DG containing the rules A (*) and A (B * C) is structure sensitive by this definition. Here, too, conflation is impossible, since A ([B] * [C]) allows A to govern B without governing C.</Paragraph>
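Definition 1 can be turned into a rough mechanical test. The Python sketch below rests on two interpretive assumptions that go beyond the text: clause b is applied only to rules with the same governor label, and "a subset of the ordered dependents" is read as an order-preserving subsequence. The names `is_subsequence` and `structure_sensitive` and the data layout are hypothetical.

```python
def is_subsequence(short, long):
    # True if the items of short occur in long in the same order.
    it = iter(long)
    return all(item in it for item in short)

def structure_sensitive(rules, assignments):
    """rules: list of (governor, body) pairs; assignments: category mapped to a set of words."""
    cats = list(assignments)
    # Clause a: two categories are assigned identical sets of terminals.
    same_words = any(
        assignments[a] == assignments[b]
        for i, a in enumerate(cats) for b in cats[i + 1:]
    )
    # Clause b: one rule for a category is a restriction of another rule for it.
    restricted = any(
        r1 != r2 and r1[0] == r2[0] and is_subsequence(r1[1], r2[1])
        for r1 in rules for r2 in rules
    )
    return same_words or restricted

example = structure_sensitive(
    [("A", ["*"]), ("A", ["B", "*", "C"])],
    {"A": {"a"}, "B": {"b"}, "C": {"c"}},
)
dg1 = structure_sensitive(
    [("V", ["N", "*", "N"]), ("N", ["D", "*"]), ("D", ["*"])],
    {"D": {"the", "some"}, "N": {"boys", "girls"}, "V": {"like", "admire"}},
)
```

With these readings, the grammar containing A (*) and A (B * C) is reported as structure sensitive, while DG1 is not.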
    <Paragraph position="62"> This structure-sensitive feature of some DG apparently serves functions llke those served by the context-sensitive feature of context-sensitive phrase structure grammars in placing restrictions on the string generated, but there seems to be no mention of it in the literature. Its character may be masked by the freedom to set up categories and assign the same terminals to them. i For example, one may substitute simple symbols X and Y for the complex symbols N i and N 2, and obtain the rules</Paragraph>
    <Paragraph position="64"> iGaifrnan's method \[4\] for converting SPG to DG makes great use of this freedam.</Paragraph>
    <Paragraph position="65">  In this case, only the assignment rules, assigning exactly the same set of terminals to X, Y, and N, explicitly show the structure sensitivity,' although the fact that N governs a subset of the dependents of X and Y is significant.</Paragraph>
    <Paragraph position="66"> Such arbitrariness in assigning symbols raises the significant linguistic problem of criteria for establi.shing categories, which is too large an issue to be discussed here, The p.roblem is no less relevant to SPG, Definition Z. An SPG is structure sensitive if a. the set of terminals assigned to one terminal category is * identical to the set assigned to any other, and/or b. any rewriting rule does not contain, on the right, a unique category (other than the axiom) which occurs only once in that rule and appears on the right in no other rule.</Paragraph>
    <Paragraph position="67"> The linguistic implications of the property may be clarified by considering two sets of rules, one for the artificial language a^n b a^n and one for a fragment of English.</Paragraph>
    <Paragraph position="68"> The language a^n b a^n is generated by the rules S → A S A, S → B, A → a, and B → b. The first rule "does not contain a unique category (other than the axiom) which occurs only once", since A occurs twice. One of the A's must be starred in converting to a DG and the DG will distinguish two categories of a, a left A and a right A. Note that the same language is generated if the first rule is S → A S and a transformational rule X B =&gt; X B X is added. In this case, each rewriting rule contains a unique category other than the axiom, and a structure-free set of dependency rules is obtainable from them.</Paragraph>
    <Paragraph position="69"> A less artificial but still extreme case of structure sensitivity is that in which two or more rules are rewritten in exactly the same way. Assume that an SPG for English has the following rewriting rules:</Paragraph>
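    <Paragraph> Clause b of Definition 2 can be checked mechanically. The following Python sketch is illustrative only and is not part of the methods described here. It assumes that the rewriting rules are supplied as (left, right) pairs with assignment rules such as A → a left out; the function names head_categories and is_structure_free are hypothetical. It tests clause b alone and ignores clause a.

```python
from collections import Counter


def head_categories(rules, axiom):
    """For each rule index, list the right-hand categories that satisfy clause b."""
    heads = {}
    for i, (_, rhs) in enumerate(rules):
        counts = Counter(rhs)
        # Categories appearing on the right of any OTHER rule.
        elsewhere = {c for j, (_, other) in enumerate(rules) if j != i for c in other}
        heads[i] = [c for c in rhs
                    if c != axiom and counts[c] == 1 and c not in elsewhere]
    return heads


def is_structure_free(rules, axiom):
    """True if every rule has at least one qualifying category (clause b only)."""
    return all(found for found in head_categories(rules, axiom).values())


# The two rule sets for a^n b a^n discussed in the text.
mirrored = [("S", ("A", "S", "A")), ("S", ("B",))]
one_sided = [("S", ("A", "S")), ("S", ("B",))]
print(is_structure_free(mirrored, "S"), is_structure_free(one_sided, "S"))
```

Under these assumptions the rule S → A S A yields no qualifying category, because A occurs twice and S is the axiom, whereas S → A S yields A.</Paragraph>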
  </Section>
  <Section position="4" start_page="0" end_page="17" type="metho">
    <SectionTitle>
S → X VP
VP → V Y
X → D N
Y → D N
</SectionTitle>
    <Paragraph position="0"> Here the distinction drawn between a sequence D N that is an X (i.e., derives from X) and a D N that is a Y reflects the functional notions of subject and object, but obscures the categorial notion "is a noun phrase". Chomsky [3, pp. 68-72] argues that it is confusing and redundant to assign categorial status to both notions since the purely relational character of the functional notion is implicit in the rewriting rules S → NP VP and VP → V [NP]; that is, the notion "subject of a sentence" refers to the NP under the immediate domination of S, and "object of a verb" refers to the NP under the immediate domination of VP. Chomsky also shows that for sentences like "John was persuaded by Bill to leave" where "John" is simultaneously object-of "persuade" and subject-of a transformed embedded sentence "John leave", it is impossible to represent such functional notions by categorial assignments, and adds that "Examples of this sort, of course, provide the primary motivation and empirical justification for the theory of transformational grammar." [p. 70].</Paragraph>
    <Paragraph position="1"> Whether it is possible or desirable to require that SPG components of transformational grammars for natural languages be structure free is an open question. Presumably, it is desirable if it is possible, since the least powerful, most restricted grammar -- the tightest fit -- is to be preferred. Moreover, inspection of proposed grammars, for English at least, indicates that most of their rules do contain unique categories on the right.</Paragraph>
    <Paragraph position="2"> Returning now to SPG2, we see that it is structure free, since every elementary rewriting rule for every category X_i contains a single occurrence of at least one category Y_i on the right which does not appear elsewhere on the right, and is not the axiom.</Paragraph>
    <Paragraph position="3"> Under these conditions we will say that Y_i is a head for category X_i and call all such Y's head categories. Examination shows that the structure-sensitive property of DG2 arose from the choices made in augmenting SPG2. If only head categories are marked, a structure-free DG similar to DG1 results, in which all signs of augmentation can be erased without altering its generative capacity.</Paragraph>
    <Paragraph position="4"> Intuitively, it seems reasonable to regard heads as sources of a "governor" in any string derivable from the category in whose rewriting rules they appear. This does not mean that the string is to be considered endocentric in the strong sense of requiring that the governor be substitutable for the entire string without loss of grammaticality, and the objection sometimes raised that dependency theory forces a purely endocentric analysis of a language is based on failure to distinguish between "head of" and "substitutable for". It appears truer to say that dependency analysis assumes that one phrase type is distinguished from another primarily by the singular presence of some category in it rather than by co-occurrence and order of categories in it.</Paragraph>
    <Paragraph position="5"> Incidentally, aside from the problem of obtaining a structure-free DG from an SPG, avoidance of structure sensitivity may be a criterion for assigning government when one is analyzing a language in terms of dependency theory. In English, for example, the choice has generally wavered between noun and verb as candidates for sentence government. Since every elementary sentence contains one verb but may contain several nouns, choosing the noun forces a structure-sensitive DG.</Paragraph>
    <Paragraph position="6"> In converting a DG to an SPG, no requirement of intrinsic ordering needs to be imposed on the dependency rules, as it was on the rewriting rules of SPG. Dependency rules may always be partially ordered by starting with the rule (or rules) for the axiom(s). Call these "level zero rules". Then level 1 rules are those which assign dependents to the axiom, level 2 those which assign dependents to categories which make their first appearance in level 1 rules, and so on. To ensure eventual termination, however, it is required that if X occurs anywhere in a level n rule, and is a dependent in a level m rule, m &gt; n, then its choice as a dependent in the level m rule is optional or else the governor in the level m rule is optionally chosen at some point. Otherwise no constraints on recursion are necessary, and any category may be reintroduced at a lower level.</Paragraph>
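    <Paragraph> The partial ordering by levels can be sketched as a breadth-first pass over the rules. The Python below is a minimal illustration under stated assumptions: each dependency rule is reduced to a (governor, dependents) pair, optionality is not represented, and the axioms argument stands in for the level zero rules. The function name rule_levels and the sample grammar are hypothetical and are not taken from DG1 to DG4.

```python
def rule_levels(axioms, rules):
    """Assign a level to each (governor, dependents) rule, keyed by rule index."""
    levels = {}
    frontier = list(axioms)   # categories that first appeared at the previous level
    seen = set(axioms)
    level = 1                 # level 0 is reserved for the axiom rules themselves
    while frontier:
        next_frontier = []
        for i, (governor, dependents) in enumerate(rules):
            if governor in frontier and i not in levels:
                levels[i] = level
                for d in dependents:
                    if d not in seen:      # first appearance of this category
                        seen.add(d)
                        next_frontier.append(d)
        frontier = next_frontier
        level += 1
    return levels


# Hypothetical DG: A (B * C), B (*), C (* D), D (* A)
sample = [("A", ("B", "C")), ("B", ()), ("C", ("D",)), ("D", ("A",))]
print(rule_levels(["A"], sample))
```

In the sample, A is reintroduced as a dependent of D at a lower level; the termination requirement stated above would demand that this choice be optional, which the sketch does not check.</Paragraph>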
    <Paragraph position="7"> By contrast, the problem of conflation arises. It has been shown that rules like A (*) and A (B * C), occurring in structure-sensitive DG, cannot be conflated. Conflations of A (B *) with A (C * D) and of A (* B) with A (B *) are also impossible, although structure sensitivity as defined above is not involved. If the rules are not conflatable because the DG is structure sensitive, then the SPG may also be structure sensitive. If they are not conflatable solely because of disparate number or position (left or right) with respect to the governor, the SPG will be structure free.</Paragraph>
    <Paragraph position="8"> In either case, the conversion process becomes somewhat more complicated.</Paragraph>
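    <Paragraph> The contrast between the two sources of non-conflatability can be illustrated with a small Python sketch of clause b of Definition 1. It rests on one reading of that clause, namely that a rule is restrictive when its left and right dependents are ordered subsets of the left and right dependents of another rule for the same governor. Rules are assumed to be fully expanded triples (governor, left dependents, right dependents) with no optional elements; the names is_subsequence, restricts and sensitive_pairs are hypothetical.

```python
def is_subsequence(short, long):
    """True if the items of short occur in long in the same order."""
    remaining = iter(long)
    return all(item in remaining for item in short)


def restricts(rule_a, rule_b):
    """True if rule_a limits the governor to an ordered subset of rule_b's dependents."""
    (gov_a, left_a, right_a), (gov_b, left_b, right_b) = rule_a, rule_b
    return (gov_a == gov_b
            and (left_a, right_a) != (left_b, right_b)
            and is_subsequence(left_a, left_b)
            and is_subsequence(right_a, right_b))


def sensitive_pairs(rules):
    """All ordered pairs of rules in which the first restricts the second."""
    return [(a, b) for a in rules for b in rules if restricts(a, b)]


# A (*) with A (B * C): structure sensitive under this reading.
print(sensitive_pairs([("A", (), ()), ("A", ("B",), ("C",))]))
# A (B *) with A (C * D), and A (* B) with A (B *): not conflatable, but no pair is found.
print(sensitive_pairs([("A", ("B",), ()), ("A", ("C",), ("D",))]))
print(sensitive_pairs([("A", (), ("B",)), ("A", ("B",), ())]))
```

Treating left and right dependents separately is a design choice made so that A (* B) and A (B *) are not reported, in line with the remark that structure sensitivity is not involved in that case.</Paragraph>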
    <Paragraph position="9"> The process will be illustrated first by applying it to a DG whose rules cannot be fully conflated both because some are structure sensitive and because some assign disparate numbers or positions to dependents of a category.</Paragraph>
    <Paragraph position="10"> Method 2. Conversion of DG to SPG Step 1. Augmentation Definition: A dependent element in a dependency rule is any braced or bracketed string not included in larger braces or brackets, and any single unbraced, unbracketed occurrence other than * occurring within parentheses.</Paragraph>
    <Paragraph position="11"> E.g., in A ({BD[C]} * E [F]) there are three dependent elements: {BD[C]}, E, and [F].</Paragraph>
    <Paragraph position="12"> a. Precede each dependent element with a coefficient n ≥ 1, so that for any n &gt; 1, at least one element is preceded by n - 1, and for any m &gt; n, m does not intervene between n and *. Thus: [...] b. Assign an exponent s to the governor in each rule, where s equals the largest coefficient n of any dependent. (For rules of the form X (*), s is zero.) If a category X is assigned different exponents for different rules in which it occurs as governor, associate a distinct subscript with each distinct exponent.</Paragraph>
    <Paragraph position="13"> c. Replace every occurrence of X as dependent by X^s. If X has been subscripted, include each variant in braces, thus: [...] Step 2. Test for unmarked (unprocessed) rules. If none remain, go to step 4. Otherwise take the first unmarked rule, where some X_i^{s_i} is governor, and set a counter S = s_i. Step 3. If S = 0, mark the rule and repeat step 2. If S ≠ 0, construct a rule of an Intermediate Rule List (IRL) of the form X_i^S → ... On the right of the arrow duplicate all dependent elements whose coefficients n = S. Include the *, and precede it with X_i^{S-1}. Decrease S by 1, and repeat step 3.</Paragraph>
    <Paragraph position="14"> Step 4. Establish a Table of Substitutes (TS), assigning a unique symbol to every distinct X_n^S for which S ≠ 0.</Paragraph>
    <Paragraph position="15"> Step 5. Assign the substitute or substitutes for the axioms of the DG as axioms of the SPG.</Paragraph>
    <Paragraph position="16"> Step 6. Rewrite the IRL, using substitutes from the TS, deleting exponents and subscripts from any X_n^0 and omitting *. Step 7. Transfer the assignment rules.</Paragraph>
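    <Paragraph> Steps 2 and 3 amount to peeling off one layer of dependents per value of the counter S. The Python sketch below shows this for a single augmented rule, under one reading of step 3: the governor with exponent S - 1 is placed immediately before the *, and dependents keep their side of the governor. Subscripts, braces and brackets are not modelled, the caret notation in the strings is a plain-text stand-in for exponents, and the function name intermediate_rules and the sample rule are hypothetical.

```python
def intermediate_rules(governor, exponent, left, right):
    """Build IRL rules for one augmented dependency rule.

    left and right are lists of (coefficient, element) pairs on each side of the *.
    """
    rules = []
    for s in range(exponent, 0, -1):            # S = s, s - 1, ..., 1
        lhs = f"{governor}^{s}"
        body = ([element for n, element in left if n == s]
                + [f"{governor}^{s - 1}", "*"]   # reduced governor precedes the *
                + [element for n, element in right if n == s])
        rules.append((lhs, body))
    return rules


# Hypothetical augmented rule A^2 (2D 1B * 1C)
for lhs, body in intermediate_rules("A", 2, [(2, "D"), (1, "B")], [(1, "C")]):
    print(lhs, "->", " ".join(body))
```

A rule of the form X (*) has exponent zero and therefore contributes nothing to the IRL in this sketch, which agrees with the marking of such rules at S = 0 in step 3.</Paragraph>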
    <Paragraph position="17"> This method will be illustrated using an abstract DG3, without assignment rules. DG3 is structure sensitive since there are rules which restrict the choices of dependents of categories A and C to subsets of the ordered categories they are permitted to govern in other rules.</Paragraph>
    <Paragraph position="18">  i (iG 0 , iFi) 5. C i 0 6. c z (*) 7. D O (*) 8. E i (* IG 0) iB 1 , 3E 1)  ¹ Note the impossibility of further conflation, and that every category occurring as dependent in a rule of level m and also occurring in some other rule of level n &lt; m is always optionally chosen as a dependent. That is, there is the possibility of avoiding its reintroduction. Thus A, the axiom, is reintroduced in rule 9, but its governor, F, is optionally chosen as a dependent. Otherwise no derivation could terminate.</Paragraph>
    <Paragraph position="19"> SPG3 is structure sensitive, since two categories, S and X, are rewritten the same way. This results from the fact that, in DG3, A has two sets of dependents and one set is included in the other. The structure-sensitive rules for C in DG3 produced no additional structure-sensitive rules in SPG3, however, since one of them, C (*), was not processed in step 3 and did not form part of the IRL.</Paragraph>
    <Paragraph position="20"> SPG3 may be rewritten as a structure-free grammar in a purely ad hoc way by eliminating the rewriting rule for X and substituting S for the occurrence of X on the right. More generally, it is reasonable to require of the original DG that its rules be designed so that it is possible to write a single rule (schema) assigning dependents to any given category. This requirement is reasonable if the DG is a base component of a transformational grammar whose transformations take care of the eventual order of elements in a sentence. The primary function of the DG's dependency rules is, in this case, only that of listing co-occurring categories in the dependency relations in some canonical order. In DG3, A always occurs with B as dependent. Whenever E is a dependent of A, then either a second B or a C is also a dependent, and if a second B is a dependent of A, D is also. This set of conditions is summed up by [...] Similarly, B always occurs with E or F as dependents, i.e., [...] and C occurs with no dependents or else with both G and F, i.e., C (* [G F]).</Paragraph>
    <Paragraph position="21"> These rules express co-occurrence relationships more directly than the original rules do. Let us assume this constraint and redesign DG3 as DG4. Let the augmented DG4 be:</Paragraph>
  </Section>
  <Section position="5" start_page="17" end_page="23" type="metho">
    <SectionTitle>
5. V → E G
6. U → Z F
</SectionTitle>
    <Paragraph position="0"> SPG4 is structure free. Furthermore, it has only one axiom, whereas SPG3 had two even though derived from a DG which had only one.</Paragraph>
    <Paragraph position="1"> The additional restrictions proposed for DG and SPG in discussions of the results of conversion by these methods are not crucial as far as obtaining systematically corresponding grammars is concerned. Without them, every complete subtree of a D-tree will correspond to a complete subtree of a P-tree over the same string, and every complete subtree of the P-tree will correspond to a connected subtree of a D-tree over the same string. (No formal proof of this was given, but the methods of constructing an IRL make it moderately apparent.) Hays [5] suggests the term relational correspondence for this state of affairs. Also there is a systematic relationship between the categories of the two grammars, which the TS makes explicit.</Paragraph>
    <Paragraph position="2"> Sometimes that relationship is simple, as in the case of DG4 and SPG4. Given any category of DG4, there is exactly one category of SPG4 from which the same set of strings is derivable. The relationship also holds between the categories of DG1 and SPG1, and between those of DG2 and SPG2. Under these conditions, Hays [5] calls the categories "substantively equivalent". The relationship between the categories of DG3 and SPG3 is less simple. There the set of strings derivable from A of the DG is the union of the sets of strings derivable from S and Z of the SPG, and the set derivable from B is the union of the sets derivable from W and V.</Paragraph>
    <Paragraph position="3"> Hays says that a D-tree and a P-tree correspond if they correspond relationally and if the category at the origin of every complete subtree of the D-tree is substantively equivalent to the category labeling the complete subtree of the P-tree related to it. If a DG and an SPG "have the same terminal alphabet, and for every string over that alphabet, every structure attributed by either corresponds to a structure attributed by the other", he calls the two grammars "strongly equivalent" [5, p. 521]. We prefer to say that they correspond substantively, since relational correspondence is asymmetric and there are always "left-over" SPG categories to which no DG category is substantively equivalent.</Paragraph>
    <Paragraph position="4"> The weaker relationship exhibited by SPG3 and DG3, where some DG categories are not substantively equivalent to any single SPG category, we have been calling systematic correspondence.</Paragraph>
    <Paragraph position="5"> Conversion by the methods given here results in systematic correspondence. If the suggested constraints, which appear to be linguistically well-motivated for base components of transformational grammars, are imposed on the form of the source grammar, the target grammar corresponds substantively as well as systematically to the source grammar and both are structure free.</Paragraph>
  </Section>
</Paper>