<?xml version="1.0" standalone="yes"?>
<Paper uid="E93-1029">
  <Title>Mathematical Aspects of Command Relations</Title>
  <Section position="5" start_page="245" end_page="248" type="concl">
    <SectionTitle>
4 Implementations
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="245" end_page="245" type="sub_section">
      <SectionTitle>
4.1 Problems of Implementations
</SectionTitle>
      <Paragraph position="0"> The aim set by our theory is to reduce all possible nearness conditions of grammar to some restrictions involving command relations. Thus we treat not only binding theory or case theory but also restrictions on movement. Even though [Barker and Pullum, 1990] did not think of movement and subjacency as providing cases for command relations, the fact that nearness conditions are involved clearly indicates that the theory should have something to say about them. However, there are various obstacles to a direct implementation.</Paragraph>
      <Paragraph position="1"> The theory of command relations is not directly compatible with standard nearness relations in GB.</Paragraph>
      <Paragraph position="2"> A command relation as defined here depends in its size only on the isomorphism type of the linear structure above the node x. So typical definitions, such as those involving the notions of being governed, being bound, or having an accessible subject, fail to be of the kind proposed here because they involve a node that stands in a relation of c-command rather than domination. Nevertheless, if GB were spelt out fully into a boolean grammar, far more labels would have to be used than usually appear on trees displayed in GB books. The reason is that while context-free grammars by definition allow no context to rule the structure of a local tree, in GB the whole tree is implicitly treated as a context. But if it is true that the context for a node reduces to nodes that are c-commanding, it is enough to add for certain primitive labels X another label QX which translates as 'one of my daughters is X'. Here, QX is not necessarily understood to be a new label but a specific label that guarantees one of the daughters to be of category X. However, 'modals' such as Q are somewhat whimsical creatures. Sometimes, QX is an already existing category; for example, QIP can (with the exception of exceptional case marking constructions) be equated with C'. On other occasions, however, we need to incorporate them into our grammar; prominent modals are SLASH : X, which has the meaning 'somewhere below me is a gap of category X', and AGR : X, which says 'this sentence has a subject of category X'. If a context-free rendering of phrase structure is done properly (as for example in [Gazdar et al., 1985]) a single entry such as V must be split into a vast number of different symbols, so we can reasonably assume that our grammar is rich enough to have all the QX for the X we need; otherwise they must be added artificially. In that case many of the standard nearness relations can be directly encoded using command relations.</Paragraph>
      <Paragraph position="3"> A second problem concerns the role of adjunction in the definition of subjacency. If the domain of movement for a node (that is, the domain within which the antecedent has to be found) is tight, then no iteration of movement leads to escaping the original domain. So, the domain for movement must be large. But it cannot be too large either, because we would lose the necessity of free escape hatches (spec of comp, for example). The typical definitions of subjacency lead to domains that are just about right in size. However, a dilemma remains to be solved: after moving to spec of comp, an element can move higher than it could from its original position. Different solutions have been offered. The simplest is standard 2-node subjacency, which is KOMMAND ∘ KOMMAND. This domain indeed allows this type of cyclic movement; movement from spec of comp to spec of comp is possible, but only to the next spec of comp. However, due to its shortcomings, this notion has been criticised; moreover, it has been felt that 1-node subjacency should be superior, largely because of the slogan 'grammar does not count'. Yet tight domains do not do the job, and so tricks have been invented. [Chomsky, 1986] formulated rather small domains but included a mechanism to escape them by creating 'grey zones' in which elements are neither properly dominated by a node nor in fact properly non-dominated. This idea has caught on (for example in [Sternefeld, 1991]) but has to be treated cautiously, as even the simplest notions such as category, node etc. receive new interpretations because nodes are not necessarily identical with occurrences of categories as before. A reduction to standard notions should certainly be possible and desired, without necessarily banning adjunction.</Paragraph>
    </Section>
    <Section position="2" start_page="245" end_page="246" type="sub_section">
      <SectionTitle>
4.2 The Koster Matrix
</SectionTitle>
      <Paragraph position="0"> As [Koster, 1986] observed, grammatical relations are typically relations between a dependent element and an antecedent or:</Paragraph>
      <Paragraph position="2"> [Koster, 1986] notes four conditions on such configurations: a. obligatoriness, b. uniqueness of the antecedent, c. c-command of the antecedent, d. locality. If these conditions are met then this relation has the effect 'share property'. This has to be understood as follows. (a.) and (b.) express nothing but that δ needs one and only one antecedent. This antecedent, α, must c-command δ. Finally, (d.) states that α must be found in some local domain of δ. Of course, this domain is language specific as well as specific to the syntactic construction, i.e. the category of δ and α. Likewise, the property to be shared depends on the category of α and δ.</Paragraph>
      <Paragraph position="3"> The locality restriction expresses that α is found within the R-domain of δ. This relation R is in the unmarked case defined as follows.</Paragraph>
      <Paragraph position="4"> Definition 15 α is locally accessible 1 to δ if α ≤ β, where β is the least maximal projection containing δ and a governor of δ.</Paragraph>
      <Paragraph position="5"> [Koster, 1986] assumes that greater domains are formed by licensed extensions. These extensions are marked constructions; while all languages agree on local accessibility 1 as the minimal domain within which antecedents must be found, larger domains may also exist but their size is language and construction specific. Nevertheless, the variation is limited. There are only three basic types, namely locally accessible i for i = 1, 2, 3.</Paragraph>
      <Paragraph position="6"> Definition 16 α is locally accessible 2 to δ if α ≤ β, where β is the least maximal projection containing δ, a governor for δ and some opacity element ω. α is locally accessible 3 to δ if there is a sequence β_i, 1 ≤ i ≤ n, such that β_1 is locally accessible 2 from δ and β_{i+1} is locally accessible 2 from β_i. The opacity elements are drawn from a rather limited list. Such elements are tense, mood etc. A well-known example is Icelandic reflexives, whose domain is the smallest indicative sentence.</Paragraph>
    </Section>
    <Section position="3" start_page="246" end_page="247" type="sub_section">
      <SectionTitle>
4.3 The Command Relations of Koster's Matrix
</SectionTitle>
      <Paragraph position="0"> The local accessibility relations certainly are command relations in our sense. The real problem is whether they are definable using primitive labels of the grammar. In particular the recursiveness of the third accessibility makes it unlikely that we can find a definition in terms of ∧, ∨, ∘. Yet, if it were really an arbitrary iteration of the second accessibility relation it would be completely trivial, because any iteration of a command relation over a tree is the total relation over the tree. Hence, there must be something non-trivial about this domain; indeed, the iteration is stopped if the outer β is ungoverned.</Paragraph>
      <Paragraph position="1"> This is the key to a non-iterative definition of the third accessibility relation.</Paragraph>
      <Paragraph position="2"> Let us assume for simplicity that there is a single type of governors denoted by GOV and that there is a single type of opacity element denoted by OPY. The first hurdle is the clarification of government.</Paragraph>
      <Paragraph position="3"> Normally, government requires a governing element, i.e. an element of category GOV that is close in some sense. How close is not clarified in [Koster, 1986].</Paragraph>
      <Paragraph position="4"> Clearly, on pain of providing circular definitions, closeness cannot be accessibility 1; really, it must be an even smaller domain. Let us assume for simplicity that it is sisterhood. If we then introduce the modal tX to denote 'one of my sisters is of category X', being governed is equal to being of category tGOV. Likewise we will assume that the opacity element must be in c-command relation to δ. We are now ready to define the three accessibility relations, which we denote by LA 1, LA 2 and LA 3.</Paragraph>
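      <Paragraph> The distribution of the two modals can be illustrated by a minimal sketch. It assumes a hypothetical tree representation in which every node carries a set of boolean labels, and it encodes the daughter modal QX and the sister modal tX as plain strings; neither the class Node nor the function add_modal_labels is part of the theory.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A tree node carrying a set of boolean labels (hypothetical representation)."""
    labels: set
    children: list = field(default_factory=list)


def add_modal_labels(node, x):
    """Add 'Q' + x to every node with a daughter labelled x and 't' + x to
    every node with a sister labelled x. The string encoding is an assumption."""
    for child in node.children:
        if x in child.labels:
            node.labels.add("Q" + x)
    for child in node.children:
        if any(x in other.labels for other in node.children if other is not child):
            child.labels.add("t" + x)
        add_modal_labels(child, x)
    return node


# V governs its sister NP inside VP.
verb = Node({"V", "GOV"})
obj = Node({"NP"})
vp = Node({"VP"}, [verb, obj])
add_modal_labels(vp, "GOV")
print(sorted(obj.labels))  # ['NP', 'tGOV']
print(sorted(vp.labels))   # ['QGOV', 'VP']
```

 In this toy tree the object is of category tGOV, hence governed under the sisterhood assumption, while the verb itself is not.</Paragraph>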
      <Paragraph position="6"> (Observe that * binds more strongly than ∘.) For a proof consider a point x of a labelled tree T. Let g denote the smallest node dominating both x and its governor and let m be the smallest maximal projection of g.</Paragraph>
      <Paragraph position="7"> Then x &lt; g ≤ m. So two cases arise, namely g = m and g &lt; m. In each case LA 1 picks the right node.</Paragraph>
      <Paragraph position="8"> Likewise, if o denotes the smallest element containing x and an opacity element that c-commands x, then x &lt; o. Three cases are conceivable, o &lt; g, o = g and o &gt; g. However, if government can take place only under sisterhood, o &lt; g cannot occur. So x &lt; g ≤ o ≤ m. For each of the four cases LA 2 picks the right node. Finally, for LA 3 there is an extra condition on m that it be ungoverned.</Paragraph>
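      <Paragraph> The first case of the proof can be traced on a small tree. The following is a sketch only: it assumes government under sisterhood, marks maximal projections with the label BAR:2, and uses a hypothetical node class with parent pointers. The function name la1_domain_root is ours and does not occur in [Koster, 1986].

```python
class Node:
    """Tree node with a label set and a parent pointer (hypothetical representation)."""

    def __init__(self, name, labels, children=()):
        self.name = name
        self.labels = set(labels)
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def la1_domain_root(x):
    """Return m, the least maximal projection dominating g, where g is the
    mother of x. Under the sisterhood assumption g is the smallest node
    dominating x and its governor. Returns None if x is ungoverned."""
    g = x.parent
    if g is None:
        return None
    governed = any("GOV" in s.labels for s in g.children if s is not x)
    if not governed:
        return None
    m = g
    while m is not None and "BAR:2" not in m.labels:
        m = m.parent
    return m


# The case where g is properly below m: V' is not a maximal projection.
v = Node("V", {"GOV"})
obj = Node("NP", {"BAR:2"})
vbar = Node("V'", {"BAR:1"}, [v, obj])
vp = Node("VP", {"BAR:2"}, [vbar])
print(la1_domain_root(obj).name)  # VP
```

 Here g is V' and m is VP, so every node dominated by VP lies in the domain of the object. The search starts at g rather than at x, which reflects the strict inequality between x and g.</Paragraph>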
      <Paragraph position="9"> Notice that our translation is faithful to Koster's definitions only if the domains defined in [Koster, 1986] are monotone. This is by no means trivial. Namely, it is conceivable that a node has an ungoverned element y locally accessible 2, while the highest locally accessible 2 node, z, is governed. In that case (ignoring the opacity element for a moment) the domain of local accessibility 3 of y is z while the domain of z is strictly larger. We find no answer to this puzzle in the book because the domains are defined only for governed elements. But it seems certain that the monotone definition given here is the intended one.</Paragraph>
      <Paragraph position="10"> It should be stressed that GOV and OPY are not specific labels but variables. Their value may change from situation to situation. Consequently, the local accessibility relations are parametrized with respect to the choice of particular governors and particular opacity elements. As an example, recall the Icelandic case again, where the domain of accessibility 2 of certain anaphors (typically the clause) can be extended in case the opacity element is subjunctive. Following our reduction, the domain of local accessibility 3 is defined by the first maximal projection that is not subjunctive, hence indicative. We take a primitive label IND to stand for 'is indicative'. So, for Icelandic we have the following special domain</Paragraph>
      <Paragraph position="12"> We notice in passing that recent results have put this analysis into doubt (see [Koster and Reuland, 1991]) but this is a problem of Koster's original definitions, not of this translation. What is a problem, however, is the standard opacity factor of an accessible subject. While subject (or even SUBJECT) can be easily handled with a boolean label, the accessibility condition presents real difficulties. First of all it involves indexing, and indexes potentially destroy the finiteness of the labelling system; secondly, it is not clear how the accessibility condition (namely, the requirement that the i/i-Filter is respected after coindexation) can be handled at all in this calculus.</Paragraph>
      <Paragraph position="13"> This issue is too complex to be tackled here, so we leave it for another occasion.</Paragraph>
    </Section>
    <Section position="4" start_page="247" end_page="248" type="sub_section">
      <SectionTitle>
4.4 Translating Koster's Matrix into Rules
</SectionTitle>
      <Paragraph position="0"> In a final step we show how the nearness conditions of the Koster Matrix can be rewritten into rules of a context-free grammar. To be more precise, we show how they can be implemented in any given boolean cfg. The booleanness, of course, is not essential but is here for convenience. We noticed earlier that the domains in GB really are for the purpose of introducing some limited forms of context-sensitivity. If two nodes relate via some dependency relation R then Koster assumes that a certain property is shared.</Paragraph>
      <Paragraph position="1"> But context-free grammars in principle do not allow such sharing except between mother and daughters and between sister nodes. Nevertheless, as we do not require all properties to be shared but only some, it is possible to enrich the grammar in such a way that nodes receive relevant information about parts of the structure that normally cannot be accessed. We will show how.</Paragraph>
      <Paragraph position="2"> First, we will assume that 'share property' is to be understood as a dependency in the labellings between two elements. We simplify this by assuming that there are special features PRPi, i &lt; n, of unspecified nature whose instantiation at the two nodes, δ and α, is somehow correlated. Since the dependent element is structurally lower than the antecedent, and since generation in cfg's is top to bottom, we assume that it is the dependent element that has to set the PRPi according to the way they are set at the antecedent. The best way to implement this is by a function f that for every assignment prp of the primitive labels at the antecedent gives the labelling f(prp) which the dependent element must satisfy. In order to be able to achieve this correlation in a context-free grammar, the dependent element needs to know in which way the atoms PRPi have been set at α. Thus the problem reduces to a transfer of information from α to δ. If we generate only fully labelled trees the problem is precisely to transfer n bits of information from α to δ. The content of this information is of course irrelevant for the formalization.</Paragraph>
      <Paragraph position="3"> To begin with, we need to be able to recognize antecedent and dependent element by their category.</Paragraph>
      <Paragraph position="4"> We do this here by taking two labels ANT and DEP with obvious meaning. Furthermore, one of our tasks is to ensure that the labels QX and tX are correctly distributed. Notice, by the way, that it is only for special choices of X that we need these composite elements, so there is nothing recursive or infinite in this procedure. For the sake of simplicity we assume the grammar to be in Chomsky Normal Form; that is, we only have rules of type X → YZ, X → Y, X → 0 for X, Y and Z atoms or = R (see [Harrison, 1978]).</Paragraph>
      <Paragraph position="5"> For any rule ρ = A → BC and any X we distribute the new labels QX and tX as follows. If B ≤ X but  After having inserted enough QX and tX we can proceed to the domains of accessibility. The general problem is, as said above, the transfer of information from α to δ. The problem is attacked by introducing more modal elements. Namely, for certain g and certain labels X we introduce the new label (g)X. Its interpretation is 'an element of label X is in my g-domain and neither do I dominate it nor am I dominated by it'. If we succeed in distributing these new labels according to their intended interpretation we can code the Koster Matrix into the grammar. We show the encoding for (F)Y. It is then more or less evident how (g)X is encoded for a chain g because (b ∘ F)X = (b)(F)X, just as in modal logic. Now for (F)Y there are two cases. (i) The mother node is of category (F)Y ∩ ¬F. Then the information (F)Y must be passed on to all daughters. (ii) The mother is of category ¬(F)Y ∪ F. Then a daughter is (F)Y if and only if it has a sister of category Y. Thus at all daughters we simply instantiate (F)Y ↔ tY.</Paragraph>
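      <Paragraph> The two cases for (F)Y amount to a top-down pass over the tree, which may be sketched as follows. The node class, the function distribute and the encoding of the composite label as the string '(F)Y' are assumptions made for illustration; the sketch also assumes that no node carries the composite label before the pass starts.

```python
class Node:
    """Tree node with a set of boolean labels (hypothetical representation)."""

    def __init__(self, name, labels, children=()):
        self.name = name
        self.labels = set(labels)
        self.children = list(children)


def distribute(mother, f="F", y="Y"):
    """Top-down distribution of the composite label written (F)Y in the text."""
    modal = "(" + f + ")" + y
    for child in mother.children:
        if modal in mother.labels and f not in mother.labels:
            # Case (i): the mother is (F)Y and not F, so the label is passed on.
            child.labels.add(modal)
        elif any(y in s.labels for s in mother.children if s is not child):
            # Case (ii): the mother is not (F)Y or is F; the daughter gets the
            # label exactly when one of its sisters is of category Y.
            child.labels.add(modal)
        distribute(child, f, y)
    return mother


e = Node("e", set())
d = Node("d", {"F"}, [e])
c = Node("c", set())
b = Node("b", set(), [c, d])
a = Node("a", {"Y"})
root = Node("root", {"F"}, [a, b])
distribute(root)
print([n.name for n in (a, b, c, d, e) if "(F)Y" in n.labels])  # ['b', 'c', 'd']
```

 The node b receives the label from its sister a, passes it on to c and d by case (i), and the passing stops at d because d is itself of category F, so that e falls under case (ii) and has no sister of category Y.</Paragraph>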
      <Paragraph position="6"> It should be quite clear that by a suitable choice of (g)X to be added a dependent element δ will have access to the information that it has an antecedent in its domain of local accessibility i. If it needs to know what category this antecedent has, this information has to be supplied in tandem with the mere property that needs to be shared. One snag remains; namely, it may happen that there is more than one antecedent of the required type. In that case we need to manipulate the rules of the grammar as follows. As long as we have an element of category ANT we suppress any other antecedents of category ANT within the same domain. This might not be entirely straightforward, but to keep matters simple here we assume that the grammar takes care of that.</Paragraph>
      <Paragraph position="7"> We show now how the translation is completed. For accessibility 1 we add the following boolean axiom to the grammar (that is, we 'kill' all rules that do not comply with this axiom): (BAR:2)(ANT ∩ prp) ∩ tGOV ∩ DEP .→. f(prp) By choice of the interpretation, this axiom declares that a node which is governed and dependent and has an antecedent within the next maximal projection must be of category f(prp) if its (unique) antecedent is of category prp. The uniqueness is assumed here to be guaranteed by the grammar into which we encode. Furthermore, note that the assumption that government takes place under sisterhood results in a significant simplification. Limitations of space forbid us to treat the more general case, however. For accessibility 2 this axiom is added instead: (QOPY ∘ BAR:2 ∧ OPY * BAR:2)(ANT ∩ prp) ∩ tGOV ∩ DEP .→. f(prp) Finally, for accessibility 3, we have to replace BAR:2 by BAR:2 ∩ ¬tGOV.</Paragraph>
      <Paragraph position="8"> More details can be found in [Kracht, 1993]. The upshot of this is the following. Suppose that a grammar of some language consists of a basic generative component in the form of a cfg G and a number of Koster Matrices as additional constraints on the structures.</Paragraph>
      <Paragraph position="9"> If the number of matrices is finite, then finitely many additional labels suffice to create a cfg G + from the original grammar that guarantees that its output trees satisfy the local conditions of G as well as the nearness conditions imposed by the Koster Matrices. Upper bounds on the number of labels of G + (depending both on G and the additional matrices) can be computed as well.</Paragraph>
    </Section>
  </Section>
</Paper>