<?xml version="1.0" standalone="yes"?>
<Paper uid="P84-1027">
  <Title>The Semantics of Grammar Formalisms Seen as Computer Languages</Title>
  <Section position="4" start_page="123" end_page="124" type="metho">
    <SectionTitle>
2. Denotational Semantics
</SectionTitle>
    <Paragraph position="0"> In broad terms, denotational semantics is the study of the connection between programs and mathematical entities that represent their input-output relations. For such an account to be useful, it must be compositional, in the sense that the meaning of a program is developed from the meanings of its parts by a fixed set of mathematical operations that correspond directly to the ways in which the parts participate in the whole.</Paragraph>
    <Paragraph position="1"> For the purposes of the present work, denotational semantics will mean the semantic domain theory initiated by Scott and Strachey \[20\]. In accordance with this approach, the meanings of programming language constructs are certain partial mappings between objects that represent partially specified data objects or partially defined states of computation. The essential idea is that the meaning of a construct describes what information it adds to a partial description of a data object or of a state of computation.</Paragraph>
    <Paragraph position="2"> Partial descriptions are used because computations in general may not terminate and may therefore never produce a fully defined output, although each individual step may be adding more and more information to a partial description of the undeliverable output.</Paragraph>
    <Paragraph position="3"> Domain theory is a mathematical theory of considerable complexity. Potential nontermination and the use of functions as &amp;quot;first-class citizens&amp;quot; in computer languages account for a substantial fraction of that complexity. If, as is the case in the present work, neither of those two aspects comes into play, one may be justified in asking why such a complex apparatus is used. Indeed, both the semantics of context-free grammars mentioned earlier and the semantics of logic grammars in general can be formulated using elementary set theory \[7,21\].</Paragraph>
    <Paragraph position="4"> However, using the more complex machinery may be beneficial for the following reasons:
* Inherent partiality: many grammar formalisms operate in terms of constraints between elements that do not fully specify all the possible features of an element.
* Technical economy: results that require laborious constructions without domain theory can be reached trivially by using standard results of the theory.
* Suggestiveness: domain theory brings with it a rich mathematical structure that suggests useful operations one might add to a grammar formalism.</Paragraph>
    <Paragraph position="5"> * Extensibility: unlike a domain-theoretic account, a specialized semantic account, say in terms of sets, may not be easily extended as new constructs are added to the formalism.</Paragraph>
    <Paragraph position="6"> 3. The Domain of Feature Structures  We will start with an abstract denotational description of a simple feature system which bears a close resemblance to the feature systems of GPSG, LFG and PATR-II, although this similarity, because of its abstractness, may not be apparent at first glance. Such feature systems tend to use data structures or mathematical objects that are more or less isomorphic to directed graphs of one sort or another, or, as they are sometimes described, partial functions. Just what the relation is between these two ways of viewing things will be explained later. In general, these graph structures are used to encode linguistic information in the form of attribute-value pairs. Most importantly, partial information is critical to the use of such systems--for instance, in the variables of definite clause grammars \[12\] and in the GPSG analysis of coordination \[15\]. That is, the elements of the feature systems, called feature structures (alternatively, feature bundles, f-structures \[2\], or terms), can be partial in some sense. The partial descriptions, being in a domain of attributes and complex values, tend to be equational in nature: some feature's value is equated with some other value. Partial descriptions can be understood in one of two ways: either the descriptions represent sets of fully specified elements of an underlying domain, or they are regarded as participating in a relationship of partiality with respect to each other. We will hold to the latter view here.</Paragraph>
    <Paragraph position="7"> What are feature structures from this perspective? They are repositories of information about linguistic entities. In domain-theoretic terms, the underlying domain of feature structures F is a recursive domain of partial functions from a set of labels L (features, attribute names, attributes) to complex values or primitive atomic values taken from a set C of constants. Expressed formally, we have the domain equation F = \[L → F\] + C. The solution of this domain equation can be understood as a set of trees (finite or infinite) with branches labeled by elements of L, and with other trees or constants as nodes. The branches l1, ..., lm from a node n point to the values n(l1), ..., n(lm) for which the node, as a partial function, is defined.</Paragraph>
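As an illustrative aside (not from the paper), the domain equation F = \[L → F\] + C can be modeled concretely by letting a feature structure be either an atomic constant or a finite partial function from labels to feature structures. The sketch below uses nested Python dicts for the partial functions; all labels and values ("agr", "num", "sg", ...) are invented for illustration.

```python
# Hedged sketch: a feature structure is an atomic constant (here a string)
# or a partial function from labels to feature structures (here a dict).
# Labels and values ("agr", "num", "sg", ...) are illustrative inventions.

def is_defined(node, label):
    """A node, viewed as a partial function, is defined on only some labels."""
    return isinstance(node, dict) and label in node

# A small finite tree in F: branches labeled "agr", "num", "per".
np = {"agr": {"num": "sg", "per": "3"}}

assert is_defined(np, "agr")
assert not is_defined(np, "cat")     # partiality: undefined on other labels
assert np["agr"]["num"] == "sg"      # following branches yields n(l)
```

The dict representation captures exactly the partiality in the text: a node carries values only for the labels on which it is defined.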
  </Section>
  <Section position="5" start_page="124" end_page="126" type="metho">
    <SectionTitle>
4. The Domain of Descriptions
</SectionTitle>
    <Paragraph position="0"> What the grammar formalism does is to talk about F, not in F. That is, the grammar formalism uses a domain of descriptions of elements of F. From an intuitive standpoint, this is because, for any given phrase, we may know facts about it that cannot be encoded in the partial function associated with it.</Paragraph>
    <Paragraph position="1"> A partial description of an element x of F will be a set of equations that constrain the values of x on certain labels. In general, to describe an element x ∈ F we have equations of the following forms: (···(x(l1))···)(lm) = (···(x(l'1))···)(l'n) and (···(x(l1))···)(lm) = ck, which we prefer to write as ⟨l1 ··· lm⟩ = ⟨l'1 ··· l'n⟩ and ⟨l1 ··· lm⟩ = ck,</Paragraph>
    <Paragraph position="3"> with x implicit. The terms of such equations are constants c ∈ C or paths ⟨l1 ··· lm⟩, which we identify in what follows with strings in L*. Taken together, constants and paths comprise the descriptors.</Paragraph>
    <Paragraph position="4"> Using Scott's information systems approach to domain construction \[16\], we can now build directly a characterization of feature structures in terms of information-bearing elements, equations, that engender a system complete with notions of compatibility and partiality of information.</Paragraph>
    <Paragraph position="5"> The information system D describing the elements of F is defined, following Scott, as the tuple D = (P, Δ, Con, ⊢), where P is a set of propositions, Con is a set of finite subsets of P (the consistent subsets), ⊢ is an entailment relation between elements of Con and elements of P, and Δ is a special least informative element that gives no information at all. We say that a subset S of P is deductively closed if every proposition entailed by a consistent subset of S is in S. The deductive closure S̄ of S ⊆ P is the smallest deductively closed subset of P that contains S.</Paragraph>
    <Paragraph position="6"> The descriptor equations discussed earlier are the propositions of the information system for feature structure descriptions. Equations express constraints among feature values in a feature structure, and the entailment relation encodes the reflexivity, symmetry, transitivity and substitutivity of equality. More precisely, we say that a finite set of equations E entails an equation e if:
* Membership: e ∈ E
* Reflexivity: e is Δ or d = d for some descriptor d
* Symmetry: e is d1 = d2 and d2 = d1 is in E
* Transitivity: e is d1 = d2 and there is a descriptor d such that d1 = d and d = d2 are in E
* Substitutivity: e is d1 = p1 · d2 and both p1 = p2 and d1 = p2 · d2 are in E</Paragraph>
    <Paragraph position="8"> With this notion of entailment, the most natural definition of the set Con is that a finite subset E of P is consistent if and only if it does not entail an inconsistent equation, which has the form c1 = c2 with c1 and c2 distinct constants.</Paragraph>
    <Paragraph position="9"> An arbitrary subset of P is consistent if and only if all its finite subsets are consistent in the way defined above.</Paragraph>
    <Paragraph position="10"> The consistent and deductively closed subsets of P, ordered by inclusion, form a complete partial order or domain D, our domain of descriptions of feature structures.</Paragraph>
    <Paragraph position="11"> Deductive closure is used to define the elements of D so that elements defined by equivalent sets of equations are the same. In the rest of this paper, we will specify elements of D by convenient sets of equations, leaving the equations in the closure implicit.</Paragraph>
    <Paragraph position="12"> The inclusion order ⊑ in D provides the notion of a description being more or less specific than another.</Paragraph>
    <Paragraph position="13"> The least-upper-bound operation ⊔ combines two descriptions into the least instantiated description that satisfies the equations in both descriptions, their unification. The greatest-lower-bound operation ⊓ gives the most instantiated description containing all the equations common to two descriptions, their generalization.</Paragraph>
    <Paragraph position="14"> The foregoing definition of consistency may seem very natural, but it has the technical disadvantage that, in general, the union of two consistent sets is not itself a consistent set; therefore, the corresponding operation of unification may not be defined on certain pairs of inputs. Although this does not cause problems at this stage, failure to unify is not the same as lack of definition, and conflating the two causes technical difficulties when providing rule denotations. We therefore need a slightly less natural definition.</Paragraph>
    <Paragraph position="15"> First we add another statement to the specification of the entailment relation:
* Falsity: if e is inconsistent, {e} entails every element of P.</Paragraph>
    <Paragraph position="16"> That is, falsity entails anything. Next we define Con to be simply the set of all finite subsets of P. The set Con no longer corresponds to sets of equations that are consistent in the usual equational sense.</Paragraph>
    <Paragraph position="17"> With the new definitions of Con and ⊢, the deductive closure of a set containing an inconsistent equation is the whole of P. The partial order D is now a lattice with top element ⊤ = P, and the unification operation ⊔ is always defined, returning ⊤ on unification failure.</Paragraph>
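To make the lattice operations concrete, here is a small hedged sketch (not the paper's code; the representation choices are ours): an equation is a pair whose sides are either paths (tuples of label strings) or constants (plain strings); deductive closure is approximated by closing under symmetry and transitivity only (reflexivity and substitutivity are omitted for brevity); unification is the closed union, returning a ⊤ marker as soon as two distinct constants become equated.

```python
from itertools import product

TOP = "TOP"  # the top element: the single inconsistent description

def close(eqs):
    """Approximate deductive closure: symmetry and transitivity only."""
    eqs = set(eqs)
    while True:
        new = {(b, a) for (a, b) in eqs}                                   # symmetry
        new |= {(a, d) for (a, b), (c, d) in product(eqs, eqs) if b == c}  # transitivity
        if new <= eqs:
            return eqs
        eqs |= new

def inconsistent(eqs):
    """An equation between two distinct constants is inconsistent."""
    return any(not isinstance(a, tuple) and not isinstance(b, tuple) and a != b
               for a, b in eqs)

def unify(d1, d2):
    """Unification as least upper bound: closed union, or TOP on failure."""
    d = close(d1 | d2)
    return TOP if inconsistent(d) else d

def generalize(d1, d2):
    """Generalization as greatest lower bound: the common equations."""
    return close(d1) & close(d2)
```

For example, unifying {⟨agr num⟩ = sg} with {⟨agr num⟩ = pl} equates the distinct constants sg and pl via transitivity and so yields TOP, matching the lattice picture in the text.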
    <Paragraph position="18"> We can now define the description mapping δ : D → F that relates descriptions to the described feature structures. The idea is that, in proceeding from a description d ∈ D to a feature structure f ∈ F, we keep only definite information about values and discard information that only states value constraints but does not specify the values themselves. More precisely, seeing d as a set of equations, we consider only the subset ⌊d⌋ of d with elements of the form ⟨p⟩ = c, for p a path and c a constant.</Paragraph>
    <Paragraph position="20"> This description mapping can be shown to be continuous in the sense of domain theory; that is, it has the properties that increasing information in a description leads to nondecreasing information in the described structures (monotonicity) and that if a sequence of descriptions approximates another description, the same condition holds for the described structures.</Paragraph>
    <Paragraph position="21"> Note that δ may map several elements of D on to one element of F. For example, the elements given by the two sets of equations</Paragraph>
    <Paragraph position="23"> describe the same structure, because the description mapping ignores the link between ⟨f h⟩ and ⟨g i⟩ in the first description. Such links are useful only when unifying with further descriptive elements, not in the completed feature structure, which merely provides feature-value assignments.</Paragraph>
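Continuing the same hedged representation (equations as pairs, paths as tuples of labels, constants as strings), the description mapping δ can be sketched as keeping only the equations ⟨p⟩ = c and assembling them into a tree of feature-value assignments. The two descriptions below, with an invented constant "a", differ only by the link ⟨f h⟩ = ⟨g i⟩, which δ discards.

```python
def delta(eqs):
    """Sketch of δ: keep only definite-value equations path = constant,
    discard pure path-to-path links, and unfold into a nested-dict tree."""
    tree = {}
    for lhs, rhs in eqs:
        if isinstance(lhs, tuple) and not isinstance(rhs, tuple):  # ⟨p⟩ = c
            node = tree
            for label in lhs[:-1]:
                node = node.setdefault(label, {})
            node[lhs[-1]] = rhs
    return tree

# Two descriptions that differ only by the link ⟨f h⟩ = ⟨g i⟩ ...
with_link = {(("f", "h"), ("g", "i")), (("f", "h"), "a"), (("g", "i"), "a")}
without_link = {(("f", "h"), "a"), (("g", "i"), "a")}

# ... are mapped by δ onto the same feature structure.
assert delta(with_link) == delta(without_link) == {"f": {"h": "a"}, "g": {"i": "a"}}
```

This mirrors the text: the link is information for further unification, not part of the completed feature structure.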
    <Paragraph position="24"> Informally, we can think of elements of D as directed rooted graphs and of elements of F as their unfoldings as trees, the unfolding being given by the mapping δ. It is worth noting that if a description is cyclic--that is, if it has cycles when viewed as a directed graph--then the resulting feature tree will be infinite (more precisely, a rational tree, that is, a tree with a finite number of distinct subtrees). Stated more precisely, an element f of a domain is finite if, for any ascending sequence {d_i} such that f ⊑ ⊔_i d_i, there is an i such that f ⊑ d_i. Then the cyclic elements of D are those finite elements that are mapped by δ into nonfinite elements of F.</Paragraph>
    <Paragraph position="25"> 5. Providing a Denotation for a</Paragraph>
    <Section position="1" start_page="125" end_page="126" type="sub_section">
      <SectionTitle>
Grammar
</SectionTitle>
      <Paragraph position="0"> We now move on to the question of how the domain D is used to provide a denotational semantics for a grammar formalism.</Paragraph>
      <Paragraph position="1"> We take a simple grammar formalism with rules consisting of a context-free part over a nonterminal vocabulary N = {N1, ..., Nk} and a set of equations over paths in (\[0..∞\] · L*) ∪ C. A sample rule might be S → NP VP with the equations ⟨0 subj⟩ = ⟨1⟩, ⟨0 pred⟩ = ⟨2⟩ and ⟨1 agr⟩ = ⟨2 agr⟩.</Paragraph>
      <Paragraph position="3"> This is a simplification of the rule format used in the PATR-II formalism \[18,17\]. The rule can be read as &quot;an S is an NP followed by a VP, where the subject of the S is the NP, its predicate the VP, and the agreement of the NP the same as the agreement of the VP&quot;.</Paragraph>
      <Paragraph position="4"> More formally, a grammar is a quintuple G = (N, S, L, C, R), where * N is a finite, nonempty set of nonterminals N1, ..., Nk * S is the set of strings over some alphabet (a flat domain with an ancillary continuous concatenation function, notated with the symbol ·).</Paragraph>
      <Paragraph position="5"> * R is a set of pairs r = (Nr0 → Nr1 ··· Nrn, Er), where Er is a set of equations between elements of (\[0..n\] · L*) ∪ C.</Paragraph>
      <Paragraph position="7"> As with context-free grammars, local ambiguity of a grammar means that in general there are several ways of assembling the same subphrases into phrases. Thus, the semantics of context-free grammars is given in terms of sets of strings. The situation is somewhat more complicated in our sample formalism. The objects specified by the grammar are pairs of a string and a partial description.</Paragraph>
      <Paragraph position="8"> Because of partiality, the appropriate construction cannot be given in terms of sets of string-description pairs, but rather in terms of the related domain construction of powerdomains \[14,19,16\]. We will use the Hoare powerdomain P of the domain S × D of string-description</Paragraph>
      <Paragraph position="10"> pairs. Each element of P is an approximation of a transduction relation, which is an association between strings and their possible descriptions.</Paragraph>
      <Paragraph position="11"> We can get a feeling for what the domain P is doing by examining our notion of lexicon. A lexicon will be an</Paragraph>
      <Paragraph position="12">  element of the domain P^k, associating with each of the k nonterminals Ni, 1 ≤ i ≤ k, a transduction relation from the corresponding coordinate of P^k. Thus, for each nonterminal, the lexicon tells us what phrases are under that nonterminal and what possible descriptions each such phrase has. Here is a sample lexicon: NP:</Paragraph>
      <Paragraph position="14"> By decomposing the effect of a rule into appropriate steps, we can associate with each rule r a denotation \[r\] : P^k → P^k that combines string-description pairs by concatenation and unification to build new string-description pairs for the nonterminal on the left-hand side of the rule, leaving all other nonterminals untouched. By taking the union of the denotations of the rules in a grammar (a well-defined and continuous powerdomain operation), we get a mapping T_G = ∪_{r ∈ R} \[r\]</Paragraph>
      <Paragraph position="16"> from P^k to P^k that represents a one-step application of all the rules of G &quot;in parallel.&quot; We can now provide a denotation for the entire grammar as a mapping that completes a lexicon with all the derived phrases and their descriptions. The denotation of a grammar is the function that maps each lexicon ℓ into the smallest fixed point of T_G containing ℓ. The fixed point is defined by \[G\](ℓ) = ∪_{i≥0} T_G^i(ℓ), as T_G is continuous.</Paragraph>
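A hedged operational reading of this fixed point (our own toy sketch, with finite sets standing in for powerdomain elements): represent a transduction relation as a set of (nonterminal, string) pairs and iterate T_G from the lexicon until nothing new is added. The rule below, S → NP VP by plain string concatenation, is an invented stand-in that ignores descriptions entirely.

```python
def grammar_denotation(T, lexicon):
    """Smallest fixed point of T containing the lexicon, computed as the
    limit of the ascending chain lexicon ⊑ T(lexicon) ⊑ ... (T monotone)."""
    current = set(lexicon)
    while True:
        nxt = current | T(current)
        if nxt == current:
            return current
        current = nxt

def T(rel):
    """Toy one-step rule application: S -> NP VP, concatenating strings."""
    return {("S", np + " " + vp)
            for (n1, np) in rel if n1 == "NP"
            for (n2, vp) in rel if n2 == "VP"}

lexicon = {("NP", "Uther"), ("VP", "storms Cornwall")}
result = grammar_denotation(T, lexicon)
assert ("S", "Uther storms Cornwall") in result
```

The loop terminates here because the toy grammar generates only finitely many phrases; in general the fixed point is the limit of the ascending chain, as the text states.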
      <Paragraph position="17"> It remains to describe the decomposition of a rule's effect into elementary steps. The main technicality to keep in mind is that rules state constraints among several descriptions (associated with the parent and each child), whereas a set of equations in D constrains but a single description. This mismatch is solved by embedding the tuple (d0, ..., dn) of descriptions in a single larger description, as expressed by</Paragraph>
      <Paragraph position="19"> and only then applying the rule constraints--now viewed as constraining parts of a single description. This is done by the indexing and combination steps described below. The rest of the work of applying a rule, extracting the result, is done by the projection and deindexing steps. The four steps for applying a rule r = (Nr0 → Nr1 ··· Nrn, Er) to string-description pairs (s1, d1), ..., (sn, dn) are as follows. First, we index each di into di' by replacing every path p in any of its equations with the path i · p. We then combine these indexed descriptions with the rule by unifying the deductive closure of Er with all the indexed descriptions: d = Ēr ⊔ d1' ⊔ ··· ⊔ dn'. We can now project d by removing from it all equations with paths that do not start with 0; the result d⁰ is clearly still deductively closed. Finally, d⁰ is deindexed into d₀ by removing the 0 from the front of all paths 0 · p in its equations. The pair associated with Nr0 is then (s1 ··· sn, d₀).</Paragraph>
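The four steps can be sketched concretely under the same invented representation as before (paths as tuples whose first component is the numeric index, constants as strings). The `close` helper below is a simplified deductive closure covering only symmetry and transitivity, and the rule fragment and child description in the example are made up for illustration.

```python
from itertools import product

def close(eqs):
    """Simplified deductive closure: symmetry and transitivity only."""
    eqs = set(eqs)
    while True:
        new = {(b, a) for (a, b) in eqs}
        new |= {(a, d) for (a, b), (c, d) in product(eqs, eqs) if b == c}
        if new <= eqs:
            return eqs
        eqs |= new

def index(d, i):
    """Step 1: replace every path p by the path i · p."""
    pre = lambda t: (i,) + t if isinstance(t, tuple) else t
    return {(pre(a), pre(b)) for a, b in d}

def project(d):
    """Step 3: keep only equations all of whose paths start with 0."""
    ok = lambda t: not isinstance(t, tuple) or t[0] == 0
    return {(a, b) for a, b in d if ok(a) and ok(b)}

def deindex(d):
    """Step 4: strip the leading 0 from every path 0 · p."""
    strip = lambda t: t[1:] if isinstance(t, tuple) else t
    return {(strip(a), strip(b)) for a, b in d}

def apply_rule(E_r, children):
    """Steps 1-4: index, combine (closure of union), project, deindex."""
    d = set(E_r)
    for i, di in enumerate(children, start=1):
        d |= index(di, i)
    return deindex(project(close(d)))

# Invented rule fragment ⟨0 subj agr num⟩ = ⟨1 agr num⟩ and one child
# carrying ⟨agr num⟩ = sg; the parent then records ⟨subj agr num⟩ = sg.
E_r = {((0, "subj", "agr", "num"), (1, "agr", "num"))}
child = {(("agr", "num"), "sg")}
assert (("subj", "agr", "num"), "sg") in apply_rule(E_r, [child])
```

Note how projection drops the equation linking the 0-path to the 1-path, while closure has already propagated the child's definite value up to the parent, exactly the division of labor the text describes.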
      <Paragraph position="21"> It is not difficult to show that the above operations can be lifted into operations over elements of P^k that leave</Paragraph>
      <Paragraph position="22"> untouched the coordinates not mentioned in the rule and that the lifted operations are continuous mappings. With a slight abuse of notation, we can summarize the foregoing discussion with the equation \[r\] = deindex ∘ project ∘ combine_r ∘ index_r. In the case of the sample lexicon and one-rule grammar presented earlier, \[G\](ℓ) would be</Paragraph>
      <Paragraph position="24"> NP: {··· as before ···}
VP: {··· as before ···}
S: (&quot;Uther storms Cornwall&quot;, {⟨subj agr num⟩ = sg, ...})
   (&quot;many knights sit at the Round Table&quot;, {⟨subj agr num⟩ = pl, ...})
   (&quot;many knights storms Cornwall&quot;, ⊤)</Paragraph>
    </Section>
  </Section>
</Paper>