<?xml version="1.0" standalone="yes"?>
<Paper uid="P88-1030">
  <Title>DEDUCTIVE PARSING WITH MULTIPLE LEVELS OF REPRESENTATION</Title>
  <Section position="4" start_page="242" end_page="245" type="metho">
    <SectionTitle>
PARSING AS DEDUCTION
</SectionTitle>
    <Paragraph position="0"> As just outlined, GB theory decomposes a competent user's knowledge of a language into two components: (i) the universal component (Universal Grammar), and (ii) a set of parameter values and a lexicon, which together constitute the knowledge of that particular language above and beyond the universal component. The relationship between these two components of a human's knowledge of a language and the knowledge of the utterances of that language that they induce can be formally described as follows: we regard Universal Grammar as a logical theory, i.e. a deductively closed set of statements expressed in a specialized logical language, and the lexicon and parameter values that constitute the specific knowledge of a human language beyond Universal Grammar as a set of formulae in that logical language. In the theory of Universal Grammar, these formulae imply statements describing the linguistic properties of utterances of that human language; these statements constitute the knowledge of utterances that the parser computes.</Paragraph>
    <Paragraph position="1"> The parsers presented below compute instances of the 'parse' relation, which is true of a PF-LF pair if and only if there is a D-structure and an S-structure such that the D-structure, S-structure, PF, LF quadruple is well-formed with respect to all of the (parameterized) principles of grammar. For simplicity, the 'phonology' relation is approximated here by the S-structure 'yield' function. Specifically, the input to the language processor is taken to be a PF representation, and the processor produces the corresponding LF representation as output.</Paragraph>
    <Paragraph position="2"> The relationship of the parameter settings and lexicon to the 'parse' relation is sketched in Figure 3.</Paragraph>
    <Paragraph position="3"> Knowledge of the Language</Paragraph>
    <Section position="1" start_page="242" end_page="245" type="sub_section">
      <SectionTitle>
Parameter Settings
</SectionTitle>
      <Paragraph position="0"> headfirst.</Paragraph>
      <Paragraph position="1"> specFirst.</Paragraph>
      <Paragraph position="2"> movesInSyntax(np).  Utterances. It is important to emphasise that the choice of logical language and the properties of utterances computed by the parser are made here simply on the basis of their familiarity and simplicity: no theoretical significance should be attached to them. I do not claim that first-order logic is the 'language of the mind', nor that the knowledge of utterances computed by the human language processor consists of instances of the 'parse' relation (see Berwick and Weinberg 1984 for further discussion of this last point). To construct a deductive parser for GB one builds a specialized theorem-prover for Universal Grammar that relates the parameter values and lexicon to the 'parse' relation, provides it with parameter settings and a lexicon as hypotheses, and uses it to derive the consequences of these hypotheses that describe the utterance of interest. The Universal Grammar inference engine used in the PAD parsers is constructed using a Horn-clause theorem-prover (a Prolog interpreter). The Horn-clause theorem-prover is provided with an axiomatization U of the theory of Universal Grammar as well as the hypotheses H that represent the parameter settings and lexicon. Since a set of hypotheses H implies a consequence F in the theory of Universal Grammar if and only if H ∪ U implies F in first-order logic, a Horn-clause theorem-prover using axiomatization U is capable of deriving the consequences of H that follow in the theory of Universal Grammar. Thus the PAD parsers have the logical structure diagrammed in [...] theory is the actual Prolog definition of 'parse' used in the PAD1 and PAD2 parsers. Thus the top-level structure of the knowledge of language employed by the PAD parsers mirrors the top-level structure of GB theory.</Paragraph>
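      <Paragraph> The idea that a set of hypotheses, taken together with an axiomatization, yields consequences by ordinary Horn-clause inference can be illustrated with a very small sketch. The Python below is not the PAD implementation: it uses forward chaining over ground (propositional) clauses rather than Prolog's backward-chaining SLD resolution, and every clause name in U and H is a hypothetical stand-in chosen only for this illustration.</Paragraph>

```python
# Toy propositional Horn-clause prover (forward chaining).
# Illustrative only: the clause names below are hypothetical, and PAD itself
# relies on Prolog's backward-chaining SLD resolution over first-order clauses.

def consequences(clauses):
    """Return every atom derivable from a list of (head, body) Horn clauses."""
    derived = set()
    changed = True
    while changed:
        changed = False
        for head, body in clauses:
            if head not in derived and all(atom in derived for atom in body):
                derived.add(head)
                changed = True
    return derived

# U: a stand-in for the axiomatization of Universal Grammar (rules).
U = [
    ("verb_precedes_object", ("headFirst",)),
    ("parse_ok", ("verb_precedes_object", "lexicon_has_verb")),
]

# H: hypotheses standing in for parameter settings and lexicon (facts).
H = [
    ("headFirst", ()),
    ("lexicon_has_verb", ()),
]

derived = consequences(H + U)
print("parse_ok" in derived)
```

      <Paragraph> With the hypotheses removed, the same rules derive nothing, which is the sense in which the consequences depend on both components of the knowledge of language.</Paragraph>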
      <Paragraph position="3"> Ideally the internal structure of the various principles of grammar should reflect the internal organization of the principles of GB (e.g. Case assignment should be defined in terms of Government), but for simplicity the principles are axiomatized directly here. For reasons of space a complete description of all of the principles is not given here; however a sketch of one of the principles, the Case Filter, is given in the remainder of this section. The other principles are implemented in a similar fashion.</Paragraph>
      <Paragraph position="4"> The Case Filter as formulated in PAD applies recursively throughout the S-structure, associating each node with one of the three atomic values ass, rec or 0. These values represent the Case properties of the node they are associated with; a node associated with the property ass must be a Case assigner, a node associated with the property rec must be capable of being assigned Case, and a node associated with the property 0 must be neutral with respect to Case. The Case Filter determines if there is an assignment of these values to nodes in the tree consistent with the principles of Case assignment. A typical assignment of Case properties to the nodes of an S-structure in English is shown in 5, where the Case properties of a node are depicted by the boldface annotations on that node. 1</Paragraph>
      <Paragraph position="6"> The Case Filter is parameterized with respect to the predicates 'rightwardCaseAssignment' and 'leftwardCaseAssignment'; if these are specified as parameter settings of the language concerned, the Case Filter permits Case assigners and receivers to appear in the relevant linear order. The lexicon contains definitions of the one-place predicates 'noCase', 'assignsCase' and 'needsCase' which hold of lexical items with the relevant property; these predicates are used by the Case Filter to ensure the associations of Case properties with lexical items are valid.</Paragraph>
      <Paragraph position="7"> 1 These annotations are reminiscent of the complex feature bundles associated with categories in GPSG (Gazdar et al. 1986). The formulation here differs from the complex feature bundle approach in that the values associated with nodes by the Case Filter are not components of that node's category label, and hence are invisible to other principles of grammar. Thus this formulation imposes an informational encapsulation of the principles of grammar that the complex feature approach does not.</Paragraph>
      <Paragraph position="8"> Specifically, the Case Filter licenses the following structures: (2a) a constituent with no Case properties may have a Case assigner and a Case receiver as daughters iff they are in the appropriate order for the language concerned, (2b) a constituent with no Case properties may have any number of daughters with no Case properties, (2c) a constituent with Case property C may be realized as a lexical item W if W is permitted by the lexicon to have Case property C, and (2d) INFL' assigns Case to its left if its INFL daughter is a Case assigner.</Paragraph>
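      <Paragraph> The four licensing conditions can be sketched as a single recursive check. The following Python is an interpretation under stated assumptions, not the Prolog used in PAD: trees are encoded as (label, [children]) tuples with words as leaves, the lexicon dictionary plays the role of the three lexical predicates, a constituent with a single daughter is assumed to pass its Case property down to it, and condition (2d) is read as licensing a receiver immediately to the left of an INFL' whose INFL daughter is an assigner.</Paragraph>

```python
# Sketch of the Case Filter rules (2a)-(2d) over a toy tree encoding.
# Assumptions (not from PAD): trees are (label, [children]) tuples, leaves are
# words, and the lexicon maps each word to the Case properties it may bear.

ASS, REC, NONE = "ass", "rec", "0"

def case_ok(node, prop, params, lexicon):
    """True if `node` can be associated with Case property `prop`."""
    if isinstance(node, str):                      # (2c) lexical item
        return prop in lexicon.get(node, set())
    label, kids = node

    def ok(kid, p):
        return case_ok(kid, p, params, lexicon)

    if prop == NONE:
        if all(ok(k, NONE) for k in kids):         # (2b) all daughters neutral
            return True
        if len(kids) == 2:
            left, right = kids
            # (2a) assigner and receiver in a parameter-licensed order
            if "rightwardCaseAssignment" in params and ok(left, ASS) and ok(right, REC):
                return True
            if "leftwardCaseAssignment" in params and ok(left, REC) and ok(right, ASS):
                return True
            # (2d) INFL' assigns Case to its left
            if (isinstance(right, tuple) and right[0] == "INFL'"
                    and ok(left, REC) and ok(right, ASS)):
                return True
        return False
    if prop == ASS and label == "INFL'":
        # (2d) INFL' counts as an assigner if its INFL daughter is one
        infl = [k for k in kids if isinstance(k, tuple) and k[0] == "INFL"]
        rest = [k for k in kids if k not in infl]
        return len(infl) == 1 and ok(infl[0], ASS) and all(ok(k, NONE) for k in rest)
    # Assumed: any other constituent passes the property to a single daughter.
    return len(kids) == 1 and ok(kids[0], prop)

lexicon = {"John": {REC}, "Mary": {REC}, "will": {ASS}, "see": {ASS}}
tree = ("S", [("NP", ["John"]),
              ("INFL'", [("INFL", ["will"]),
                         ("VP", [("V", ["see"]), ("NP", ["Mary"])])])])
print(case_ok(tree, NONE, {"rightwardCaseAssignment"}, lexicon))
```

      <Paragraph> Changing the parameter set to contain only 'leftwardCaseAssignment' makes the same tree fail in this sketch, because the verb then precedes the object it would have to assign Case to.</Paragraph>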
      <Paragraph position="9"> This axiomatization of Universal Grammar together with the parameter values and lexicon for English is used as the axiom set of a Prolog interpreter to produce the parser called PAD1. Its typical behaviour is shown below. 2</Paragraph>
      <Paragraph position="11"> Because it uses the SLD inference control strategy of Prolog with the axiomatization of Universal Grammar shown above, PAD1 functions as a 'generate and test' parser.</Paragraph>
      <Paragraph position="12"> Specifically, it enumerates all D-structures that satisfy X'-theory, filters those that fail to satisfy Θ-theory, computes the corresponding S-structures using Move-α, removes all S-structures that fail to satisfy the Case Filter, and only then determines if the terminal string of the S-structure is the string it was given to parse. Since the X' principle admits infinitely many D-structures the resulting procedure is only a semi-decision procedure, i.e. the parser is not guaranteed to terminate on ungrammatical input.</Paragraph>
      <Paragraph position="13"> 2 For the reasons explained below, the X' principle used in this run of the parser was restricted to allow only finitely many D-structures.</Paragraph>
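      <Paragraph> The 'generate and test' behaviour, and the reason it only semi-decides, can be seen in a toy setting. The sketch below is an assumption-laden analogue rather than PAD1 itself: the hypothetical generator enumerates an infinite family of right-branching structures over the single word "a", and the only test applied is the comparison of the terminal string with the input. The optional limit parameter mirrors the restriction to finitely many structures.</Paragraph>

```python
# Generate-and-test over an infinite candidate space (a toy stand-in for
# enumerating D-structures). The grammar here is hypothetical: candidates are
# right-branching nestings of the word "a", and the "filter" is a yield check.
from itertools import count, islice

def candidates():
    """Enumerate infinitely many toy structures: 'a', ('a','a'), ('a',('a','a')), ..."""
    tree = "a"
    for _ in count():
        yield tree
        tree = ("a", tree)

def terminal_yield(tree):
    """Collect the terminal string of a toy structure."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return terminal_yield(left) + terminal_yield(right)

def parse(words, limit=None):
    """Return the first candidate whose yield is `words`.

    With limit=None this only semi-decides: it loops forever if no candidate
    matches. A finite limit mirrors a restriction to finitely many structures.
    """
    stream = candidates() if limit is None else islice(candidates(), limit)
    for tree in stream:
        if terminal_yield(tree) == words:
            return tree
    return None

print(parse(["a", "a", "a"]))
print(parse(["a", "b"], limit=50))
```

      <Paragraph> Calling parse(["a", "b"]) without a limit would never return, since no generated structure has that yield and the enumeration never ends.</Paragraph>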
      <Paragraph position="14"> Clearly the PAD1 parser does not use its knowledge of language in an efficient manner.</Paragraph>
      <Paragraph position="15"> It would be more efficient to co-routine between the principles of grammar, checking each existing node for well-formedness with respect to these principles and ensuring that the terminal string of the partially constructed S-structure matches the string to be parsed before creating any additional nodes. Because the Parsing as Deduction framework conceptually separates the knowledge used by the processor from the manner in which that knowledge is used, we can use an inference control strategy that applies the principles of grammar in the manner just described. The PAD2 parser incorporates the same knowledge of language as PAD1 (in fact the two are textually identical), but it uses an inference control strategy inspired by the 'freeze' predicate of Prolog-II (Cohen 1985, Giannesini et al. 1986) to achieve this goal.</Paragraph>
      <Paragraph position="16"> The control strategy used in PAD2 allows inferences using specified predicates to be delayed until specified arguments to these predicates are at least partially instantiated.</Paragraph>
      <Paragraph position="17"> When some other application of an inference rule instantiates such an argument the current sequence of inferences is suspended and the delayed inference performed immediately.</Paragraph>
      <Paragraph position="18"> Figure 6 lists the predicates that are delayed in this manner, and the argument that they require to be at least partially instantiated before inferences using them will proceed.</Paragraph>
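      <Paragraph> The delaying mechanism can be imitated outside Prolog with a single-assignment variable that carries a list of suspended goals. The Python below is only a sketch of that idea under stated assumptions: it is not PAD2's meta-interpreter, it waits for a full binding rather than a partial instantiation, and the names Var, freeze and the goal labels are hypothetical.</Paragraph>

```python
# Minimal imitation of a Prolog-II style 'freeze' using Python callbacks.
# This is an illustration of delayed goals only; it is not PAD2's
# meta-interpreter, and the class and function names are hypothetical.

class Var:
    """A single-assignment logic variable that can carry suspended goals."""

    def __init__(self):
        self.bound = False
        self.value = None
        self.suspended = []

    def bind(self, value):
        if self.bound:
            raise ValueError("variable already bound")
        self.bound, self.value = True, value
        pending, self.suspended = self.suspended, []
        for goal in pending:            # wake delayed goals immediately
            goal(value)

def freeze(var, goal):
    """Run goal(value) now if var is bound, otherwise delay it until binding."""
    if var.bound:
        goal(var.value)
    else:
        var.suspended.append(goal)

trace = []
node = Var()
freeze(node, lambda v: trace.append(("case_filter", v)))   # delayed
freeze(node, lambda v: trace.append(("move_alpha", v)))    # delayed
trace.append(("phonology", "instantiating top node"))
node.bind("S")                                             # wakes both goals
print(trace)
```

      <Paragraph> The recorded trace shows the undelayed step running first and the two suspended goals running as soon as the node is instantiated, which is the ordering the control strategy relies on.</Paragraph>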
      <Paragraph position="19"> With this control strategy the parsing process proceeds as follows. Inferences using the X', Θ, Case, Move-α and LF-movement principles are immediately delayed since the relevant structures are uninstantiated. The 'phonology' principle (a simple recursive tree-walking predicate that collects terminal items) is not delayed, so the parser begins performing inferences associated with it. These instantiate the top node of the S-structure, so the delayed inferences resulting from the Case Filter, Move-α and LF-movement are performed. The inferences associated with Move-α result in the instantiation of the top node(s) of the D-structure, and hence the delayed inferences associated with the X' and Θ principles are also performed. Only after all of the principles have applied to the S-structure node instantiated by the 'phonology' relation and the corresponding D-structure node(s) instantiated by Move-α are any further inferences associated with the 'phonology' relation performed, causing the instantiation of further S-structure nodes and the repetition of the cycle of activation and delaying.</Paragraph>
      <Paragraph position="20"> Thus the PAD2 parser simultaneously constructs D-structure, S-structure and LF representations in a top-down left-to-right fashion, functioning in effect as a recursive descent parser. This top-down behaviour is not an essential property of a parser such as PAD2; using techniques based on those described by Pereira and Shieber (1987) and Cohen and Hickey (1987) it should be possible to construct parsers that use the same knowledge of language in a bottom-up fashion.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="245" end_page="246" type="metho">
    <SectionTitle>
TRANSFORMING THE AXIOMATIZATION
</SectionTitle>
    <Paragraph position="0"> In this section I sketch a program transformation which transforms the original axiomatization of the grammar to an equivalent axiomatization that in effect exhibits this 'co-routining' behaviour when executed using Prolog's SLD inference control strategy. Interestingly, a data-flow analysis of this transformed axiomatization (viewed as a Prolog program) justifies a further transformation that yields an equivalent program that avoids the construction of D-structure trees altogether. The resulting parsers, PAD3 - PAD5, use the same parameter settings and lexicon as PAD1 and PAD2, and they provably compute the same PF-LF relationship as PAD2 does. The particular techniques used to construct these parsers depend on the internal details of the formulation of the principles of grammar adopted here - specifically on their simple recursive structure - and I do not claim that they will generalize to more extensive formulations of these principles.</Paragraph>
    <Paragraph position="1"> Recall that the knowledge of a language incorporated in PAD1 and PAD2 consists of two separate components, (i) parameter values and a lexicon, and (ii) an axiomatization U of the theory of Universal Grammar. The axiomatization U specifies the deductively closed set of statements that constitute the theory of Universal Grammar, and clearly any axiomatization U' equivalent to U (i.e. one which defines the same set of statements) defines exactly the same theory of Universal Grammar. Thus the original axiomatization U of Universal Grammar used in the PAD parsers can be replaced with any equivalent axiomatization U' and the system will entail exactly the same knowledge of the utterances of the language. A deductive parser using U' in place of U may perform a different sequence of inference steps but ultimately it will infer an identical set of consequences (ignoring nontermination). The PAD3 parser uses the same parameter values and lexicon as PAD1 and PAD2, but it uses a reaxiomatization of Universal Grammar obtained by applying the Unfold/Fold transformation described and proven correct by Tamaki and Sato (1984) and Kanamori and Horiuchi (1988). Essentially, the Unfold/Fold transformation is used here to replace a sequence of predicates each of which recursively traverses the same structure by a single predicate recursive on that structure that requires every node in that structure to meet all of the constraints imposed by the original sequence of predicates. In the PAD3 parser the X', Θ, Move-α, Case and Phonology principles used in PAD1 and PAD2 are folded and replaced by the single predicate 'p' that holds of exactly the D-structure, S-structure, PF triples admitted by the conjunction of the original principles.</Paragraph>
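    <Paragraph> The effect of replacing several traversals of one structure by a single traversal can be shown with two toy constraints. The Python below is a hand-written analogue only: the Unfold/Fold transformation proper is a source-to-source rewrite of Horn clauses, and the node encoding and the two 'principles' used here are assumptions made for the sake of the example.</Paragraph>

```python
# Two separate recursive checks over a tree versus one fused check.
# The node format and the two toy constraints are assumptions for illustration;
# the Unfold/Fold transformation itself is a source-to-source rewrite of
# Horn clauses, which this hand-written Python only mimics.

def binary_branching(tree):
    """Toy 'principle' 1: every node has at most two daughters."""
    label, kids = tree
    return len(kids) in (0, 1, 2) and all(binary_branching(k) for k in kids)

def labelled(tree):
    """Toy 'principle' 2: every node has a non-empty label."""
    label, kids = tree
    return bool(label) and all(labelled(k) for k in kids)

def well_formed_sequential(tree):
    """Original style: each principle traverses the whole tree in turn."""
    return binary_branching(tree) and labelled(tree)

def well_formed_fused(tree):
    """Fused style: one traversal checks both constraints at every node."""
    label, kids = tree
    return (len(kids) in (0, 1, 2) and bool(label)
            and all(well_formed_fused(k) for k in kids))

good = ("S", [("NP", []), ("VP", [("V", []), ("NP", [])])])
bad = ("S", [("NP", []), ("", [])])
print(well_formed_sequential(good), well_formed_fused(good))
print(well_formed_sequential(bad), well_formed_fused(bad))
```

    <Paragraph> Both versions accept and reject the same trees; the fused version simply checks every constraint on a node before descending to its daughters.</Paragraph>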
    <Paragraph position="2"> Because the reaxiomatization technique used here replaces the original axiomatization of PAD1 and PAD2 with an equivalent one (in the sense of the minimum Herbrand model semantics), the PAD3 parser provably infers exactly the same knowledge of language as PAD1 and PAD2. Because PAD3's knowledge of the principles of grammar that relate D-structure, S-structure and PF is now represented by the single recursive predicate 'p' that checks the well-formedness of a node with respect to all of the relevant principles, PAD3 exhibits the 'co-routining' behaviour of PAD2 rather than the 'generate and test' behaviour of PAD1, even when used with the standard SLD inference control strategy of Prolog. 3 PAD3 constructs D-structures, just as PAD1 and PAD2 do. However, a simple analysis of the data dependencies in the PAD3 program shows that in this particular case no predicate uses the D-structure value returned by a call to predicate 'p' (even when 'p' calls itself recursively, the D-structure value returned is ignored). Therefore replacing the predicate 'p' with a predicate 'pl' exactly equivalent to 'p' except that it avoids construction of any D-structures does not affect the set of consequences of these axioms. 4 The PAD4 parser is exactly the same as the PAD3 parser, except that it uses the predicate 'pl' instead of 'p', and it therefore computes exactly the same PF-LF relationship as all of the other PAD parsers, but it avoids the construction of any D-structure nodes. That is, the PAD4 parser makes use of exactly the same parameter settings and lexicon as the other PAD parsers, and it uses this knowledge to compute exactly the same knowledge of utterances. It differs from the other PAD parsers in that it does not use this knowledge to explicitly construct a D-structure representation of the utterance it is parsing.</Paragraph>
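    <Paragraph> Removing an output that no caller reads can be sketched as follows. This Python is a loose analogue of the step from 'p' to 'pl', not the PAD code: the tree encoding and the undo_movement helper are hypothetical, and in this toy version the D-structure returned by a recursive call feeds only into the D-structure that is returned in turn, so it has no influence on the well-formedness result.</Paragraph>

```python
# Dropping an output that no caller reads (a toy analogue of 'p' versus 'pl').
# The tree encoding and the 'undo_movement' placeholder are hypothetical.

def undo_movement(label):
    """Placeholder for relating an S-structure label to a D-structure label."""
    return label.lower()

def p(s_tree):
    """Check an S-structure node and also build a D-structure nobody reads."""
    label, kids = s_tree
    results = [p(k) for k in kids]
    ok = bool(label) and all(r[0] for r in results)
    d_tree = (undo_movement(label), [r[1] for r in results])
    return ok, d_tree

def pl(s_tree):
    """Same check as p, with the D-structure construction removed."""
    label, kids = s_tree
    return bool(label) and all(pl(k) for k in kids)

tree = ("S", [("NP", []), ("VP", [("V", [])])])
print(p(tree)[0] == pl(tree))
```

    <Paragraph> Since the first component of p's result never depends on the second, deleting the code that builds the second component leaves the computed relation unchanged.</Paragraph>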
    <Paragraph position="3"> This same combination of the Unfold/Fold transformation followed by data dependency analysis can also be performed on all of the principles of grammar simultaneously. The Unfold/Fold transformation produces a predicate in which a data-dependency analysis identifies both D-structure and S-structure values as ignored. The PAD5 parser uses the resulting predicate as its axiomatization of Universal Grammar; thus PAD5 is a parser which uses exactly the same parameter values and lexicon as the earlier parsers to compute exactly the same PF-LF relationship as these parsers, but it does so without explicitly constructing either D-structures or S-structures.</Paragraph>
    <Paragraph position="4"> To summarize, this section presents three new parsers. The first, PAD3, utilized a reaxiomatization of Universal Grammar, which when coupled with the SLD inference control strategy of Prolog resulted in a parser that constructs D-structures and S-structures 'in parallel', much like PAD2. A data dependency analysis of the PAD3 program revealed that the D-structures computed were never used, and PAD4 exploits this fact to avoid the construction of D-structures entirely. The techniques used to generate PAD4 were also used to generate PAD5, which avoids the explicit construction of both D-structures and S-structures.</Paragraph>
    <Paragraph position="5"> 3 Although in terms of control strategy PAD3 is very similar to PAD2, it is computationally much more efficient than PAD2, because it is executed directly, whereas PAD2 is interpreted by the meta-interpreter with the 'delay' control structure.</Paragraph>
    <Paragraph position="6"> 4 The generation of the predicate 'pl' from the predicate 'p' can be regarded as an example of static garbage-collection (I thank T. Hickey for this observation). Clearly, a corresponding run-time garbage collection operation could be performed on the nodes of the partially constructed D-structures in PAD2.</Paragraph>
  </Section>
  <Section position="6" start_page="246" end_page="247" type="metho">
    <SectionTitle>
CONCLUSION
</SectionTitle>
    <Paragraph position="2"> In this paper I described several deductive parsers for GB theory. The knowledge of language that they used incorporated the top-level structure of GB theory, thus demonstrating that parsers can actually be built that directly reflect the structure of this theory.</Paragraph>
    <Paragraph position="3"> This work might be extended in several ways. First, the fragment of English covered by the parser could be extended to include a wider range of linguistic phenomena. It would be interesting to determine if the techniques described here to axiomatize the principles of grammar and to reaxiomatize Universal Grammar to avoid the construction of D-structures could be used on this enlarged fragment - a program transformation for reaxiomatizing a more general formulation of Move-α is given in Johnson (1988b).</Paragraph>
    <Paragraph position="4"> Second, the axiomatization of the principles of Universal Grammar could be reformulated to incorporate the 'internal' deductive structure of the components of GB theory. For example, one might define c-command or government as primitives, and define the principles in terms of these. It would be interesting to determine if a deductive parser can take advantage of this internal deductive structure in the same way that the PAD parsers utilized the deductive relationships between the various principles of grammar.</Paragraph>
    <Paragraph position="5"> Third, it would be interesting to investigate the performance of parsers using various inference control strategies. The co-routining strategy employed by PAD2 is of obvious interest, as are its deterministic and non-deterministic bottom-up and left-corner variants. These only scratch the surface of possibilities, since the Parsing as Deduction framework allows one to straightforwardly formulate control strategies sensitive to the various principles of grammar. For example, it is easy to specify inference control strategies that delay all computations concerning particular principles (e.g. binding theory) until the end of the parsing process.</Paragraph>
    <Paragraph position="6"> Fourth, one might attempt to develop specialized logical languages that are capable of expressing knowledge of languages and knowledge of utterances in a more succinct and computationally useful fashion than the first-order languages.</Paragraph>
  </Section>
</Paper>