<?xml version="1.0" standalone="yes"?>
<Paper uid="C92-1028">
  <Title>MODULARITY, PARALLELISM, AND LICENSING IN A PRINCIPLE-BASED PARSER FOR GERMAN</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
MODULARITY, PARALLELISM, AND LICENSING
IN A PRINCIPLE-BASED PARSER FOR GERMAN
SEBASTIAN MILLIES
</SectionTitle>
    <Paragraph position="0"/>
  </Section>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> This paper presents a direct implementation of Government-Binding theory in a parser for German, which faithfully models the modular structure of the theory. The modular design yields a flexible environment, in which it is possible to define and test various versions of principles and parameters. The several modules of linguistic theory and the parser proper are interleaved in parallel fashion for early elimination of ungrammatical structures.</Paragraph>
    <Paragraph position="1"> Efficient processing of global constraints is made possible by the concept of licensing, and the use of tree indexing techniques.</Paragraph>
  </Section>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Government-Binding theory 1 (henceforth &quot;GB&quot;) seeks to describe human knowledge of language by positing a small number of highly general principles, which interact to produce highly specific effects. Most of these principles are regarded as universal principles. Specific construction types in different human languages result from applying language-particular versions of the universal principles, derived from them by parametrization. GB tries to avoid language-particular and construction-specific rules. Only recently has the idea of &quot;principle-based&quot; parsers, which derive structures by deduction from an explicit representation of the principles, come into the focus of attention. (Footnote 1: By that term I will mean not only the particular version of the theory set forth in [Chom81], but rather the entire family of theories of the principles-and-parameters type inspired by Chomsky's work.)</Paragraph>
    <Paragraph position="1"> Importantly, however, GB does not specify any particular relation between the principles and a parser which is supposed to use them. As a consequence, extant GB-parsers reflect the internal organization of GB-theory to varying degrees. This paper reports on an implementation of a GB-parser for German, which faithfully mirrors the modular structure of (much of) GB-theory in the way it represents linguistic knowledge. In discussing the parser, I will presuppose a basic familiarity with GB-theory. 2 According to Mark Johnson (cf. [John88, John89]), the most direct relation between a parser and linguistic theory can be observed in a &quot;parsing-as-deduction&quot; approach. Johnson's project is to formalize linguistics in some suitable subset of first-order logic, and use this formalization as input for an automatic theorem prover, such as Prolog, without any intervening recoding. This proposal, however, suffers from some well-known difficulties, such as undecidability, left-recursion (in Prolog), and a tendency to produce generate-and-test algorithms (with modules such as X'-theory and move-α as generators, and other parts of grammar as filters). Furthermore, there is no place in the model for those aspects of language processing which do not have to do with knowledge of grammar, but rather with procedural considerations (resolution of ambiguities in PP-attachment and the like). (Footnote 2: The reader is referred to [Sel85] for a short introduction. For a detailed discussion, see one of the standard texts, e.g. [LU88].)</Paragraph>
    <Paragraph position="2"> ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992 163 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992</Paragraph>
    <Paragraph position="3"> Johnson proposes to cope with the difficulty about indeterminacy by using the freeze construct (known, e.g., from Prolog-II) to achieve pseudo-parallel execution of generators and tests. The freeze control structure suspends the execution of goals depending on the instantiation of specified variables. This relaxes some of the procedural constraints on the formulation of logic programs, and brings out the logical structure of a program more forcefully. The current approach is similar to Johnson's in that it also uses a formalization of linguistic principles in Horn logic, and executes this formalization in a parallel fashion using freeze. It differs from that approach in that the principles do not themselves constitute the parser, but rather work in tandem with a specialized module, which implements the procedural aspects of parsing. Indeterminacy in the linguistic component is further reduced by having lexical information constrain X'-theory from being fully productive, and by using an extension to the concept of &quot;licensing&quot; ([Abn86]) to guide the introduction of empty categories. The total effect is to allow the formalization of the theory to be maximally declarative, and at the same time to ensure decidability of the parsing problem for all possible input. Another key idea is to use indexing techniques on trees for the efficient enforcement of conditions on potentially arbitrarily large parts of the parse-tree (e.g., subjacency, or the ECP).</Paragraph>
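    <Paragraph position="4"> The freeze control structure described above can be illustrated outside Prolog. What follows is a minimal Python sketch, not the parser's meta-interpreter: the names Var, bind and freeze are hypothetical, and unification and backtracking are left out entirely. It only shows how a test that is stated in advance stays suspended until the variable it depends on is instantiated.

```python
class Var:
    """A single-assignment variable that can carry suspended goals."""

    def __init__(self, name):
        self.name = name
        self.bound = False
        self.value = None
        self.suspended = []  # goals waiting for this variable

    def bind(self, value):
        """Bind the variable and wake its goals; False means a goal failed."""
        if self.bound:
            raise ValueError(f"{self.name} is already bound")
        self.bound = True
        self.value = value
        goals, self.suspended = self.suspended, []
        return all(goal(value) for goal in goals)


def freeze(var, goal):
    """Run goal(value) now if var is bound, otherwise suspend it on var."""
    if var.bound:
        return goal(var.value)
    var.suspended.append(goal)
    return True


# The filter is stated before any generator has produced a value.
category = Var("category")
freeze(category, lambda c: c in {"NP", "CP"})
accepted = category.bind("NP")      # the suspended test runs here

other = Var("other")
freeze(other, lambda c: c in {"NP", "CP"})
rejected = not other.bind("VP")     # the same test rejects this binding
```

In this sketch the membership test is applied at the moment of binding rather than after a complete structure has been generated, which is the interleaving of generators and filters that the text attributes to freeze.</Paragraph>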
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 Implementation of a GB-Parser
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
</SectionTitle>
      <Paragraph position="0"> Figure 1 is a (slightly simplified) schema of the system architecture. The entire system has been programmed on an IBM RT in Quintus-Prolog 2.4 under Unix/AIX. As Quintus does not have a freeze predicate, a meta-interpreter has been implemented to provide one. The interpreter is fully transparent to the grammar designer; in particular, it handles the cut, and knows about Quintus' module concept. The schema makes the modular organization of the system very clear.</Paragraph>
      <Paragraph position="1"> This kind of modularity makes for a great deal of flexibility. The aim of this work is not just to &quot;hardwire&quot; some particular version of GB into a parser, but rather to provide an environment where different versions of GB-theoretical grammars can be tested and evaluated. In the program, this aim has been approached closely, as the definitions of the principles are not spread out over several components of the grammar, but are textually localized, and procedurally independent from each other and the parsing module. As a consequence, they can be updated or experimented with quite easily. The environment also provides tools for facilitating grammar development, such as functions for installing new sets of parameters, a customizable pretty printer, and a small tracing facility. We will now in turn discuss some of the components shown in Figure 1.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 The Parsing Module
</SectionTitle>
      <Paragraph position="0"> The parsing module is independent from the rest of the system, and can be exchanged for a different module, implementing a different parsing strategy. In this way, it is possible to model performance aspects of human sentence processing without having to change the declarative representation of linguistic knowledge as such. The language- and grammar-independence of the parsing module is manifested by its making use of very general structure-building instructions, which do not mention grammatical notions at all, except on a very high level and in an extremely unspecific manner. All the details of the representation of linguistic knowledge are hidden from the parsing module. Typical instructions are: read the next input word; insert a partial tree into the structure that is being built; have a maximal projection made; insert an empty category; check local/global grammatical constraints.</Paragraph>
      <Paragraph position="2"> [Figure 1: schema of the system architecture; one recoverable label reads &quot;Interpreter for Prolog with pseudo-parallel execution&quot;.] The parser directly reconstructs S-structure. There is no need to view D-structure as a level of representation distinct from S-structure, because D-structural representations are determined on the level of S-structure by the co-indexing of moved constituents with their traces. At present, the parser uses a simple head-driven method of structure building: it proceeds from left to right through the input string, projects every word to the phrasal level, and pushes all projections into a queue until it finds the head of the substructure that is being analyzed. It then inserts this substructure into the analysis tree and tries to empty the queue.</Paragraph>
      <Paragraph position="3"> E.g., while parsing the sentence daß Hans Maria liebt (literally, &quot;that John Mary loves&quot;), the parser will first project daß to CP, push two NPs onto the queue, project liebt to IP, and then empty the queue. The parser can handle head-complement structures of German. It cannot handle adjunction, which is a serious restriction, to be lifted in later versions of the parser. The types of phenomena currently covered are: main and subordinate clauses (both V2 and verb-final) nested to arbitrary depth, wh-movement (both direct and indirect questions), infinitives (ECM, Raising, Control), passive, prenominal genitives and adjectival modification, and agreement between determiners, adjectives, nouns, and verbs.</Paragraph>
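      <Paragraph position="4"> The head-driven strategy described above can be sketched in a few lines of Python. This is an illustration under strong assumptions, not the parser's Prolog code: the toy LEXICON, the arity field and the tuple representation of trees are all hypothetical, and only a single verb-final clause of the kind used in the example is handled.

```python
from collections import deque

# Hypothetical toy lexicon: word -> (phrasal category, number of queued
# phrases the word absorbs when it turns out to be the awaited head).
LEXICON = {
    "daß": ("CP", 0),
    "Hans": ("NP", 0),
    "Maria": ("NP", 0),
    "liebt": ("IP", 2),
}


def parse(words):
    """Return a nested (category, word, children) tree for one clause."""
    queue = deque()   # projections waiting for their head
    root = None       # projection of the complementizer, if there is one
    for word in words:
        category, arity = LEXICON[word]
        node = (category, word, [])
        if category == "CP":
            root = node
        elif arity == 0:
            queue.append(node)          # no head yet: park the projection
        else:
            # Head found: empty the queue into the head's projection.
            if len(queue) != arity:
                raise ValueError(f"{word} expects {arity} arguments")
            while queue:
                node[2].append(queue.popleft())
            if root is None:
                root = node
            else:
                root[2].append(node)
    if queue:
        raise ValueError("unattached projections remain in the queue")
    return root


tree = parse(["daß", "Hans", "Maria", "liebt"])
```

For the example sentence, the sketch projects the complementizer to CP, parks two NPs, and attaches both under the IP projected from the verb once that head has been read. Constraint checking, empty categories and V2 clauses are outside its scope.</Paragraph>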
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2 Linguistic Knowledge
</SectionTitle>
      <Paragraph position="0"> The following modules of GB-theory have been implemented: X'-theory, move-α, case theory, θ-theory, the projection principle, bounding theory, government theory (specifically, a notion of &quot;barrier&quot; (cf. [Chom86]) is included in the definition of the ECP), spec-head-agreement, and spec-head-licensing. X'-theory is constrained to project only nodes licensed by lexical properties of the head (specifically, subcategorization and θ-marking license the projection of argument nodes in a structure). 3 Linguistic constraints are classified according to their potential domain of application into local constraints (which apply internal to a phrase) and global constraints (which have a potentially unlimited domain of application). Currently, the ECP and the subjacency principle are implemented as examples of global constraints. As for local constraints, there are the Head Feature Principle (similar to GPSG's Head Feature Convention), case-marking, the first half of the θ-criterion (guaranteeing that every argument gets at least one θ-role), L-marking, and spec-head-agreement/licensing. All local constraints are enforced immediately after lexical projection has taken place. This is true also for spec-head-licensing relations: these conditions can be locally activated even before anything is known about the actual content of the specifier position.</Paragraph>
      <Paragraph position="1"> They will be explicitly consulted only once: using the freeze mechanism, they will afterwards be active in the background, in parallel fashion, and will prevent the parser from building any unlicensed structure.</Paragraph>
      <Paragraph position="2"> The following parameters can be set: the positions of heads and specifiers relative to the complements, the number and categorial identity of bounding nodes (for subjacency), the number and categorial identity of potential barriers, the categorial identity of L-marking heads and lexical heads, and the possibility of V-to-I (I-to-C) movement. 4</Paragraph>
      <Paragraph position="3"> Footnote 3: This is not as ad hoc a solution as it may seem. In the linguistic literature, it has been suggested several times that phrase structure is in some way derivative from other notions, such as case- or θ-marking. There is no good reason for viewing X'-theory as an unconstrained generator. Footnote 4: This is just stipulated by means of a &quot;parameter&quot;. There is no explanation of head-movement in the parser.</Paragraph>
      <Paragraph position="4"> Chain formation and enforcement of global constraints. Case is assigned to chains, so that every chain gets exactly one case. Similarly, every chain is assigned exactly one θ-role. These requirements are known as the &quot;case filter&quot; and the &quot;θ-criterion&quot;, respectively. Chains, however, can be arbitrarily long, so that these requirements cannot be locally enforced. The same is true of the subjacency principle and the ECP, which constrain the relation between traces and their antecedents. So there are three different questions to answer: 1. Under what circumstances may traces be introduced? 2. How are chains formed? How are the case filter and θ-criterion enforced on chains? 3. How are subjacency and ECP enforced? As a first step towards answering these questions, let us accept the following condition (taken from [Abn86]): A structure is well-formed only if every element in it is licensed. Abney takes licensing relations to be unique (i.e., every element is licensed by a unique relation), lexical, and local (i.e., valid under sisterhood). As we observed, the locality requirement obviously will not do. We will relax it by positing principle (L):</Paragraph>
      <Paragraph position="5"> (L) Every element in a structure is licensed either locally (in Abney's sense), or by locally binding an element which in turn is licensed according to principle (L).</Paragraph>
      <Paragraph position="6"> This gives us a way to answer questions 1. and 2.: Arguments and their traces may be introduced into a structure as long as there is a chance that they will end up as local antecedents of some independently licensed trace. Take the case of θ-assignment: In Figure 2, the trace in SpecIP is licensed by virtue of being a local binder of a trace which is licensed by θ-marking, and Hans is licensed by binding the trace in SpecIP. This is implemented by putting &quot;requests&quot; for θ-roles in a set associated with each element (requests are noted as superscripts in Figure 2). A θ-request in a chain is satisfied by an element that is θ-marked. The first half of the θ-criterion, which requires every chain to have at least one θ-role, is thus automatically enforced, by positing: (S) Every request must be satisfied.</Paragraph>
      <Paragraph position="7"> The second half of the criterion can be enforced by our putting &quot;offers&quot; for θ-roles on a list as well (subscripts in Figure 2). The offers associated with a chain are determined by multi-set union over the offers associated with the chain elements. We then posit that there may be at most one offer per chain. Now, what about case-marking? Obviously, the case filter is so similar to the θ-criterion as to be amenable to the same treatment. However, note that treating case-assignment as a licensing relation in this way is tantamount to giving up Abney's uniqueness condition as well. In Figure 2, Hans will be licensed by two relations. A linguist might even want to posit still other licensing relations. So let us put forward the condition of &quot;relative uniqueness&quot;: (RU) Every licensing relation must be offered in a chain at most once.</Paragraph>
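      <Paragraph position="9"> Conditions (S) and (RU) can be read as two checks over the requests and offers collected along a chain. The Python sketch below illustrates that reading and nothing more: the function name, the dictionary layout and the example chain are hypothetical, and the top-down inheritance of requests used by the actual parser is not modelled.

```python
from collections import Counter


def chain_is_licensed(chain):
    """Check (S) and (RU) for one chain, given as a list of elements.

    Each element is a dict with a set of requests and a list of offers;
    both hold names of licensing relations such as "theta" or "case".
    """
    requests = set()
    offers = Counter()
    for element in chain:
        requests |= element["requests"]
        offers.update(element["offers"])   # multiset union: counts add up
    # (S): every request is met by at least one offer in the chain.
    satisfied = all(offers[r] >= 1 for r in requests)
    # (RU): no licensing relation is offered more than once in the chain.
    unique = all(n == 1 for n in offers.values())
    return satisfied and unique


# Hypothetical chain in the spirit of Figure 2: the moved argument, a trace
# in SpecIP that receives case, and a lower trace that is theta-marked.
hans = {"requests": {"theta", "case"}, "offers": []}
spec_ip_trace = {"requests": set(), "offers": ["case"]}
theta_trace = {"requests": set(), "offers": ["theta"]}

ok = chain_is_licensed([hans, spec_ip_trace, theta_trace])
no_case = chain_is_licensed([hans, theta_trace])
two_roles = chain_is_licensed([hans, spec_ip_trace, theta_trace, theta_trace])
```

A chain with an unanswered case request fails (S), and a chain in which the θ-role is offered twice fails (RU), so the two checks together play the role of the case filter and both halves of the θ-criterion.</Paragraph>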
      <Paragraph position="8"> Taken together, (L), (S), and (RU) answer questions 1. and 2. from above. 5 The solution has been implemented. The actual implementation, however, does not follow the inefficient strategy of constructing chains after waiting for locally licensed traces to appear, but rather reverses the process: the parsing module follows a first-fit strategy, inserting elements top-down in the highest possible position, hypothesizing that these elements will be licensed according to principle (L). These hypotheses (i.e., the presence of unsatisfied requests) license the further appearance of traces in a chain. This method even eliminates the need for explicit chain construction. Instead, requests are simply inherited from the local antecedent down the tree until they are cancelled. 6 Let us turn to question 3. In doing so, let us also consider how expensive it is to check for subjacency and antecedent government. (Footnote 5: R. Frank ([Fra90]) has independently arrived at a similar solution within the framework of TAGs. Footnote 6: The module for chain construction can be seen as an interpreter exploiting the principles of grammar, which are in this case not used directly in parsing, cf. M. Crocker's discussion of this point in [Cro91].)</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
</SectionTitle>
    <Paragraph position="4"> [Figure 3] We shall see that with an indexing scheme on trees the check can be done in log(n) time, where n is the size of the tree. 7 Let us take subjacency as an example. The idea is to label the root of the tree with a set of k+1 indices, where k is the maximal number of bounding nodes that may be crossed by move-α. Indices are inherited down the tree, such that at every bounding node a new, unique index is added, and the oldest index is not passed downwards. Figure 3 illustrates this. The following is then true: (Subjacency) α is subjacent to β iff the index sets on α and γ are not disjoint, where γ is the lowest common ancestor of α and β. (Footnote 7: Indexing schemes were originally developed by L. Latecki for the analysis of scope ambiguities and command relations ([Lat91]).)</Paragraph>
    <Paragraph position="5"> Nodes in the tree have identifiers that specify a path from the root to the node (as there are only binary trees, these paths are given by sequences of 1's and 0's). Thus, finding the lowest common ancestor of two nodes is no harder than selecting the higher of the nodes. Since the cardinality of the index sets is bounded by k+2, the set comparison can be done in constant time. A similar test has been used to implement antecedent government. The freeze mechanism allows us to uniformly state the instruction for constructing the correct index sets on every node right after that node has been projected, although the actual property of being a barrier can only be established after the node has found its definitive place in the parse-tree. Antecedent government can be tested even before all global properties of the tree are known. The following piece of code implements antecedent government (apart from co-indexing). It demonstrates the elegance of our modular approach: [code listing not preserved]</Paragraph>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Conclusion
</SectionTitle>
    <Paragraph position="0"> A modular implementation of a government-binding parser for a considerable fragment of German has been outlined. A new concept of licensing, the use of indexing techniques, and the pseudo-parallel interleaving of a parsing strategy with a faithful, direct, and declarative representation of GB-theory have led to a prototypical, tool-box-like system for the development of GB-based grammars. The system has been fully implemented in Quintus-Prolog. It is hoped that principle-based approaches to parsing will help to elucidate the human language faculty, as well as provide a novel focus for the approaches of both theoretical and computational linguists.</Paragraph>
  </Section>
</Paper>