PRINCIPLE-BASED PARSING WITHOUT OVERGENERATION 1

1. Introduction

Unlike rule-based grammars that use a large number of rules to describe patterns in a language, Government-Binding (GB) Theory (Chomsky, 1981; Haegeman, 1991; van Riemsdijk and Williams, 1986) explains these patterns in terms of more fundamental and universal principles.

A key issue in building a principle-based parser is how to procedurally interpret the principles. Since GB principles are constraints over syntactic structures, one way to implement the principles is to

1. generate candidate structures of the sentence that satisfy X-bar theory and the subcategorization frames of the words in the sentence;
2. filter out structures that violate any one of the principles;
3. accept the remaining structures as parse trees of the sentence.

This implementation of GB theory is very inefficient, since a large number of structures are generated and then filtered out. The problem of producing too many illicit structures is called overgeneration and has been recognized as the culprit of computational difficulties in principle-based parsing (Berwick, 1991). Many methods have been proposed to alleviate the overgeneration problem by detecting illicit structures as early as possible, such as optimal ordering of principles (Fong, 1991) and coroutining (Dorr, 1991; Johnson, 1991).

1 The author wishes to thank the anonymous referees for their helpful comments and suggestions. This research was supported by Natural Sciences and Engineering Research Council of Canada grant OGP121338.

This paper presents a principle-based parser that avoids the overgeneration problem by applying principles to descriptions of the structures, instead of the structures themselves. A structure for the input sentence is constructed only after its description has been found to satisfy all the principles. The structure can then be retrieved in time linear in its size and is guaranteed to be consistent with the principles.

Since the descriptions of structures are constant-sized attribute vectors, checking whether a structural description satisfies a principle takes a constant amount of time. This compares favorably with approaches where constraint satisfaction involves tree traversal.

The next section presents a general framework for parsing by message passing. Section 3 shows how linguistic notions, such as dominance and government, can be translated into relationships between descriptions of structures. Section 4 describes the interpretation of GB principles. Familiarity with GB theory is assumed in the presentation. Section 5 sketches an object-oriented implementation of the parser. Section 6 discusses complexity issues and related work.

2. Parsing by Message Passing

The message passing algorithm presented here is an extension of a message passing algorithm for context-free grammars (Lin and Goebel, 1993).

We encode the grammar, as well as the parser, in a network (Figure 1). The nodes in the network represent syntactic categories.
The links in the network represent dominance and subsumption relationships between the categories:

* There is a dominance link from node A to node B if B can be immediately dominated by A. The dominance links can be further classified according to the type of dominance relationship.
* There is a specialization link from A to B if A subsumes B.

The network is also a parser. The nodes in the network are computing agents. They communicate with each other by passing messages in the reverse direction of the links in the network.

The messages contain items. An item is a triplet that describes a structure:

  <surface-string, attribute-values, sources>

where

surface-string is an integer interval [i, j] denoting the i'th to j'th words of the input sentence.

attribute-values specify syntactic features, such as cat, plu, and case, of the root node of the structure described by the item.

sources is the set of items that describe the immediate sub-structures. Therefore, by tracing the sources of an item, a complete structure can be retrieved.

The location of the item in the network determines the syntactic category of the structure. For example, [NP the ice-cream] in the sentence "the ice-cream was eaten" is represented by an item i4 at the NP node (see Figure 2):

  <[0,1], ((cat n) -plu (nform norm) -cm +theta), {i1, i3}>

An item represents the root node of a structure and contains enough information that the internal nodes of the structure are irrelevant.

The message passing process is initiated by sending initial items externally to lexical nodes (e.g., N, P, ...). The initial items represent the words in the sentence. The attribute values of these items are obtained from the lexicon. In case of lexical ambiguity, each possibility is represented by an item. For example, suppose the input sentence is "I saw a man"; then the word "saw" is represented by the following two items, sent to nodes N and V:NP 2 respectively:

  <[1,1], ((cat n) -plu (nform norm)), {}>
  <[1,1], ((cat v) (cform fin) -pas (tense past)), {}>

When a node receives an item, it attempts to combine the item with items from other nodes to form new items. Two items <[i1,j1], A1, S1> and <[i2,j2], A2, S2> can be combined if

1. their surface strings are adjacent to each other: i2 = j1+1;
2. their attribute values A1 and A2 are unifiable;
3. their sources are disjoint: S1 ∩ S2 = ∅.

The result of the combination is a new item:

  <[i1,j2], unify(A1, A2), S1 ∪ S2>

The new items represent larger parse trees resulting from combining smaller ones. They are then propagated further to other nodes.

The principles in GB theory are implemented as a set of constraints that must be satisfied during the propagation and combination of items. The constraints are attached to nodes and links in the network; different nodes and links may have different constraints. The items received or created by a node must satisfy the constraints at that node. The constraints attached to the links serve as filters: a link only allows items that satisfy its constraints to pass through.
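To make the item representation, the combination conditions, and the constraint filtering concrete, the following is a minimal sketch in Python. It is an illustration under simplifying assumptions, not the paper's implementation: the names (Item, unify, combine, passes_link) are invented, attribute values are treated as atomic symbols (so a signed feature such as -plu appears here as a "plu" attribute with value "-"), and real attribute values in the parser can be more structured (e.g. np-atts is itself an attribute vector).

```python
# A minimal sketch of items and item combination (illustrative, not the
# paper's implementation). Attributes are flat maps from names to atomic
# values, e.g. {"cat": "n", "plu": "-", "nform": "norm"}.

from dataclasses import dataclass
from typing import Optional

Attrs = dict  # attribute name -> atomic value


@dataclass(frozen=True)
class Item:
    start: int          # i: index of the first word covered
    end: int            # j: index of the last word covered
    attrs: tuple        # attribute-values of the root node, as sorted pairs
    sources: frozenset  # items describing the immediate sub-structures

    @staticmethod
    def make(start, end, attrs, sources=()):
        return Item(start, end, tuple(sorted(attrs.items())), frozenset(sources))


def unify(a1: Attrs, a2: Attrs) -> Optional[Attrs]:
    """Unify two flat attribute vectors; return None if any attribute clashes."""
    result = dict(a1)
    for key, value in a2.items():
        if key in result and result[key] != value:
            return None  # conflicting values for the same attribute
        result[key] = value
    return result


def combine(x: Item, y: Item) -> Optional[Item]:
    """Combine a left item x and a right item y if the three conditions hold."""
    if y.start != x.end + 1:                     # 1. adjacency: i2 = j1 + 1
        return None
    merged = unify(dict(x.attrs), dict(y.attrs))
    if merged is None:                           # 2. attribute values unify
        return None
    if x.sources & y.sources:                    # 3. sources are disjoint
        return None
    return Item.make(x.start, y.end, merged, x.sources | y.sources)


def passes_link(item: Item, constraint: Attrs) -> bool:
    """A link constraint is itself an attribute vector: an item may pass
    through the link only if its attributes are unifiable with it."""
    return unify(dict(item.attrs), constraint) is not None
```

Under this sketch, an item for "the" covering word 0 and an Nbar item covering word 1 combine into an item spanning [0,1] whose sources contain both, mirroring how i1 and i3 combine into i4 above; and an item whose attributes contain (case nom) fails passes_link against a (case acc) constraint, which is the filtering behaviour illustrated next.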
For example, the link from V:NP to NP in Figure 1 has a constraint that any item passing through it must be unifiable with (case acc). Thus items representing NPs with nominative case, such as "he", will not be able to pass through the link.

By default, the attributes of an item percolate with the item as it is sent across a link. However, the links in the network may block the percolation of certain attributes.

The sentence is successfully parsed if an item is found at the IP or CP node whose surface string is the input sentence. A parse tree of the sentence can be retrieved by tracing the sources of the item.

An example

The message passing process for analyzing the sentence "the ice-cream was eaten" is illustrated in Figure 2.a. In order not to convolute the figure, we have shown only the items that are involved in the parse tree of the sentence and their propagation paths.

2 V:NP denotes verbs taking an NP complement. Similarly, V:CP denotes verbs taking a CP complement, and N:CP represents nouns taking a CP complement.

The parsing process is described as follows:

1. The item i1 is created by looking up the lexicon for the word "the" and is sent to the node Det, which sends a copy of i1 to NP.

2. The item i2 is sent to N, which propagates it to Nbar. The attribute values of i2 are percolated to i3. The source component of i3 is {i2}. Item i3 is then sent to the NP node.

3. When NP receives i3 from Nbar, i3 is combined with i1 from Det to form a new item i4. One of the constraints at the NP node is: if (nform norm) then -cm, which means that normal NPs need to be case-marked. Therefore, i4 acquires -cm. Item i4 is then sent to nodes that have links to NP.

5. The word "eaten" may be either a past participle or the passive voice of "eat". The second possibility is represented by the item i7. The word belongs to the subcategory V:NP, which takes an NP as the complement. Therefore, the item i7 is sent to node V:NP.

6. Since i7 has the attribute +pas (passive voice), an np-movement is generated at V:NP. The movement is represented by the attributes nppg, npbarrier, and np-atts. The first two attributes are used to make sure that the movement is consistent with GB principles. The value of np-atts is an attribute vector, which must be unifiable with the antecedent of this np-movement; in the figure, a shorthand stands for (cat n) (nform norm).

7. When Ibar receives i10, which is propagated to VP from V:NP, the item is combined with i6 from I to form i11.

8. When IP receives i11, it is combined with i4 from NP to form i12. Since i11 contains an np-movement whose np-atts attribute is unifiable with i4, i4 is identified as the antecedent of the np-movement. The np-movement attributes in i12 are cleared. The sources of i12 are i4 from NP and i11 from Ibar. Therefore, the top level of the parse tree consists of an NP node and an Ibar node dominated by the IP node. The complete parse tree (Figure 2.b) is obtained by recursively tracing the origins of i4 and i11 from NP and Ibar respectively, as sketched below.
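The retrieval in step 8 can be pictured with a short sketch that continues the Item code above; the category_of table and the function name are assumptions introduced for illustration, not part of the paper. Because the sources of each item point directly at the items for its immediate sub-structures, every item is visited once, so the tree is recovered in time linear in its size, as claimed in the introduction.

```python
# Sketch of parse-tree retrieval by tracing item sources (illustrative only).
# Assumes a table `category_of` mapping each item to the network node
# (category) at which it was found, and the input words in `words`.

def retrieve_tree(item, category_of, words):
    """Return a nested (category, children) structure rooted at `item`."""
    if not item.sources:
        # A lexical item: its surface string is a single word.
        return (category_of[item], words[item.start])
    # Order the sub-structures left to right by their surface strings.
    children = sorted(item.sources, key=lambda s: s.start)
    return (category_of[item],
            [retrieve_tree(child, category_of, words) for child in children])
```

Tracing i12 this way visits i4 and i11, and then their sources in turn, yielding the tree of Figure 2.b.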
The trace after "eaten" is indicated by the np-movement attributes of i7, even though the tree does not include a node representing the trace.
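Finally, the conditional constraint at the NP node used in step 3, if (nform norm) then -cm, can be pictured with a few lines that reuse the unify helper from the sketch above. The function name and the encoding of signed features are assumptions for illustration; the paper does not specify how node constraints are written down.

```python
# An assumed encoding (not the paper's) of the NP-node constraint from step 3:
# "if (nform norm) then -cm", i.e. a normal NP still needs case, so its item
# acquires -cm when it reaches the NP node.

def np_node_constraint(attrs):
    """Apply the conditional constraint to an item's attribute vector.

    Returns the (possibly extended) attributes, or None if they clash with
    the constraint."""
    if attrs.get("nform") == "norm":
        return unify(attrs, {"cm": "-"})  # adds -cm, or fails on a clash
    return attrs


# For i4's attributes ((cat n) -plu (nform norm)), the constraint adds -cm:
attrs_i4 = {"cat": "n", "plu": "-", "nform": "norm"}
print(np_node_constraint(attrs_i4))
# {'cat': 'n', 'plu': '-', 'nform': 'norm', 'cm': '-'}
```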