File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/88/c88-1024_metho.xml
Size: 23,000 bytes
Last Modified: 2025-10-06 14:12:07
<?xml version="1.0" standalone="yes"?> <Paper uid="C88-1024"> <Title>A NEW DESIGN OF PROLOG-BASED BOTTOM-UP PARSING SYSTEM WITH GOVERNMENT.BINDING THEORY</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> A NEW DESIGN OF PROLOG-BASED BOTTOM-UP PARSING SYSTEM WITH GOVERNMENT.BINDING THEORY </SectionTitle> <Paragraph position="0"/> </Section> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> abstract </SectionTitle> <Paragraph position="0"> This paper addresses the problems of movement transformation in Prolog-based bottom-up parsing system. Three principles of Government-Binding theory are employed to deal with these problems. They are Empty Category Principle, C-command Principle, and Subjacency Principle. A formalism based upon them is proposed. Translation algorithms are given to add these linguistic principles to the general grammar rules, the leftward movement grammar rules, and the rightward movement grammar rules respectively. This approach has the following specific features: the uniform treatments of leftward and rightward movements, the arbitrary number of movement non-terminals in the rule body, and automatic detection of grammar errors before parsirlg. An example in Chinese demonstrates all the concepts.</Paragraph> </Section> <Section position="3" start_page="0" end_page="112" type="metho"> <SectionTitle> 1. Introduction </SectionTitle> <Paragraph position="0"> The movement transformation is one of the major problems encountered in natural language processing. It has relation to the empty constituents (traces) that exist at various levels of representation in natural language statements.</Paragraph> <Paragraph position="1"> Consider the following example in Chinese: ~\]\[~, ~ ~ N T o (That book, I read.) The word &quot;N&quot; (read) is a transitive verb, which should take a direct object. However, the object &quot;JJl~&quot; (that book) is topicalized to the first position of the sentence. For the treatment of this phenomenon, we cannot just write down the rules: sentence --> noun-phrase,verb-phrase.</Paragraph> <Paragraph position="2"> verb-phrase --> transitive-verb,noun-phrase.</Paragraph> <Paragraph position="3"> verb-phrase --> transitive-verb.</Paragraph> <Paragraph position="4"> verb-phrase --> intransitive-verb.</Paragraph> <Paragraph position="5"> This is because many ungrammatical sentences will be accepted. Thus, we must provide some mechanisms in the grammars in order to capture them. It is still a hard work to do. Several difficulties are listed as follows: (1) The determination of movement is difficult. That is, an element may be in a topicalization position, but it is not moved from some other place in the sentence. For example, (Fruit, I like.) zk~, ~ ~gt ~o (As for fruit, I like banana.) the first can be considered to be a movement phenomenon, but the second cannot.</Paragraph> <Paragraph position="6"> (2) The empty constituent may exist at many possible positiofis. For example, given an n-word sentence such as Cl Wl e2 w2 e 3 ... e(n_l) W(n-1) e(n) W(n) e(n+l) where w i is the i-th word and e i is an empty constituent, there am (n+l) possible positions from which the moved constituent may originate. That is, for a moved constituent (if there is any), there are so many possible empty constituents to co-index. (3) Since the gap in between the moved constituent and its corresponding trace is arbitrary, it is implausible to list all the possible movements exhaustively, and specify each movement constraint explicitly in the grammars.</Paragraph> <Paragraph position="7"> The Government-Binding (GB) theory \[1\] provides universal principles to explain the movements. Some of them are shown as follows: (1) Empty Category Principle \[7\] A trace must be properly governed.</Paragraph> <Paragraph position="8"> (2) C-command Principle \[7\]a c-commands B iff every branching node dominating a dominates/3.</Paragraph> <Paragraph position="9"> (3) Subjacency Principle \[7\] - null Any application of move - a may not cross more than one bounding node.</Paragraph> <Paragraph position="10"> Summing up the above principles, we have to find a movezl constituent to c-command a trace. The constituent can neither relate to a u'ace out of its c-command domain, nor match a trace when more than one bounding node is crossed. Such principles nan'ow down the searching space to some extent. For example, (El) ~\]l~ tN J~ ~ ti~t'~ ~- ~ To (The student that the man saw t i came.) (E2)*t m~-N t n~N~ ~T ~9~\[~ff~J~o (* The man that the student who t m saw t n came. ) There is a trace t i in the example (El). Two NPs, i.e. &quot;N~&quot; (the student) and &quot;~\]g~lA. &quot; (the man), may co-index with it, but only the former is acceptable. The reason is specified as below: (ZI') In,, \[s\]~. (the man)~ (saw) q }~J (de)~-~i (the student) 1~ (cane)? (asp) L.~o~ (\[~i' ') ~,~the man)\[ n, ,\[st}~ (saw)tl ~ (de}~-i (the student)\]~ (came)\]' (asp) x ~J L.- o.-I The (El&quot;) interpretation violates the subjacency principle (assuming that s and n&quot; are two bounding nodes). Two traces exist in the example (E2). The traces t m and t n may co-index with the two NPs &quot;~J~=&quot; (the student) and &quot;~\]I~{NA.&quot; (the man), and vice versa. However, both are wrong because of the snbjacency principle: i-7- deg--a (E2') .\[s In' latm~ (saw)tn\]~J (de}~ nlthe student)15 (came)~ (asp)\]~ (de\]~\[~)~. m (the man)</Paragraph> <Paragraph position="12"/> </Section> <Section position="4" start_page="112" end_page="112" type="metho"> <SectionTitle> 2. A Government-Bindlng based Logic Grammar </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="112" end_page="112" type="sub_section"> <SectionTitle> Formalism 2.1 The specifications of grammar formalism The Government-Binding based Logic Grammar </SectionTitle> <Paragraph position="0"> (GBLG) formalism is specified informally as follows: (I) the general grammar rules (a) c(Arg) --> Cl(Argl),C2(Arg2),...,cn(Argn).</Paragraph> <Paragraph position="1"> where c(Arg) is a phrasal non-telrninal, and may be also a bounding non-terminal, c.(Arg.) (l<j < n) is, a lexical terminal or J J - _ . a phrasal non-terrmnal.</Paragraph> <Paragraph position="3"> where the definitions of c(Arg) and cj(Argj) (l<j < n) are the same as above, trace('rraceArg) is a virtual non-terminal.</Paragraph> <Paragraph position="4"> The special case i=O is common. For example, a noun phrase is topicalized from a subject position. It is represented as s -- > trace,np.</Paragraph> <Paragraph position="5"> (2) the leftward movement grammar rules c(Arg) --> c 1(Arg I ),c2(Arg2),-&quot;,ei(Argi), m(Argm)<<<traee(TraeeArg), c(i+ 1 )(Arg(i+ 1 )),...,cn(Argn).</Paragraph> <Paragraph position="6"> where the.definitions of c(Arg) and cj(Argj) (l<j < n) arc the same as l(a), m(Argm)<<<trace(TraceArg) is a movement non-terminal.</Paragraph> <Paragraph position="7"> When i=0, the movement non-terminal is the first element in ~he rule body.</Paragraph> <Paragraph position="8"> (3) t he rightward movement grammar rules -</Paragraph> <Paragraph position="10"> Except that the operator '>>>' is used, the other definitions are the same as those in the leftward movement rules. It is apparent because of the uniform treatments of the leftward and the rightward movements.</Paragraph> </Section> <Section position="2" start_page="112" end_page="112" type="sub_section"> <SectionTitle> 2.2 A sample grammar </SectionTitle> <Paragraph position="0"> A sample grammar GBLG1 for Chinese shown below introduces lhe uses of the formalism:</Paragraph> <Paragraph position="2"> Among tlu;se grammar rules, (rl) deals with the leftward movement (topiealization), (r9) treats the rightward movemen ~ (relafivization), and the others am normal grammar rules. The heads of the grammar rules (r3), (r4), (r5), (r7), and (r8) are bounding nodes. The virtual non-terminals traceT(Trace) and traceR(Trace) appear in the rules (r5), (r15), and (1&quot;16).</Paragraph> </Section> <Section position="3" start_page="112" end_page="112" type="sub_section"> <SectionTitle> 2.3 Tramtsitive .relation of c.command theory </SectionTitle> <Paragraph position="0"> For a phrasal non-terminal X, a virtual non-terminal Y and a transitive relation TR, X TRY if (1) X is the rule head of a grammar rule, and Y is an element in its rule body, or (2) X is the rule head of a grammar rule, a phrasal non-terminal I in its rule body, and I TR Y, or (3) there exists a sequence of phrasal non-terminals I 1, 12 ..... I n, such that X TR I I TR 12 TR ... TR I n. The transitive relation TR is also a dominate relation. The c-command theory is embedded implicitly in the GBLGs if ~very grammar rules satisfy the following property: for a rule X 0 --> X1,X2,...,X m where X i is a terminal or a non-terminal, la i ~ m, if Xi=(A<<<B) then there must exist some Xj (i< j < m); such that Xj dominates the virtual non-terminal B in other grammar rule. That is, Xj TR B. The phrasal non-terminal X 0 is the first branching node that dominates A and Xj, and thus also dominates B. Therefore, A c-commands B. Xi=(B>>>A) has the similar behavior. Rules (rl) and (r9) in grammar GBLGI show these <<< and >>> relations respectively.</Paragraph> </Section> <Section position="4" start_page="112" end_page="112" type="sub_section"> <SectionTitle> 2.4 Comparison with other logic programming approaches </SectionTitle> <Paragraph position="0"> Compared with other logic programming approaches, especially the RLGs \[8,9\], the GBLGs have the following features: (1) the uniform treatments of leftward movement and the rightward movement The direction of movement is expressed in terms of movement operators <<< or >>>. The interpretation of movement non-terminals A <<< B or B >>> A is If A is a left moved constituent (or a fight moved constituent), then the corresponding trace denoted by B should be found after (or before) A <<< B (or B >>> A). It is illustrated in the Fig. 1. The two trees are symmetric and the corresponding rules are similar. However, the rules are not similar in RLGs. That is, the two types of movements are not treated in the same way. For the rightward movement, a concept of adjunct node is introduced. It says that the right moved constituent is found if the rule hung on the adjunct node is satisfied. The operation semantics is enforced on the writing of the logic grammars. It destroys the declarative semantics of logic grammars to some extent.</Paragraph> <Paragraph position="1"> (2) the arbitrary number of movement non-terminals in the rule body - In our logic grammars, the number of movement non-terminals in a rule is not restrictive if the rule satisfies the property specified in the last section. The RLGs allow at most one movement non-terminal in their rules. The position of movement non-terminal is declared in the rule head. It is difficult for a translator to tell out the position if different types ~he moved i onst it uont~/~~ Fio. I Symmetric tree for leftward and riohtward movement of elements are interleaved in the rule body. Thus, our formalism is more clear and flexible than RLGs'.</Paragraph> <Paragraph position="2"> (3) automatic detection of grammar errors before parsing For significant grammar rules, a transitive relation TR must be satisfied. The violation of the transitive relation can be found beforehand during rule translation. Thus, this feature can help grammar writers detect the grammar errors before parsing.</Paragraph> </Section> </Section> <Section position="5" start_page="112" end_page="115" type="metho"> <SectionTitle> 3. A Bottom-up Parser in Prolog </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="112" end_page="112" type="sub_section"> <SectionTitle> 3.1 Problem specifications </SectionTitle> <Paragraph position="0"> The Bottom-Up Parsing system (BUP) \[2,3,4\] uses the left-coruer bottom-up algorithm to implement Definite Clause Grammars (DCGs) \[5\]. It overcomes the problems of top-down parsing, e.g. the left-recursive invocation, and provides an efficient way as Earley's and Pratt's algorithms \[3\]. However, it does not deal with the important syntactic problem - movement transformation. Extraposition Grarmnars (XGs) \[6\] propose extraposition lists (x-lists) to attack the movement problem, but when to extract traces from x-lists becomes a new obstacle \[8,9\]. Restricted Logic Grammars (RLGs) \[8,9\] based upon GB try to tackle the unrestricted extraction from the x-list. They emphasize the importance of the c-command and the subjacency principles during parsing.</Paragraph> <Paragraph position="1"> The extraction must obey these two principles. The parsing strategies of XGs and RLGs are all depth-first and left-to-right, thus they have the same drawbacks as DCGs do \[4\]. If the parsing strategy is left-coruer bottom-up, the following issues have to be considered in the translation of GBLGs: (1) the empty constituent problem - null The first element in the rule body, which acts as a left-corner, should not be empty in left-corner bottom-up algorithm. However, the type 1 (b) of rules is common.</Paragraph> <Paragraph position="2"> (2) the transfer of trace information From Fig. 1, we know that the positions of empty constituents are usually lower than those of moved constituents. Because the parsing style is bottom-up, the trace information must be transferred up from the bottom. The conventional different list cannot be applied here. Fig. 2 and Fig. 3 illustrates the differences of data flow between top-down parsing and bottom-up parsing.</Paragraph> <Paragraph position="4"> FiG. 2 the data flow in tile top-down parsing</Paragraph> <Paragraph position="6"> Fig. 3 tho data flow in the bottom~uo parslnQ</Paragraph> </Section> <Section position="2" start_page="112" end_page="114" type="sub_section"> <SectionTitle> 3.2 Data structure </SectionTitle> <Paragraph position="0"> The transfer of trace information is through a list called extraposition list (x-list) and denoted by a symbol H. The transformation of x-list is bottom-up. Fig. 3 sketches the concept. A special data structure shown below is proposed to carry the information: \[In sequence of trace information/X\],X\] Note a mii variable X is introduced. Based upon this notation, an empty list is represented as \[Z,Z\]. An algorithm that merges arbitrary number of lists in linear time is designed:</Paragraph> <Paragraph position="2"> In the conventional list structure such as \[a sequence of trace information\] even though the difference list concept is adopted, the computation time is still in proportion to ml+m2+...+mn, where m i (1< i < n) is the number of elements in the i-th list. Although our merge algorithm is the fastest, it is still a burden on the parsing. In most cases, the predicate merges empty lists. That is nonsense. To enhance the parsing speed, the merge predicate is added in which place it is needed. Observing the merge operation, we can find that it is needed only when the number of lists to be merged is greater than one. The following method can decrease the number of x-lists during rule translation, and thus delete most of the unnecessary merges: Partition the basic elements in the logic grammars into two mutually exclusive sets: carry-H set and non-earry-H set. The elements in the carry-H set may contribute trace information during parsing, and those in the non-carry-H set do not introduce trace information absolutely. The transitive relation TR defined in the section 2.3 tells us which phrasal non-terminals constitute the carry-H set.</Paragraph> </Section> <Section position="3" start_page="114" end_page="114" type="sub_section"> <SectionTitle> 3.3 The translation of grammar rules </SectionTitle> <Paragraph position="0"> The translation of basic elements in the GBLGs are similar to BUPs. Only one difference is that an extra argument that carries trace information may be added to phrasal non-terminal if it belongs to carry-H set. Appendix lists the translated results of the grammar GBLG 1.</Paragraph> <Paragraph position="1"> 3.3.1 The general grammar rules The general grammar rules are divided into two types according as a virtual non-terminal disappears or appears in the rule body: (a) c(Arg) --> Cl(Argl),C2(Arg2),...,cn(Argn)&quot; When c is not a bounding node, e.g. rule (r2), the translation is the same as that in BUP \[2,3,4\], except that an extra argument H (if necessary) for x-list and a built-in predicate merge are added in the new translation algorithm.</Paragraph> <Paragraph position="2"> This predicate is used to merge all the x-lists on the same level. The transformation of x-lists is bottom-up (only one direction) as shown in Fig. 3, Thus, the rule (a) is translated into</Paragraph> <Paragraph position="4"> c(G,Arg,H,Xn,X).</Paragraph> <Paragraph position="5"> When c is a bounding node, e.g. rule (r4), the information is used to check the x-list transferred up. Thus, an extra predicate bound is tagged to this type of rules:</Paragraph> <Paragraph position="7"> fail).</Paragraph> <Paragraph position="8"> A variable Bound which records the cross information is reserved for each element in the x-list. When a bounding node is crossed, this variable is checked tO avoid the illegal</Paragraph> <Paragraph position="10"> where i >= O. The rules (r5) and (r15) are two examples.</Paragraph> <Paragraph position="11"> If the left-coruer bottom-up parsing algorithm is used, the grammar rules should free of empty constituents. When i=0, the grammar rule considers a trace (an empty constituent) to be the first element in the rule body. It overrides the principle of the algorithm, but we can always select the first element c(i + 1) that satisfies the following criterion: (i) a lexieal terminal, or (2) a phrasal non-terminal, or (3) a phrasal non-terminal in a movement non-terminal, to be the left-comer and put the trace inforumtion into an x-list before this non-terminal. Thus, the translation is generalized as follows (assume that cl is a left-comer ).</Paragraph> <Paragraph position="13"> c(G,Arg,H,Xn,X).</Paragraph> <Paragraph position="14"> Here, the trace information is placed between Hi and H(i+l). Summing up, the virtual non-terminal is represented as a fixed fommt: x(trace(l'raceArg),Bound,Direction) and placed into x-list via merge operation. The position in x-list is reflected from the original rule.</Paragraph> <Paragraph position="15"> 3deg3.2 The leftward movement grammar rules The leftward movement grammar rnles can be generalized as below:</Paragraph> <Paragraph position="17"> cn(Argn).</Paragraph> <Paragraph position="18"> The rule (rl) is an example. Its translation is shown as follows:</Paragraph> <Paragraph position="20"> c(G,Arg,H,Xn,X).</Paragraph> <Paragraph position="21"> Comparing this translation with that of general grammar rules, we can find a new predicate cut_ .trace is added. The cut trace implements tile c-command principle, and its definition i~.</Paragraph> <Paragraph position="23"> cut_traceaux(Tracc,X,Y)).</Paragraph> <Paragraph position="24"> The cut trace tries to retract a trace from x-list if a movement exists. ~landarin Chinese has many specific features that oilier languages do not have. For example, topic-comment structure does not always involve movement transformation. The first cut traeeauz chmse matches the trace information with the x-list ~ransferred f,'om the bottom on its right part. The second cut traceatrc tells us that if the expected leftward trace cannot ma~h one of the elements in the x-list, then it will be drop out. The x-list is not changed and transferred up. The concept is demonstrated in Fig. 4. It also explains why we can detect grammar errors before parsing. In summary, each movement non-terminal is decomposed into a phrasal non-terminal and a vimml non-.tet~ninal. The phrasal non-terminal is translated the 'same as before. The vktual non-terminal is represented as x( tr ace(l YaceA r g ),Bound, left ) in this case, however, cut_trace is involved instead of merge.</Paragraph> <Paragraph position="25"> 3.3.3 The rightward movement grammar rules Because we treat the leftward and the rightward movement grammar rules in a uniform way, the translation algorithm of both are similar. The rightward movement gr~n~nar ruks are wifll the following format:</Paragraph> <Paragraph position="27"> if there is a traca the expected in thls range, left-moved the corresDondln Q constltuent moved elemsnt is on the uDDer level</Paragraph> <Paragraph position="29"> a trace should be found in this range if the oxpoctat I on succeeds Flg. 4 the sketch o~ the translation of the leftward production rules</Paragraph> <Paragraph position="31"> c(G,\[Arg\],I-t,Xn,X).</Paragraph> <Paragraph position="32"> The translation is very apparent for the symmetric property of the leftward and the rightward grammar rules illustrated in Fig. 4 and Fig. 5. A slight difference appears in the definition of the the right-moved if there Is a trace constituent in this range, the corraspondlng moved element is sn the upper level Fioo f. 5 the sketch of the translation the rightward production rules predicate cut trace. It shows an important linguistic phenomenon in-Mandarin Chinese: 'Relativization is always a movement transformation.' Thus, if we expect a trace and cannot find a corresponding one, failure is issued. The direction information in x(trace(TraceArg),Bound, right), i.e. fight, tells out the difference between the leftward and the rightward movements. \](n general, we allow both leftward movement and rightward movement to appear in the same rule. A new predicate intersection is introduced to couple these two translations.</Paragraph> </Section> <Section position="4" start_page="114" end_page="115" type="sub_section"> <SectionTitle> 3.4 Invocation of the parsing system </SectionTitle> <Paragraph position="0"> The parsing system is triggered in the following way: goal(a start non-terminal,\[a sequence of arguments\], an empty :~-list,\[a sequence of input string\],\[\]).</Paragraph> <Paragraph position="1"> In GBLG1, the invocation is shown as follows: goal(sl bar,\[ParseTree\],\[Z~I,\[input sentence\],\[\]), Par(Z). Because an empty x-list is represented as \[Z,Z\] (Z: a variable) in our special data structure shown in Section 3.2, var(Z) verifies its correctness. For example, to parse the Chinese sentence &quot;~Jl~ ~ A. ~t~ \[t.~ ~ 3K ?&quot; (the student that that man saw came), we trigger tile parser by calling:</Paragraph> <Paragraph position="3"/> </Section> </Section> class="xml-element"></Paper>