File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/ackno/86/c86-1049_ackno.xml
Size: 10,380 bytes
Last Modified: 2025-10-06 13:51:29
<?xml version="1.0" standalone="yes"?> <Paper uid="C86-1049"> <Title>CATEGORIAL GRAMMARS FOR STRATA OF NON-CF LANGUAGES AND THEIR PARSERS</Title> <Section position="2" start_page="0" end_page="209" type="ackno"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> We introduce a generalization of categorial grammar extending its descriptive power, and a simple model of a categorial grammar parser. Both tools can be adjusted to particular strata of languages by restricting grammatical or computational complexity.</Paragraph> <Paragraph position="1"> I. Two questions about categorial grammars. In spite of the fascinating formal simplicity and lucidity of categorial grammar as developed by Bar-Hillel [1], Lambek [7] and followers, it has never been brought into wide-scale use. Why is this so? We may easily recognize two drawbacks. 1/ Restricted scope of categorial grammars. It was shown early [1] that the set of languages describable by these grammars is exactly that of context-free languages. Is this restriction inevitable, or can a similar type of language description be retained beyond the limit of context-free languages? This is the first question we try to answer.</Paragraph> <Paragraph position="2"> 2/ No realistic model of categorial grammar parsing.</Paragraph> <Paragraph position="3"> The schematic description of categorial analysis of a given sentence a_1 ... a_n is sketched in Fig. 1. [Fig. 1: assign a category c_i to each sentence member a_i, turning a_1 a_2 ... a_n into c_1 c_2 ... c_n; then cancel the string of categories to the target category t.] This abstract scheme cannot serve as a description of a realistic parsing procedure. The suitable assignment appearing here as the first phase is in fact the goal of the parsing. 
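As a minimal sketch of the two-phase scheme of Fig. 1, the following Python fragment first enumerates assignments and then tries to cancel each one. It assumes classical bidirectional categories with the usual cancellation rules (X/Y followed by Y gives X; Y followed by Y\X gives X); the lexicon, the tuple encoding of categories and the names LEXICON, cancels_to and brute_force_parse are hypothetical and serve only as illustration.

```python
from itertools import product

# A category is an atom (str) or a triple (result, slash, argument):
# (X, "/", Y) seeks Y on its right; (X, "\\", Y) seeks Y on its left.
# This lexicon is a hypothetical toy example, not taken from the paper.
LEXICON = {
    "John": ["n"],
    "Mary": ["n"],
    "sees": [("s", "\\", "n"), (("s", "\\", "n"), "/", "n")],
}


def cancels_to(cats, target):
    """Return True if adjacent cancellations can reduce cats to [target]."""
    if len(cats) == 1:
        return cats[0] == target
    for i in range(len(cats) - 1):
        left, right = cats[i], cats[i + 1]
        results = []
        # Forward cancellation: X/Y followed by Y gives X.
        if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
            results.append(left[0])
        # Backward cancellation: Y followed by Y\X gives X.
        if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
            results.append(right[0])
        for reduced in results:
            if cancels_to(cats[:i] + [reduced] + cats[i + 2:], target):
                return True
    return False


def brute_force_parse(words, target="s"):
    """Phase 1: enumerate every assignment. Phase 2: try to cancel it."""
    tried = 0
    for assignment in product(*(LEXICON[w] for w in words)):
        tried += 1
        if cancels_to(list(assignment), target):
            return list(assignment), tried
    return None, tried
```

In this sketch the number of candidate assignments is the product of the lexicon sizes of the individual words, so it grows quickly once several words carry more than one category.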
The &quot;brute force&quot; approach following the above scheme, which checks all possible assignments and tries to cancel them, is not computationally tractable, since for most grammars the number of all possible assignments grows exponentially with the length of the analysed sentence.</Paragraph> <Paragraph position="4"> The moral of this observation is that the assignment cannot be separated from the cancellation. Just as parsers based on phrase-structure grammars have to make, at each point in time, an intelligent choice of the rule to apply next, the categorial parser must make an intelligent choice out of a list of alternative categories. This necessity to look ahead at cancellation when making the assignment leads to the conclusion [6] that assignment and cancellation must be interwoven in any actual parser. Therefore our second key question reads: Can this interweaving be grasped by a simple formal model, or does it unavoidably lead to a mess of complicated ad hoc and heuristic techniques? II. Proposed solution. We introduce in nontechnical language the essence of the proposed generalization of categorial grammars and their parsers. The exact mathematical formulations can be found in [3].</Paragraph> <Paragraph position="5"> Grammars. The principal difference between the &quot;classical&quot; categorial grammar and the generalized categorial grammar (GCG) is that instead of finite sets of categories corresponding to terminal symbols, GCG allows for infinite sets of categories.</Paragraph> <Paragraph position="6"> Each such infinite set, however, can be generated by a simple procedure, in fact a procedure based on a finite state generator. Automata. We offer the list automaton (LA) as a mathematical model of categorial grammar parsing. The list automaton is schematically represented in Fig. 2.</Paragraph> <Paragraph position="8"> An LA consists of a nondeterministic finite state control unit attached to a finite tape. 
At the beginning of the computation the tape contains the analysed string. The automaton can read and rewrite scanned symbols and move the scanning head one tape cell to the left or right, analogously to a Turing machine. In addition, it can delete the scanned cell, i.e. cut it out and paste the remaining tape parts together.</Paragraph> <Paragraph position="9"> In the remainder of the paragraph we list results indicating, as we believe, that the concepts of GCG and LA give satisfactory answers to the above questions. a/ [...] and mutual correspondence. Both GCGs and LA represent exactly all context-sensitive languages. As in the case of CF-grammars and pushdown automata, or context-sensitive grammars and linearly bounded automata [5], there exist transformations of GCGs to LA and vice versa: an algorithm A_1, which for each GCG G yields an LA A_1(G) representing the same language, and conversely an algorithm A_2, which for each LA M yields an equivalent GCG A_2(M). The next step in our argument is to point out a remarkable feature of the interplay between GCGs and LA. b/ Stratification. The correspondence between GCGs and LA can be observed not only in the whole class of context-sensitive languages, but also on the level of CF-languages and in each of infinitely many strata between CF- and CS-languages. The stratification can be defined via two complexity measures.</Paragraph> <Paragraph position="10"> Grammatical complexity: given a GCG G and a string w, the grammatical complexity of w wrt. G, denoted G(w), is defined as the length of the longest category used in the analysis wrt. G.</Paragraph> <Paragraph position="11"> (For ambiguous grammars, the complexity is defined for each parse of the string.)</Paragraph> <Paragraph position="12"> Computational complexity: given an LA M and a string w, the computational complexity of w wrt. 
M, denoted M(w), is defined as the maximal number of visits paid to a single square during the accepting computation (ambiguity being treated as before).</Paragraph> <Paragraph position="13"> In the light of these complexity measures we can reconsider the relation between GCGs and LA determined by the above-mentioned algorithms A_1 and A_2. For any GCG G and any sentence w, each grammatical description of w wrt.</Paragraph> <Paragraph position="14"> G is reflected as a computation of A_1(G) accepting w. The grammatical complexity of the description is approximately the same as the computational complexity of the corresponding computation. An analogous result holds for A_2. Now, any function f mapping natural numbers to natural numbers determines a stratum S(f) of languages: a language L belongs to the stratum S(f) if and only if it can be represented by a GCG G (or equivalently an LA M) such that for each sentence w from L of length n, the complexity G(w) (or M(w)) does not exceed the number f(n). Our previous considerations show that the algorithms A_1, A_2 respect the stratification. Hence the introduced tools can be adjusted to the investigated languages. Two examples: 1/ The grammars in the stratum S(const) (determined by constant functions) are exactly Bar-Hillel categorial grammars. &quot;Finite visit&quot; LA appear as their parsers. 2/ The languages in the strata S(f), where f is any function of order smaller than the function log(log n), belong to &quot;almost context-free languages&quot; (cf. [~]) sharing crucial properties of CF-languages. c/ Assignment and cancellation interwoven. To show that list automata, besides their simplicity, also meet the above formulated requirement for natural parsers of categorial grammar, we have to examine, at least informally, in 
more detail the relationship between a GCG G and its parser A_1(G).</Paragraph> <Paragraph position="15"> When the automaton A_1(G) analyses a string a_1 ... a_n, then during the m-th visit to a square originally containing a symbol a_i, the automaton fixes the m-th symbol in the category belonging to a_i. Thus after m visits, m symbols of the category are determined. Therefore, from the (infinite) set of categories assignable to a_i, only those which agree with the determined symbols remain in play. To determine the next symbol of a category, the automaton can check the environment of the square and take into account possible cancellations. At the moment when all symbols in a category are fixed, the corresponding square is deleted. In other words, a computation of A_1(G) on a string a_1 ... a_n evolves dynamically a suitable assignment c_1 ... c_n of categories. The information used by the parser consists of: - the generating mechanism of categories corresponding to particular symbols, - indications of possible cancelling with neighbour categories. The computation is completed at the moment when the assignment is found.</Paragraph> <Paragraph position="16"> III. [...]tions. 1/ In this brief note we tried to grasp what features of the exact mathematical models described in [3] we consider to be fundamental. We can imagine alternative models differing in technical details but having the same features. Which of the models should be chosen as &quot;canonical&quot; will require more extensive studies. 2/ Our considerations deal with nondeterministic LA, i.e. in fact with &quot;methods&quot; of parsing. The step from &quot;methods&quot; to &quot;algorithms&quot; leads from nondeterministic to deterministic LA. 
Even a glimpse of the basic stratum S(const) promises interesting results. An observation of T. Hibbard [4] shows that deterministic &quot;finite visit&quot; LA represent a class of languages broader than the class of deterministic context-free languages. It implies that deterministic categorial grammar (in the classical sense) parsing will go beyond the limits of e.g.</Paragraph> <Paragraph position="17"> LR-parsing based on CF-grammars.</Paragraph> </Section></Paper>