<?xml version="1.0" standalone="yes"?> <Paper uid="C69-4801"> <Title>CONTEXTUAL GRAMMARS</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> CONTEXTUAL GRAMMARS </SectionTitle> <Paragraph position="0"> In the following, we shall introduce a type of generative grammars, called contextual grammars. They are not comparable with regular grammars, but every language generated by a contextual grammar is a context-free language. Generalized contextual grammars are introduced, which may generate non-context-free languages.</Paragraph> <Paragraph position="1"> Let $V$ be a finite non-void set; $V$ is called a vocabulary. Every finite sequence of elements in $V$ is said to be a string on $V$. Given a string $x = a_1 a_2 \ldots a_n$, the number $n$ is called the length of $x$. The string of length zero is called the null-string and is denoted by $\omega$. Any set of strings on $V$ is called a language on $V$. The set of all strings on $V$ (the null-string included) is called the universal language on $V$. By $a^n$ we denote the string $a \ldots a$, where $a$ is iterated $n$ times.</Paragraph> <Paragraph position="2"> Any ordered pair $\langle u, v \rangle$ of strings on $V$ is said to be a context on $V$. The string $x$ is admitted by the context $\langle u, v \rangle$ with respect to the language $L$ if $uxv \in L$.</Paragraph> <Paragraph position="3"> Let $L_1$ be a finite set of strings on the vocabulary $V$ and let $\mathcal{C}$ be a finite set of contexts on $V$. The triple $\langle V, L_1, \mathcal{C} \rangle$ (1) is said to be a contextual grammar; $V$ is the vocabulary of the grammar, $L_1$ is the base of the grammar and $\mathcal{C}$ is the contextual component of the grammar.</Paragraph> <Paragraph position="4"> Let us denote by $G$ the contextual grammar defined by (1). Consider the smallest language $L$ on $V$ fulfilling the following two conditions: ($\alpha$) $L_1 \subseteq L$; ($\beta$) if $x \in L$ and $\langle u, v \rangle \in \mathcal{C}$, then $uxv \in L$.</Paragraph> <Paragraph position="5"> The language $L$ is said to be the language generated by the contextual grammar $G$. 
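The two conditions above can be sketched in Python. This is a minimal illustration, not part of the original paper: it assumes that strings are Python `str` values, that the null-string is `""`, and that generation stops at a length bound; the names `generate`, `base`, `contexts` and `max_len` are hypothetical. Because applying a context never shortens a string, every generated string within the bound is reached through strings that are also within the bound.

```python
from collections import deque


def generate(base, contexts, max_len):
    """Return all strings of length at most max_len generated by the
    contextual grammar with the given base and contexts.

    base     -- iterable of strings (the finite base of the grammar)
    contexts -- iterable of (u, v) pairs of strings
    max_len  -- hypothetical cut-off; the true language may be infinite
    """
    seen = {x for x in base if not len(x) > max_len}
    queue = deque(seen)
    while queue:
        x = queue.popleft()
        for u, v in contexts:
            y = u + x + v  # apply the context (u, v) to x
            # skip strings that are too long or already known
            if len(y) > max_len or y in seen:
                continue
            seen.add(y)
            queue.append(y)
    return seen


# Base {ab} with the single context (a, b) yields strings a^n b^n.
print(sorted(generate({"ab"}, {("a", "b")}, 6), key=len))
# prints ['ab', 'aabb', 'aaabbb']
```

With base `{""}` and the contexts `("", "a")` and `("", "b")`, the same function enumerates every string on two letters up to the bound, which corresponds to the grammar used for the universal language.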
This means that the language generated by $G$ is the intersection of all languages $L$ fulfilling the conditions ($\alpha$) and ($\beta$).</Paragraph> <Paragraph position="6"> A language $L$ is said to be a contextual language if there exists a contextual grammar $G$ which generates $L$. Proposition 1. Every finite language is a contextual language. Proof. Let $V$ be a vocabulary and let $L_1$ be a finite language on $V$. It is obvious that the contextual grammar $\langle V, L_1, \emptyset \rangle$, where $\emptyset$ denotes the void set of contexts, generates the language $L_1$. The same language may be generated by means of the contextual grammar $\langle V, L_1, \mathcal{C} \rangle$, where $\mathcal{C}$ is formed by the null context $\langle \omega, \omega \rangle$ only.</Paragraph> <Paragraph position="7"> Two contextual grammars are called equivalent if they generate the same language. The grammars $\langle V, L_1, \emptyset \rangle$ and $\langle V, L_1, \mathcal{C} \rangle$ are equivalent, since they both generate the language $L_1$. The converse of Proposition 1 is not true. Indeed, we have Proposition 2. The universal language is a contextual language. Proof. Let $V = \{a_1, a_2, \ldots, a_n\}$. Denote by $L$ the universal language on $V$. Let us put $L_1 = \{\omega\}$ and $\mathcal{C} = \{\langle \omega, a_1 \rangle, \langle \omega, a_2 \rangle, \ldots, \langle \omega, a_n \rangle\}$. It is easy to see that the grammar</Paragraph> <Paragraph position="8"> $\langle V, L_1, \mathcal{C} \rangle$ generates the universal language on $V$.</Paragraph> <Paragraph position="9"> Remarks. If we put, in the proof of Proposition 2, $L_1 = V$ instead of $L_1 = \{\omega\}$, then the grammar $\langle V, L_1, \mathcal{C} \rangle$ does not generate the universal language on $V$, since the language it generates does not contain the null-string.</Paragraph> <Paragraph position="10"> In order to illustrate the activity of the grammar $\langle V, L_1, \mathcal{C} \rangle$ defined in the proof of Proposition 2, let us consider the particular case when the vocabulary is formed by two elements only: $V = \{a, b\}$. The general form of a string $x$ on $V$ is $x = a^{i_1} b^{j_1} a^{i_2} b^{j_2} \ldots a^{i_p} b^{j_p}$, where $i_1, j_1, i_2, j_2, \ldots, i_p, j_p$ are arbitrary non-negative integers. In order to generate the string $x$, we start with the null-string $\omega$ and we apply $i_1$ times the context $\langle \omega, a \rangle$. 
The result of this operation is the string $a^{i_1}$, to which we apply $j_1$ times the context $\langle \omega, b \rangle$ and obtain the string $a^{i_1} b^{j_1}$. Now we apply $i_2$ times the context $\langle \omega, a \rangle$, then $j_2$ times the context $\langle \omega, b \rangle$, and we continue so alternately. When, after $2p-2$ steps, we have obtained the string $a^{i_1} b^{j_1} a^{i_2} b^{j_2} \ldots a^{i_{p-1}} b^{j_{p-1}}$, we apply $i_p$ times the context $\langle \omega, a \rangle$ and, to the string so obtained, $j_p$ times the context $\langle \omega, b \rangle$, in order to generate completely the string $x$. Haskell Curry considered the language $L = \{ab^n\}$ $(n = 1, 2, \ldots)$ as a model of the set of natural numbers [5]. We call $L$ the language of Curry.</Paragraph> <Paragraph position="11"> Proposition 3. The language of Curry is a contextual language. Proof. The considered language is generated by the grammar $\langle V, L_1, \mathcal{C} \rangle$, where $V = \{a, b\}$, $L_1 = \{a\}$ and $\mathcal{C} = \{\langle \omega, b \rangle\}$.</Paragraph> <Paragraph position="12"> We recall that a language is said to be regular if it may be generated by means of a finite automaton (or, equivalently, by means of a finite state grammar in the sense of Chomsky). Proposition 4. There exists a contextual language which is not a regular language. Proof. Let us consider the language $L = \{a^n b^n\}$ $(n = 1, 2, \ldots)$. If we put $V = \{a, b\}$, $L_1 = \{ab\}$ and $\mathcal{C} = \{\langle a, b \rangle\}$, then it is easy to see that $L$ is generated by the contextual grammar $\langle V, L_1, \mathcal{C} \rangle$. On the other hand, $L$ is not a regular language. This fact was asserted by Chomsky in [3] and [~], but the proof he gives is wrong. A correct proof of this assertion and a discussion of Chomsky's proof were given in [~]. Propositions 2, 3 and 4 show that there are many infinite languages which are contextual. This fact may be explained by means of Proposition 5. If the set $L_1$ is non-void and if the set $\mathcal{C}$ contains at least one non-null context, then the contextual grammar $\langle V, L_1, \mathcal{C} \rangle$ generates an infinite language.</Paragraph> <Paragraph position="13"> Proof. 
Since $L_1$ is non-void, we may find a string $x$ belonging to $L_1$. Since $\mathcal{C}$ contains at least one non-null context, let $\langle u, v \rangle$ be a non-null context belonging to $\mathcal{C}$. From these assumptions, we infer that the strings $uxv, u^2 x v^2, \ldots, u^n x v^n, \ldots$</Paragraph> <Paragraph position="14"> are mutually distinct and all belong to the language generated by the grammar $\langle V, L_1, \mathcal{C} \rangle$. Thus, this language is infinite. The converse of Proposition 5 is true. Indeed, we have</Paragraph> <Paragraph position="16"> Proposition 6. If the contextual grammar $\langle V, L_1, \mathcal{C} \rangle$ generates an infinite language, then $L_1$ is non-void, whereas $\mathcal{C}$ contains a non-null context.</Paragraph> <Paragraph position="17"> Proof. Let $L$ be the language generated by $\langle V, L_1, \mathcal{C} \rangle$. If $L_1$ is void, $L$ is void too, hence it cannot be infinite. If $\mathcal{C}$ contains no non-null context, we have $L = L_1$. But $L_1$ is in any case finite; thus, $L$ is finite, in contradiction with the hypothesis. Since there are contextual languages which are not regular (see Proposition 4 above), it would be interesting to establish whether all contextual languages are context-free languages. The answer is affirmative: Proposition 7. Every contextual language is a context-free language. Proof. Let $L$ be a contextual language. If $L$ is finite, it is a regular language. But it is well known that every regular language is a context-free language. Therefore, $L$ is a context-free language. Now let us suppose that $L$ is infinite. Denote by $G = \langle V, L_1, \mathcal{C} \rangle$ a contextual grammar which generates the language $L$. In view of Proposition 6, $L_1$ is non-void, whereas there exists an integer $i$, $1 \le i \le p$, such that the context $\langle u_i, v_i \rangle$ is non-null, i.e. at least one of the equalities $u_i = \omega$, $v_i = \omega$ is false. Let us make a choice and suppose that $u_i \ne \omega$. Let $L_1 = \{x_1, x_2, \ldots, x_n\}$ and $\mathcal{C} = \{\langle u_1, v_1 \rangle, \ldots, \langle u_p, v_p \rangle\}$. We define a context-free grammar as follows. The terminal vocabulary of this grammar is $V$. 
The non-terminal vocabulary of this grammar contains one element only, denoted by $S$. The terminal rules are $S \to x_j$ $(j = 1, 2, \ldots, n)$ and the non-terminal rules are $S \to u_i S v_i$ $(i = 1, 2, \ldots, p)$. It is obvious that the number of terminal rules is equal to the number of strings in $L_1$, whereas the number of non-terminal rules is precisely the number of contexts in $\mathcal{C}$. Among the non-terminal rules, there is at least one which is non-trivial: it is the rule $S \to u_i S v_i$, where $u_i \ne \omega$. It is not difficult to prove that this grammar generates the given language $L$. Indeed, the general form of a string in $L$ is $u_{i_1}^{j_1} u_{i_2}^{j_2} \ldots u_{i_p}^{j_p}\, y\, v_{i_p}^{j_p} \ldots v_{i_2}^{j_2} v_{i_1}^{j_1}$ (2), where $y \in L_1$ and $\langle u_{i_s}, v_{i_s} \rangle \in \mathcal{C}$ for $s = 1, 2, \ldots, p$.</Paragraph> <Paragraph position="18"> In order to generate the considered string we begin by applying $j_1$ times the rule $S \to u_{i_1} S v_{i_1}$. In this way, we obtain the expression $u_{i_1}^{j_1} S v_{i_1}^{j_1}$. Continuing in this way, we arrive, after $p-1$ steps, at the expression $u_{i_1}^{j_1} u_{i_2}^{j_2} \ldots u_{i_{p-1}}^{j_{p-1}} S v_{i_{p-1}}^{j_{p-1}} \ldots v_{i_2}^{j_2} v_{i_1}^{j_1}$. We now apply $j_p$ times the rule $S \to u_{i_p} S v_{i_p}$ and thus we obtain the expression $u_{i_1}^{j_1} u_{i_2}^{j_2} \ldots u_{i_p}^{j_p} S v_{i_p}^{j_p} \ldots v_{i_2}^{j_2} v_{i_1}^{j_1}$, where, by applying the terminal rule $S \to y$, the considered string is completely generated. Thus, we have proved that $L$ is contained in the language generated by this grammar. Conversely, let $z$ be a string generated by this grammar. The general form of this generation involves several consecutive applications of non-terminal rules (the number of these applications may possibly be equal to zero) followed by one and only one application of a terminal rule. It is easy to see that the result of this generation is always a string of the form (2). Thus we have proved that the language generated by this grammar is contained in $L$. In view of the preceding considerations, $L$ is precisely the language generated by this grammar. There exist languages which are not contextual languages. For instance, the language of Kleene $\{a^{n^2}\}$ $(n = 1, 2, \ldots)$, the first example of an infinite language which is not regular, is a very simple example of a non-contextual language. 
It is enough to remark that the sequence $\{n^2\}$ $(n = 1, 2, \ldots)$ contains no subsequence which is an infinite arithmetic progression. (We have $(n+1)^2 - n^2 = 2n + 1$ and $\lim_{n \to \infty} (2n+1) = \infty$; therefore, for every subsequence of $\{n^2\}$, the difference of two consecutive terms has the limit equal to $+\infty$ when $n \to \infty$.) But a result of [4] asserts, among others, that given an infinite context-free language $L$, the set of integers which represent the lengths of the strings in $L$ contains an infinite arithmetic progression. It follows that the language of Kleene is not context-free and, in view of Proposition 7, it is not a contextual language. A natural question now arises: do there exist non-contextual languages among context-free languages? The affirmative answer follows from the following remark: the converse of Proposition 7 is not true. Indeed, we have Proposition 8. There exists a context-free language which is not a contextual language.</Paragraph> <Paragraph position="19"> Proof. Let $V = \{a, b\}$. In view of a theorem of Gruska, there exists, for every positive integer $n$, a context-free language $L_n$ on $V$, such that every context-free grammar of $L_n$ contains at least $n$ non-terminal symbols. But, as we can see in the proof of Proposition 7, every contextual language may be generated with a context-free grammar containing only one non-terminal symbol. Therefore, if $n \ge 2$, $L_n$ is not a contextual language. Proposition 8 suggests the natural question whether there exist regular languages which are not contextual languages. The</Paragraph> <Paragraph position="21"> answer is affirmative: Proposition 9. There exists a regular language which is not a contextual language.</Paragraph> <Paragraph position="22"> Proof. Let us consider the language $L = \{ab^m c a b^n\}$ $(m, n = 1, 2, \ldots)$, which was used by H. B. Curry [5] in order to describe the set of mathematical (true or not) propositions. 
This language is regular, since it can be generated by the rules $S \to Ab$, $A \to Ab$, $A \to Ba$, $B \to Cc$, $C \to Cb$, $C \to Db$, $D \to a$. We shall show that $L$ is not a contextual language. Indeed, let us admit that the contrary holds and let $G = \langle V, L_1, \mathcal{C} \rangle$ be a contextual grammar of $L$. Here, the general form of a string in $L$ is</Paragraph> <Paragraph position="24"> $u_1^{p_1} u_2^{p_2} \ldots u_n^{p_n}\, x\, v_n^{p_n} \ldots v_2^{p_2} v_1^{p_1}$ (3), where $x \in L_1$, $\langle u_i, v_i \rangle \in \mathcal{C}$ and $p_1, p_2, \ldots, p_n$ are arbitrary positive integers. This means that the strings</Paragraph> <Paragraph position="25"> $u_1, u_2, \ldots, u_n, v_1, v_2, \ldots, v_n$ in the expression (3) are formed only by those elements of $V$ whose number of occurrences in the strings of $L$ is unlimited. Only $b$ satisfies this requirement. It follows that in any string of $L$ both occurrences of $a$ and the occurrence of $c$ are terms of the string $x$ in (3). But this implies that the intermediate terms between the occurrences of $a$ are terms of $x$, hence we can find two strings $y$ and $z$ such that $x = y\,ab^m ca\,z$. The string $y$ is obviously the null-string, whereas $z$ is of the form $b^l$, hence $x = ab^m c a b^l$. But $m$ may be here an arbitrary positive integer. Therefore, since $x \in L_1$, it follows that $L_1$ is an infinite set of strings. This fact contradicts the assumption concerning $G$. Thus, $L$ is not a contextual language and Proposition 9 is proved.</Paragraph> <Paragraph position="26"> The contextual grammars may be generalized in order to generate some languages which are not context-free. A generalized contextual grammar is a quadruple $G = \langle V, L_1, L_2, \mathcal{C} \rangle$, where $V$, $L_1$ and $\mathcal{C}$ have the same meaning as in the definition of a contextual grammar, whereas $L_2$ is a finite set of strings on the vocabulary $V$. We define the language $L_G$ generated by $G$ in the following way: $L_G$ is a language on $V$ and $x \in L_G$ if and only if we may express $x$ in the form $x = u_1^{p_1} u_2^{p_2} \ldots u_n^{p_n}\, z\, y^p\, v_n^{p_n} \ldots v_2^{p_2} v_1^{p_1}$, where $z \in L_1$, $y \in L_2$, $\langle u_i, v_i \rangle \in \mathcal{C}$ for $i = 1, 2, \ldots, n$ and $p_1, p_2, \ldots, p_n, p$ are positive integers such that $p_1 + p_2 + \ldots + p_n = p$. 
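A minimal Python sketch of this definition, under stated assumptions: strings are Python `str` values, the null-string is `""`, and the exponent p is cut off at a bound. The names `compositions`, `generalized`, `base`, `second`, `contexts` and `max_p` are hypothetical and do not come from the paper.

```python
from itertools import product


def compositions(p):
    """Yield every tuple of positive integers that sums to p."""
    if p == 0:
        yield ()
        return
    for first in range(1, p + 1):
        for rest in compositions(p - first):
            yield (first,) + rest


def generalized(base, second, contexts, max_p):
    """Strings u1^p1 ... un^pn z y^p vn^pn ... v1^p1 with p1+...+pn = p,
    for every p from 1 to max_p (a hypothetical cut-off)."""
    contexts = list(contexts)
    out = set()
    for p in range(1, max_p + 1):
        for powers in compositions(p):
            # choose one context per exponent; repetitions are allowed
            for chosen in product(contexts, repeat=len(powers)):
                pairs = list(zip(chosen, powers))
                left = "".join(u * k for (u, _), k in pairs)
                right = "".join(v * k for (_, v), k in reversed(pairs))
                for z in base:
                    for y in second:
                        out.add(left + z + y * p + right)
    return out


# Base {null-string}, second set {b}, single context (a, null-string).
print(sorted(generalized({""}, {"b"}, {("a", "")}, 3), key=len))
# prints ['ab', 'aabb', 'aaabbb']
```

The example takes the base to be the null-string alone, the second set to be the single string b, and the single context formed by a and the null-string, which yields the strings a^n b^n up to the bound.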
Every language generated by a generalized contextual grammar is said to be a generalized contextual language.</Paragraph> <Paragraph position="27"> If, in the definition of $G$, we take $L_2 = \{\omega\}$, $G$ is equivalent to a contextual grammar; the language $L_G$ is then precisely the language generated by the contextual grammar $\langle V, L_1, \mathcal{C} \rangle$. Indeed, the general form of a string in the contextual language generated by $\langle V, L_1, \mathcal{C} \rangle$ is $u_1^{p_1} u_2^{p_2} \ldots u_n^{p_n}\, z\, v_n^{p_n} \ldots v_2^{p_2} v_1^{p_1}$, with $z \in L_1$ and $\langle u_i, v_i \rangle \in \mathcal{C}$.</Paragraph> <Paragraph position="29"> We may consider a contextual grammar as a particular case of generalized contextual grammar, by identifying the contextual grammar $\langle V, L_1, \mathcal{C} \rangle$ with the generalized contextual grammar $\langle V, L_1, \{\omega\}, \mathcal{C} \rangle$. It is interesting to point out that sometimes a contextual language may be easily generated by a generalized contextual grammar which is not a contextual grammar. For instance, let us consider the language $L = \{a^n b^n\}$ $(n = 1, 2, \ldots)$. In view of the proof of Proposition 4, $L$ is a contextual language. We may generate $L$ by the generalized contextual grammar (which is not a contextual grammar) $\langle V, L_1, L_2, \mathcal{C} \rangle$, where $V = \{a, b\}$, $L_1 = \{\omega\}$, $L_2 = \{b\}$, $\mathcal{C} = \{\langle a, \omega \rangle\}$. It is known that $L$ is not regular. We may give a similar example, with a language which is regular. In this respect let us consider the language of Curry. In view of Proposition 3, it is a contextual language. It is a regular language too, since it may be generated by the regular grammar containing the following two rules: $S \to Sb$ and $S \to a$.</Paragraph> <Paragraph position="30"> Now let us consider the generalized contextual grammar $\langle \ldots \rangle$. This grammar generates the language of Curry, but it is not a contextual grammar.</Paragraph> <Paragraph position="31"> Now let us show that generalized contextual languages are an effective generalization of contextual languages.</Paragraph> <Paragraph position="32"> Proposition 11. There exists a generalized contextual language which is not a contextual language. Proof. Let us consider the language $L = \{a^n b^n a^n\}$ $(n = 1, 2, \ldots)$. 
It is known that this language is not context-free (see, for instance, [6]). In view of Proposition 7, every contextual language is a context-free language; hence, $L$ is not a contextual language. Now let us consider the generalized contextual grammar $G = \langle V, L_1, L_2, \mathcal{C} \rangle$, where $V = \{a, b\}$, $L_1 = \{\omega\}$, $L_2 = \{b\}$ and $\mathcal{C} = \{\langle a, a \rangle\}$. It is easy to see that $G$ generates the language $L$.</Paragraph> <Paragraph position="33"> From the proof of Proposition 11 it follows immediately: Proposition 12. There exists a generalized contextual language which is not a context-free language.</Paragraph> <Paragraph position="34"> We may now ask whether the converse of Proposition 12 is true. The answer is given by Proposition 13. There exists a context-free language, even a regular language, which is not a generalized contextual language. Proof. We may consider the language $L = \{ab^m c a b^n\}$ $(m, n = 1, 2, \ldots)$ used in the proof of Proposition 9. It was shown in the proof of Proposition 9 that $L$ is regular. Let us admit that $L$ is a generalized contextual language. Given a string $x$ in $L$, its representation is of the form $x = u_1^{p_1} u_2^{p_2} \ldots u_n^{p_n}\, z\, y^p\, v_n^{p_n} \ldots v_2^{p_2} v_1^{p_1}$, where $\langle u_i, v_i \rangle \in \mathcal{C}$ $(i = 1, \ldots, n)$, $z \in L_1$, $y \in L_2$, $p_1 + \ldots + p_n = p$ and $G = \langle V, L_1, L_2, \mathcal{C} \rangle$ is the grammar of $L$. By a reasoning similar to that used in the proof of Proposition 9, we find that for every positive integer $m$ there exists a string $z$ in $L_1$ such that $z = ab^m c a b^s$, where $s$ is a non-negative integer depending on $m$. But this means that $L_1$ contains infinitely many strings. This fact contradicts the definition of a generalized contextual grammar. It</Paragraph> <Paragraph position="36"> follows that $L$ is not a generalized contextual language. It is to be expected that every generalized contextual language is a context-sensitive language. 
But the construction of the corresponding context-sensitive grammar seems to be very complicated, if we think of the generation of the language $\{a^n b^n a^n\}$. Ju. A. Šreider has introduced a new type of grammars, called neighborhood grammars (okrestnostnye grammatiki) and defined in the following way ([10]; see [4]. Our presentation is somewhat different). Given a finite set $V$ called vocabulary, two strings $x$ and $y$ on $V$, and a context $\langle u, v \rangle$ on $V$, we say that the pair $(\langle u, v \rangle, y)$ is a neighborhood of $y$ with respect to $x$ if we can find two strings $z$ and $w$ such that $x = zuyvw$.</Paragraph> <Paragraph position="37"> Every pair of the form $(\langle u, v \rangle, y)$, where $\langle u, v \rangle$ is a context on $V$, whereas $y$ is a string on $V$, is called a neighborhood on $V$. Let us consider an element $\theta$ which does not belong to $V$; $\theta$ will be called the boundary element. A neighborhood grammar is a triple of the form $\langle V, \theta, \mathcal{N} \rangle$, where $V$ is a vocabulary, $\theta$ is the boundary element and $\mathcal{N}$ is a finite set of neighborhoods on the vocabulary $V \cup \{\theta\}$. Let $L$ be a language on $V$. We say that $L$ is generated by the considered neighborhood grammar if in every string $x$ of the form $x = \theta y \theta$ (with $y \in L$), and only in such strings, there exists in $\mathcal{N}$, for every term $a_i$ of $x = a_1 a_2 \ldots a_s$, a neighborhood of $a_i$ with respect to $x$.</Paragraph> <Paragraph position="38"> Neighborhood grammars are closely related to the notion of context, since this notion occurs in the definition of a neighborhood. There is another notion, due to L. Vasilevskiĭ and M. V. Chomjakov (see the reference in [2], p. 40), which explains this fact. 
Following these authors, a grammar of contexts (this name is improper, since no context occurs among its objects) is a triple $\langle V, \theta, Q \rangle$, where $V$ and $\theta$ have the same meaning as in the definition of a neighborhood grammar, whereas $Q$ is a finite set of strings on the vocabulary $V \cup \{\theta\}$. This grammar generates the language $L$ on $V$ in the following way: $x \in L$ if and only if for every string $y$ and any strings $z$ and $w$ for which there exist strings $u$ and $v$ such that $\theta x \theta = uzywv$ we have either 1) $y = msp$, where $s \in Q$, whereas the strings $m$ and $p$ may possibly be null, or 2) $\theta x \theta = uqryntv$, where $qr = z$, $nt = w$ and $ryn$ is a string belonging to $Q$.</Paragraph> <Paragraph position="39"> A string belonging to $Q$ is said to be closed from the left (from the right) if its first (last) term is $\theta$. A string belonging to $Q$ is said to be closed if it is closed both from the left and from the right.</Paragraph> <Paragraph position="40"> A grammar of contexts is said to be $k$-bounded if every non-closed string of $Q$ is of length $k$, whereas every closed string of $Q$ is of length not greater than $k$. An important theorem of Borščev asserts the equivalence between languages generated by neighborhood grammars and languages generated by $k$-bounded grammars of contexts ([2], p. 40). Since grammars of contexts and contextual grammars have some similarities in their definitions, it is interesting to establish more exactly the relation between them.</Paragraph> <Paragraph position="41"> Proposition 14. There exists a contextual language which is regular, but which is not a neighborhood language.</Paragraph> <Paragraph position="42"> Proof. Let us consider the language $L = \{a^{2n}\}$ $(n = 1, 2, \ldots)$. This language is regular, since it is generated by the regular grammar consisting of the rules $S \to Ta$, $T \to Ua$, $U \to Ta$, $T \to a$, where $S$ is the start symbol, $\{a\}$ is the terminal vocabulary, whereas $\{S, T, U\}$ is the non-terminal vocabulary. Let us consider the contextual grammar $G = \langle \{a\}, \{\omega\}, \{\langle a, a \rangle\} \rangle$. 
It is easy to see that $G$ generates the language $L$; therefore $L$ is a contextual language.</Paragraph> <Paragraph position="43"> We shall show that $L$ is not a neighborhood language. In this respect, our method will be the following. We shall consider all systems of possible neighborhoods of the terms of the string $\theta aa \theta$ and we shall show that every such system is either a system of neighborhoods of the terms of every string $\theta a^n \theta$ $(n = 2, 3, 4, \ldots)$ or it is not a system of neighborhoods of the terms of the string $\theta a^4 \theta$. It is easy to see that the first term of the string $\theta aa \theta$ admits the following neighborhoods: 1) $\underline{\theta}$, 2) $\underline{\theta}a$, 3) $\underline{\theta}aa$, 4) $\underline{\theta}aa\theta$. The second term has the neighborhoods: 1) $\theta\underline{a}$, 2) $\underline{a}$, 3) $\underline{a}a$, 4) $\underline{a}a\theta$, 5) $\theta\underline{a}a$, 6) $\theta\underline{a}a\theta$. The neighborhoods of the third term are: 1) $\theta a\underline{a}$, 2) $a\underline{a}$, 3) $\underline{a}$, 4) $\underline{a}\theta$, 5) $\theta a\underline{a}\theta$, 6) $a\underline{a}\theta$. The last term has the neighborhoods: 1) $\underline{\theta}$, 2) $a\underline{\theta}$, 3) $aa\underline{\theta}$, 4) $\theta aa\underline{\theta}$. The notation $u\underline{x}v$ represents here the neighborhood $(\langle u, v \rangle, x)$. It is easy to see that the fourth neighborhood of the first and of the last term cannot be a neighborhood of $\theta$ with respect to $\theta a^4 \theta$. On the other hand, $\underline{a}$ is a neighborhood of $a$ with respect to $\theta a^n \theta$ for every $n = 1, 2, \ldots$. It follows that no neighborhood grammar of $L = \{a^{2n}\}$ may contain one of the neighborhoods $\underline{\theta}a^2\theta$, $\theta a^2\underline{\theta}$ and $\underline{a}$. Thus, if a neighborhood grammar of $L$ exists, it contains at least one neighborhood from every group of the following four groups of neighborhoods: $\alpha$) $\underline{\theta}$, $\underline{\theta}a$, $\underline{\theta}a^2$.</Paragraph> <Paragraph position="44"> $\beta$) $\theta\underline{a}$, $\underline{a}a$, $\underline{a}a\theta$, $\theta\underline{a}a$, $\theta\underline{a}a\theta$.</Paragraph> <Paragraph position="45"> $\gamma$) $\theta a\underline{a}$, $a\underline{a}$, $\underline{a}\theta$, $\theta a\underline{a}\theta$, $a\underline{a}\theta$. $\delta$) $\underline{\theta}$, $a\underline{\theta}$, $a^2\underline{\theta}$.</Paragraph> <Paragraph position="46"> We shall consider all possible combinations between a neighborhood of the group $\beta$ and a neighborhood of the group $\gamma$. By $mn$ we shall denote the combination formed by the $m$-th neighborhood of $\beta$ and the $n$-th neighborhood of $\gamma$. 
It is easy to see that every neighborhood grammar containing one of the combinations 12, 22, 23, 25, 42 generates a language which contains every string $a^n$ with $n \ge 2$. On the other hand, every neighborhood grammar containing one of the combinations 11, 13, 14, 15, 21, 24, 31, 32, 33, 34, 35, 41, 43, 44, 45, 51, 52, 53, 54, 55 generates a language which either does not contain the string $a^4$ or contains every string $a^n$ with $n \ge 2$. (This depends on whether the neighborhoods $\underline{a}a$ or $a\underline{a}$ belong or not to the considered neighborhood grammar.) Thus, there exists no neighborhood grammar which generates the language $\{a^{2n}\}$.</Paragraph> <Paragraph position="47"> But the definition of (generalized) contextual grammars, though adequate to the investigation of the generative power of purely contextual operations, does not correspond to the situation existing in real (natural or artificial) languages, where every string is admitted only by some contexts and every context admits only some strings. Let us try to obtain a type of grammar corresponding to this more complex situation. We define a contextual grammar with choice as a system $G = \langle V, L_1, \mathcal{C}, \varphi \rangle$, where $V$, $L_1$ and $\mathcal{C}$ are the objects of a contextual grammar, whereas $\varphi$ is a mapping defined on the universal language on $V$ and having its values in the set of subsets of $\mathcal{C}$. We define the language generated by $G$ as the smallest language $L$ having the following properties: 1° if $x \in L_1$, then $x \in L$; 2° if $y \in L$ and $\langle u, v \rangle \in \varphi(y)$, then $uyv \in L$. Thus, every string chooses some contexts and every context chooses some strings. We define a contextual language with choice as a language which is generated by a contextual grammar with choice. The investigation of these grammars and languages would better show the generative power of contextual operations, in a manner which corresponds to the situation existing in real languages.</Paragraph> </Section></Paper>