<?xml version="1.0" standalone="yes"?> <Paper uid="P85-1019"> <Title>Semantic Caseframe Parsing and Syntactic Generality</Title> <Section position="2" start_page="0" end_page="155" type="metho"> <SectionTitle> 1 The Plume Parser </SectionTitle> <Paragraph position="0"> Recent work at Carnegie-Mellon University, e.g. [4, 5], has shown semantic caseframe instantiation to be a highly robust and efficient method of parsing restricted domain input. In this approach to parsing, a caseframe grammar contains the domain-specific semantic information, and the parsing program contains general syntactic knowledge. Input is mapped onto the grammar using this built-in syntactic knowledge. We have chosen this approach for Plume™, a commercial restricted domain parser, because of its advantages in efficiency and robustness. Let us take a simple example from a natural language interface, called NLVMS, that we are developing under a 1 More precisely, Plume™ is the name of the run-time system associated with Language Craft™, an integrated environment for the development of natural language interfaces. The Plume parser, which translates English input into caseframe instances, is a major component :marker to | into | in | onto) ] This defines a caseframe called &quot;copy&quot; with three cases: file-to-copy, source, and destination. The file-to-copy case is filled by an object of type &quot;file&quot; and appears in the input as a direct object. Source is filled by a &quot;directory&quot; and should appear in the input as a prepositional phrase preceded, or marked, by the prepositions &quot;from&quot; or &quot;out of&quot;. Destination is filled by a &quot;file&quot; or &quot;directory&quot; and is marked by &quot;to&quot;, &quot;into&quot;, or &quot;onto&quot;. Finally, the copy command itself is recognized by the header word, indicated above (by :header) as &quot;copy&quot;. Using this caseframe,
Plume can parse inputs like: Copy foo.bar out of [x] into [y]; From [x] to [y] copy foo.bar; foo.bar copy from [x] to [y]. In essence, Plume's parsing algorithm is to find a caseframe header, in this case &quot;copy&quot;, and use the associated caseframe, &quot;copy&quot;, to guide the rest of the parse. Once the caseframe has been identified, Plume looks for case markers, and then parses the associated case filler directly following the marker. Plume also tries to parse positionally specified cases, like direct object, in the usual position in the sentence - immediately following the header for direct object. Any input not accounted for at the end of this procedure is matched against any unfilled cases, so that cases that are supposed to be marked can be recognized without their markers and positionally indicated cases can be recognized out of their usual positions. This flexible,</Paragraph> <Paragraph position="1"> interpretive style of matching caseframes against the input allows Plume to deal with the kind of variation in word order illustrated in the examples above.</Paragraph> <Paragraph position="2"> The above examples implied there was some method to recognize files and directories. They showed only atomic file and directory descriptions, but Plume can also deal with more complex object descriptions. In fact, in Plume grammars, objects as well as actions can be described by caseframes. For instance, here is the caseframe used to define a file for NLVMS: This caseframe allows Plume to recognize file descriptions like: foo; foo.bar; The file created by John; The fortran file in [x] created by Joan. The caseframe notation and parsing algorithm used here are very similar to those described above for clause level input. The significant differences are additions related to the :adjective and :assignedp attributes of some of the cases above.
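The header, marker, positional and leftover steps described above for the copy caseframe can be illustrated with a minimal Python sketch. It is not Plume's implementation: the dictionary layout, the function names parse, fits and is_directory, the single-token fillers, and the convention that a bracketed token such as [x] is a directory while any other token is a file are all assumptions made only for this illustration.

```python
# Hypothetical caseframe layout (an assumption; not Plume's grammar notation).
COPY = {
    "header": "copy",
    "cases": {
        "file-to-copy": {"types": ["file"], "positional": True},
        "source": {"types": ["directory"], "markers": ["from", "out of"]},
        "destination": {"types": ["file", "directory"],
                        "markers": ["to", "into", "onto"]},
    },
}


def is_directory(token):
    # Assumption: directories are written in square brackets, e.g. [x].
    return token.startswith("[") and token.endswith("]")


def fits(token, types):
    kind = "directory" if is_directory(token) else "file"
    return kind in types


def parse(text, frame):
    tokens = text.lower().split()
    if frame["header"] not in tokens:
        return None
    used = [False] * len(tokens)
    head = tokens.index(frame["header"])
    used[head] = True
    filled = {}

    # Step 1: marked cases. The filler is the token directly after the marker.
    for name, case in frame["cases"].items():
        for marker in sorted(case.get("markers", []), key=len, reverse=True):
            words = marker.split()
            n = len(words)
            for i in range(len(tokens) - n):
                span = range(i, i + n + 1)
                if (tokens[i:i + n] == words
                        and not any(used[j] for j in span)
                        and fits(tokens[i + n], case["types"])):
                    filled[name] = tokens[i + n]
                    for j in span:
                        used[j] = True
                    break
            if name in filled:
                break

    # Step 2: positional cases, tried immediately after the header.
    for name, case in frame["cases"].items():
        i = head + 1
        if (case.get("positional") and name not in filled
                and i in range(len(tokens)) and not used[i]
                and fits(tokens[i], case["types"])):
            filled[name] = tokens[i]
            used[i] = True

    # Step 3: leftover input is matched against any still-unfilled case.
    for i, token in enumerate(tokens):
        if used[i]:
            continue
        for name, case in frame["cases"].items():
            if name not in filled and fits(token, case["types"]):
                filled[name] = token
                used[i] = True
                break

    # Only parses that consume all of the input are kept.
    return filled if all(used) else None


for sentence in ["Copy foo.bar out of [x] into [y]",
                 "From [x] to [y] copy foo.bar",
                 "foo.bar copy from [x] to [y]"]:
    print(sentence, parse(sentence, COPY))
```

In this sketch all three word orders yield the same three case fillers, because the leftover step picks up the direct object when it does not directly follow the header.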
While Plume normally only looks for fillers after the header in nominal caseframes, an :adjective attribute of a slot tells Plume that the slot filler may appear before the header.</Paragraph> <Paragraph position="3"> An :assignedp attribute allows cases to be filled through recognition of a header. This is generally useful for proper names, such as foo and foo.bar. In the example above,</Paragraph> <Paragraph position="4"> the second alternative header contains two variables, !name and !extension, that can each match any single word. The question mark indicates optionality, so that the header can be either a single word or a word followed by a period and another word. The first word is assigned to the variable !name, and the second (if it is there) to the variable !extension. If !name or !extension are matched while recognizing a file header, their values are placed in the name and extension cases of &quot;file&quot;. With the above modifications, Plume can parse nominal caseframes using the same algorithm that it uses for clausal caseframes that account for complete sentences. However, there are some interactions between the two levels of parsing. In particular, there can be ambiguity about where to attach marked cases. For instance, in: Copy the fortran file in [x] to [y] &quot;in [x]&quot; could either fill the directory case of the file described as &quot;the fortran file&quot; or could fill the destination case of the whole copy command. The second interpretation does not work at the global level because the only place to put &quot;to [y]&quot; is in that same destination case. However, at the time the file description is parsed, this information is not available, and so both possible attachments must be considered. In general, if Plume is able to fill a case of a nominal caseframe from a prepositional phrase, it also splits off an alternative parse in which that attachment is not made. When all input has been parsed,
Plume retains only those parses that succeed at the global level, i.e., consume all of the input. Others are discarded.</Paragraph> <Paragraph position="5"> The current implementation of Plume is based on the nominal and clausal level caseframe instantiation algorithms described above. Using these algorithms and a restricted domain grammar of caseframes like the ones shown above,</Paragraph> <Paragraph position="6"> Plume can parse a wide variety of imperative and declarative sentences relevant to that domain. However,</Paragraph> <Paragraph position="7"> there remain significant gaps in its coverage. Interrogatives are not handled at all; passives are covered only if they are explicitly specified in the grammar; and relative clauses can only be handled by pretending they are a form of prepositional phrase. The regular and predictable relationship between simple statements, questions and relative clauses, and between active and passive sentences, is well known. A parser which purports to interpret a domain specific language specification using a built-in knowledge of syntax should account for this regularity in a general way. The current implementation of Plume has no mechanism for doing this. Each individual possibility for questions, relative clauses and passives must be explicitly specified in the grammar. For instance, to handle reduced relative clauses as in &quot;the file created by Jim&quot;, &quot;created by&quot; is listed as a case marker (compound preposition) in the creator slot of file, marking a description of the creator. To handle full relatives, the case marker must be specified as something like &quot;?(which &lt;be&gt;) created by&quot;. While this allows Plume to recognize &quot;the file which was created by Jim&quot;, &quot;the file created by Jim&quot;,
or even &quot;the file created by Jim on Monday&quot;, it breaks down on something like &quot;the file created on Monday by Jim&quot; because the case marker &quot;created by&quot; is no longer a unit. Moreover, using the current techniques, Plume's ability to recognize the above inputs is completely unrelated to its ability to recognize inputs like: the file Jim created on Monday; the person that the file was created by on Monday; the day on which Jim created the file. If an interface could recognize any of these examples, it might seem unreasonable to a user that it could not recognize all of the others. Moreover, given any of the above examples, a user might reasonably expect recognition of related sentence level inputs like: Create the file on Monday! Jim created the file on Monday. Did Jim create the file on Monday? Was the file created by Jim on Monday? Who created the file on Monday? What day was the file created on? The current implementation of Plume has no means of guaranteeing such regularity of coverage. Of course, this problem of patchy syntactic coverage is not new for restricted domain parsers. The lack of syntactic generality of the original semantic grammar [3] for the Sophie system [2] led to the concept of cascaded ATNs [10] and the RUS parser [1]. A progression with similar goals occurred from the LIFER system [9] to TEAM [6] and KLAUS [7]. The basic obstacle to achieving syntactic generality in these network-based approaches was the way syntactic and semantic information was mixed together in the grammar networks. The solutions, therefore, rested on separating the syntactic and semantic information.
Plume already incorporates just the separation of syntax and semantics necessary for syntactic generality: general syntactic knowledge resides in the parser, while semantic information resides in the grammar. This suggests that syntactic generality in a system like Plume can be achieved by improving the parser's caseframe instantiation algorithms without any major changes to grammar content. In terms of the above examples involving &quot;create&quot;, it suggests we can use a single &quot;create&quot; caseframe to handle all the examples. We simply need to provide suitable extensions to the existing caseframe instantiation algorithms. In the next section, we present a detailed design for such extensions. 2. Providing Plume with Syntactic Generality As described above, Plume can currently use clausal caseframes only to recognize single clause imperative and declarative utterances in the active voice. This section describes our design for extending Plume so that relative and interrogative uses of clausal caseframes in passive as well as active voice can also be recognized from the same information.</Paragraph> <Paragraph position="8"> We will present our general design by showing how it operates for the following &quot;create&quot; caseframe in the context Note that symbols in angle brackets represent non-terminals in a context-free grammar (recognized by Plume using pattern matching techniques). In the caseframe definition above, &lt;create&gt; matches all morphological variants of the verb &quot;create&quot;, including &quot;create&quot;, &quot;creates&quot;, &quot;created&quot; and &quot;creating&quot; (though not compound tenses like &quot;is creating&quot;; see below). Using the existing Plume, this would only allow us to recognize simple imperatives and active declaratives</Paragraph> </Section> <Section position="3" start_page="155" end_page="156" type="metho"> <SectionTitle> 2.
1 Passives </SectionTitle> <Paragraph position="0"> Plume recognizes passive sentences through its processing of the verb cluster, i.e. the main verb plus the sequence of modal and auxiliary verbs immediately preceding it. Once the main verb has been located, a special verb cluster processing mechanism reads the verb cluster and determines from it whether the sentence is active or passive. The parser records this information in a special case called &quot;%voice&quot;.</Paragraph> <Paragraph position="1"> If a sentence is found to be active, the standard parsing algorithm described above is used. If it is found to be passive, the standard algorithm is used with the modification that the parser looks for the direct object or the indirect object in the subject position, and for the subject as an optional marked case with the case marker &quot;by&quot;. Thus, given the &quot;create&quot; caseframe above, the following passive sentences could be handled as well as their active counterparts.</Paragraph> <Paragraph position="2"> The detailed design presented below allows Plume to use the &quot;create&quot; caseframe to parse nominals like: the file Jim created on Monday; the person that the file was created by on Monday; the day on which Jim created the file. To do this, we introduce the concept of a relative case. A relative case is a link back from the caseframes for the objects that fill the cases of a clausal caseframe to that clausal caseframe. A grammar preprocessor generates a relative case automatically from each case of a clausal caseframe, associating it with the nominal caseframe that fills the case in the clausal caseframe. Relative cases do not need to be specified by the grammar writer. For instance, a relative case is generated from the createe case of &quot;create&quot; and included in the &quot;file&quot; caseframe.
It looks like this: Note that :marker is the same as :header of &quot;create&quot;. Similar relative cases are generated in the &quot;person&quot; caseframe for the creator case, and in the &quot;date&quot; caseframe for the creation-date case, differing only in :relative-case-name. Relative cases are used similarly to the ordinary marked cases of nominal caseframes. In essence, if the parser is parsing a nominal caseframe and finds the marker of one of its relative cases, then it tries to instantiate the :relative-cf. It performs this instantiation in the same way as if the relative-cf were a top-level clausal caseframe and the word that matched the header were its main verb. An important difference is that it never tries to fill the case whose name is given by :relative-case-name. That case is filled by the nominal caseframe which contains the relative case. For instance, suppose the parser is trying to process:</Paragraph> <Paragraph position="3"> The file Jim created on Monday And suppose that it has already located &quot;file&quot; and used that to determine it is instantiating a &quot;file&quot; nominal caseframe. It is able to match (against &quot;created&quot;) the :marker of the relative case of &quot;file&quot; shown above. It then tries to instantiate the relative-cf, &quot;create&quot;, using its standard techniques, except that it does not try to fill createe, the case of &quot;create&quot; specified as the relative-case-name. This instantiation succeeds with &quot;Jim&quot; going into creator and &quot;on Monday&quot; being used to fill creation-date. The parser then uses (a pointer to) the nominal caseframe currently being instantiated,
&quot;file&quot;, to fill createe, the :relative-case-name case of &quot;create&quot;, and the newly created instance of &quot;create&quot; is attached to this instance of &quot;file&quot; as a modifier.</Paragraph> <Paragraph position="4"> a. it never looks any further left in the input than the header of the nominal caseframe, or, if it has already parsed any other post-nominal cases of the nominal caseframe, no further left than the right hand end of them; b. it consumes, but otherwise ignores, any relative pronouns (who, which, whom, that) that immediately precede the segment used to instantiate the relative-cf. This means that all words, including &quot;that&quot;, will be accounted for in &quot;the file that Jim created on Monday&quot;; c. it does not try to fill the case specified by the relative-case-name in the relative-cf; instead this case is filled by (a pointer to) the original nominal caseframe instance; d. if the relative-case-name specifies a marked case rather than a positional one in the relative-cf, then its case marker can be consumed, but otherwise ignored, during instantiation of the relative-cf. This allows us to deal with &quot;on&quot; in &quot;the date Jim created the file on&quot; or &quot;the date on which Jim created the file&quot;.</Paragraph> </Section> <Section position="4" start_page="156" end_page="156" type="metho"> <SectionTitle> 3 Passive relative clauses (e.g.
&quot;the file that was </SectionTitle> <Paragraph position="0"> created on Monday&quot;) can generally be handled using the same mechanisms used for passives at the main clause level. However, in relative clauses, passives may sometimes be reduced by omitting the usual auxiliary verb to be (and the relative pronoun), as in: the file created on Monday. To account for such reduced relative clauses, the verb cluster processor will produce appropriate additional readings of the verb clusters in relative clauses for which the relative pronoun is missing. This may lead to multiple parses, including one for the above example similar to the correct one for: the file John created on Monday. These ambiguities will be taken care of by Plume's standard ambiguity reduction methods. More completely, Plume's algorithm for relative clauses is: 1. When processing a nominal caseframe, Plume scans for the :markers of the relative cases of the nominal caseframe at the same time as it scans for the regular case markers of that nominal caseframe. 2. If it finds a marker of a relative case, it tries to instantiate the relative-cf just as though it were the top-level clausal caseframe and the header were its main verb, except that:</Paragraph> </Section> <Section position="5" start_page="156" end_page="158" type="metho"> <SectionTitle> 2.3 Interrogatives </SectionTitle> <Paragraph position="0"> In addition to handling passives and relative clauses,</Paragraph> <Paragraph position="1"> we also wish the information in the &quot;create&quot; caseframe to handle interrogatives involving &quot;create&quot;, such as: Did Jim create the file on Monday? Was the file created by Jim on Monday? Who created the file on Monday? What day was the file created on? The primary difficulty for Plume with interrogatives is that, as these examples show, the number of variations in standard constituent order is much greater than for imperatives and declaratives.
Interrogatives come in a wide variety of forms, depending on whether the question is yes/no or wh; on which auxiliary verb is used; on whether the voice is active or passive; and, for wh questions, on which case is queried. On the other hand, apart from variations in the order and placement of marked cases, there is only one standard constituent order for imperatives and only two for declaratives (corresponding to active and passive voice). We have exploited this low variability by building knowledge of the imperative and declarative order into Plume's parsing algorithm. However, this is impractical for the larger number of variations associated with interrogatives. Accordingly, we have designed a more data-driven approach. This approach involves two passes through the input: the first categorizes the input into one of several primary input categories, including yes-no questions, several kinds of wh-questions, statements, or imperatives. The second pass performs a detailed parse of the input based on the classification made in the first pass. The rules used contain basic syntactic information about English, and will remain constant for any of Plume's restricted domain grammars of semantic caseframes for English. The first level of processing involves an ordered set of top-level patterns. Each top-level pattern corresponds to one of the primary input categories mentioned above. This classificatory matching does not attempt to match every word in the input sentence, but only to do the minimum necessary to make the classification. Most of the relevant information is found at the beginning of the inputs. In particular, the top-level patterns make use of the fronted auxiliary verb and wh-words in questions.</Paragraph> <Paragraph position="2"> As well as classifying the input, this top-level match is also used to determine the identity of the caseframe to be instantiated.
This is important to do at this stage because the detailed recognition in the second phase is heavily dependent on the identity of this top-level caseframe. The special symbol, $verb, that appears exactly once in all top-level patterns, matches a header of any clausal caseframe. We call the caseframe whose header is matched by $verb the primary caseframe for that input.</Paragraph> <Paragraph position="3"> The second, more detailed parsing phase is organized relative to the primary caseframe. Associated with each top-level pattern, there is a corresponding parse template. A parse template specifies which parts of the primary caseframe will be found in unusual positions and which parts the default parsing process (the one for declaratives and imperatives) can be used for.</Paragraph> <Paragraph position="4"> A simplified example of a top-level pattern for a yes-no question is: &lt;aux&gt; (- ($verb !! &lt;aux&gt;)) (&amp;s $verb) $rest This top-level pattern will match inputs like the following: Did Jim create foo? Was foo created by Jim? The first element of the above top-level pattern is an auxiliary verb, represented by the non-terminal &lt;aux&gt;. This auxiliary is remembered and used by the verb cluster processor (as though it were the first auxiliary in the cluster) to determine tense and voice. According to the next part of the pattern, some word that is not a verb or an auxiliary must appear after the fronted auxiliary and before the main verb (- is the negation operator, and !! marks a disjunction). Next, the scanning operator &amp;s tells the matcher to scan until it finds $verb, which matches the header of any clausal caseframe. Finally, $rest matches the remaining input.</Paragraph> <Paragraph position="5"> If the top-level pattern successfully matches, Plume uses the associated parse template to direct its more detailed processing of the input.
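As a rough illustration of the first pass, the simplified yes-no pattern above (fronted auxiliary, then a word that is neither a verb nor an auxiliary, then a scan to the verb, then the rest) can be written as a small Python function. This is only a sketch under stated assumptions: the auxiliary list, the header table mapping verb forms to a caseframe name, and the function name match_yes_no are hypothetical and are not taken from Plume.

```python
# Illustrative auxiliary list (assumption; not Plume's aux non-terminal).
AUX = {"did", "does", "do", "was", "were", "is", "are", "will", "can"}

# Hypothetical header table: surface verb form -> clausal caseframe name.
VERB_HEADERS = {
    "create": "create",
    "creates": "create",
    "created": "create",
    "creating": "create",
}


def match_yes_no(text):
    tokens = text.lower().replace("?", " ").split()

    # Element 1: the input must start with a fronted auxiliary.
    if not tokens or tokens[0] not in AUX:
        return None

    # Element 2: the next word must be neither a verb header nor an auxiliary.
    if len(tokens) == 1 or tokens[1] in AUX or tokens[1] in VERB_HEADERS:
        return None

    # Element 3: scan forward until a clausal caseframe header is found.
    for i in range(2, len(tokens)):
        if tokens[i] in VERB_HEADERS:
            return {
                "category": "yes-no-question",
                "aux": tokens[0],  # kept for tense and voice analysis
                "primary_caseframe": VERB_HEADERS[tokens[i]],
                "rest": tokens[i + 1:],  # element 4: remaining input
            }
    return None


print(match_yes_no("Did Jim create foo ?"))
print(match_yes_no("Was foo created by Jim ?"))
print(match_yes_no("Jim created foo"))
```

The function only classifies the input and identifies the primary caseframe; it deliberately leaves the words before the verb unparsed, mirroring the idea that the first pass does the minimum needed for classification.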
The goal of this second pass through the input is to instantiate the caseframe corresponding to the header matched by $verb in the top-level pattern. The concept of a kernel-caseframe is important to this stage of processing. A kernel-caseframe corresponds to that part of an input that can be processed according to the algorithm already built into Plume for declarative and imperative sentences. (In this pattern, only inputs where the fronted auxiliary is the first word in the sentence are allowed. The more complex pattern that is actually used by Plume allows prepositionally marked cases to appear initially as well.) The parse template associated with the above top-level pattern for yes/no questions is: aux kernel-caseframe + (:query) This template tells the parser that the input consists of the auxiliary verb matched in the first pass followed by a :kernel-caseframe. For example, in: Did Jim create foo? the auxiliary verb, &quot;did&quot;, appears first, followed by a kernel-caseframe, &quot;Jim create foo&quot;. Note how the kernel-caseframe looks exactly like a declarative sentence, and so can be parsed according to the usual declarative/imperative parsing algorithm. In addition to specification of where to find components of the primary caseframe, a parse template includes annotations (indicated by a plus sign). In the above template for yes/no questions, there is just one annotation, :query. Some annotations, like this one, indicate what type of input has been found, while others direct the processing of the parse template. Annotations of the first type record which case is being queried in wh questions, that is, which case is associated with the wh word. Wh questions thus include one of the following annotations: subject-query,</Paragraph> <Paragraph position="6"> object-query,
and marked-case-query. Marked case queries correspond to examples like: On what day did Jim create foo? What day did Jim create foo on? in which a case marked by a preposition is being asked about. As illustrated here, the case marker in such queries can either precede the wh word or appear somewhere after the verb. To deal with this, the parse template for marked case queries has the annotation floating-case-marker. This annotation is of the second type, that is, it affects the way Plume processes the associated parse template.</Paragraph> <Paragraph position="7"> Some top-level patterns result in two possibilities for parse templates. For example, the following top-level pattern: &lt;wh-word&gt; &lt;aux&gt; (- ($verb !! &lt;aux&gt;)) (&amp;s $verb) $rest could match an object query or a marked case query, including the following: What did Jim create? By whom was foo created? Who was foo created by? These inputs cannot be satisfactorily discriminated by a top-level pattern, so the above top-level pattern has two different parse templates associated with it: wh-object aux kernel-caseframe + (object-query); wh-marked-case-filler aux kernel-caseframe + (marked-case-query floating-case-marker).</Paragraph> <Paragraph position="8"> When the above top-level pattern matches, Plume tries to parse the input using both of these parse templates. In general, only one will succeed in accounting for all the input, so the ambiguity will be eliminated by the methods already built into Plume.</Paragraph> <Paragraph position="9"> The method of parsing interrogatives presented above allows Plume to handle a wide variety of interrogatives in a very general way using domain specific semantic caseframes. The writer of the caseframes does not have to worry about whether they will be used for imperative, declarative, or interrogative sentences (or in relative clauses). He is free to concentrate on the domain-specific grammar. In addition,
the concept of the kernel-caseframe allows Plume to use the same efficient caseframe-based parsing algorithm that it used for declarative and imperative sentences to parse major subparts of questions.</Paragraph> </Section> </Paper>