File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/88/c88-2140_metho.xml
Size: 22,929 bytes
Last Modified: 2025-10-06 14:12:13
<?xml version="1.0" standalone="yes"?> <Paper uid="C88-2140"> <Title>On the Interaction of Syntax and Semantics in a Syntactically Guided Caseframe Parser</Title> <Section position="4" start_page="677" end_page="678" type="metho"> <SectionTitle> 2. Language Specific Aspects </SectionTitle> <Paragraph position="0"> Using German as input language to our interface calls for solutions to problems which do not arise for the English language. The most prominent differences are: - there is a rich morphology, - constituent order at the clause level is fairly free, and - there is the verb-second phenomenon in main clauses. Morphology is dealt with in the morphological component of the scanner /Trost and Dorffner 1987/. This scanner passes information about case markers (as well as other syntactic features) to the parser, but, if interpreted locally, this information is usually highly ambiguous. As for word order, there are basically two phrase types in German: noun-dependent phrases, like noun phrase (NP) and prepositional phrase (PP), with a rather rigid word order, and clause-like phrases, like sentence (S) and adjective phrase (AP), with at best a preferred ordering of constituents. For a discussion of word order in German cf. Hoehle /1982/ and, for a more computationally oriented view, Uszkoreit /1986/ and Hauenschild /1986/. Closer inspection shows that on the one hand part of the NP, namely APs embedded in them, exhibit free constituent order, whereas on the other hand clause-like constructions appear to have one fixed position: the head (verbal complex and adjective respectively), which always comes last. The only exception is that in main clauses the inflected part of the verbal complex moves to the second 
position /Haider 1984/. In parsing a language like German one therefore needs two different (contradictory) strategies: - one for the fixed word order of arguments inside constituents (i.e. determiner, attribute of NPs) - one for the free constituent order of the arguments and modifiers of predicates (i.e. the constituents of S). Our solution to this problem is the interaction of two different techniques in our parser. For processing constituents with fixed word order we chose the Augmented Transition Network (ATN) formalism /Bates 1978/, because ATNs are well understood, very efficient implementation techniques are available for them, and they provide a relatively transparent notation. Since we use the ATN only for a part of the syntactic parsing, which itself interacts closely with semantics, the known weaknesses inherent to ATNs do not pose a problem in the context of our parser. For free-order constituents on the other hand we use a unification-based strategy which makes heavy use of a case frame matcher. We will first describe both components in some detail and then demonstrate how they interact. Our ATN consists of the usual subnets for phrase types (NP, AP, etc.). 
In contrast to the standard approach it works on a chart of morphological entries created by the morphological component mentioned earlier. This chart may contain ambiguities which the ATN is extended to cope with. Since the ATN aims at the construction of functional dependencies (an argument/modifier - head structure), which is greatly eased by knowing the head /Proudian and Pollard 1985/, we decided to use head-driven analysis in the ATN. German is basically a subject-object-verb (SOV) language, which means that the head of a phrase comes last, with few exceptions. These exceptions are: - NPs may have postmodifiers (genitive NPs, PPs, relative clauses), - in PPs the preposition comes in the first position,</Paragraph> <Paragraph position="1"> - the above mentioned verb-second phenomenon in main clauses. With a slightly different view on phrase structure all three of these exceptions disappear. Let us for the moment just assume that the head always comes in the last position. Then it proves advantageous to choose a right-to-left order for processing sentences. There are several interesting consequences of this decision: - there is no need for a separate PP-subnet.</Paragraph> <Paragraph position="3"> The preposition of the phrase is simply viewed as a semantic case marker.</Paragraph> <Paragraph position="4"> - adjuncts to the right of a phrase head have to be parsed separately. In our case: postmodifiers like PPs, genitive NPs and relative clauses modifying NPs are not included in the NP-subnet. Since postmodifier attachment cannot be performed well using local information only, this pairs nicely with our strategy of handling the argument/modifier attachment on the caseframe level and thereby reducing ambiguity for the ATN. - 
in main clauses (where the verb-second movement /Haider 1988/ applies) this movement has to be undone to have the complete verbal complex as the head of the sentence in the last position. This has another advantage: although word order differs on the surface between main clauses and dependent clauses, after this retransformation the same subnet can be used for all different sentence types, and the same is true for the subnet for the verbal complex. Adapting the grammar in the way just described leads to the desired situation where for every phrase type the head comes last. 4. Caseframes and the Case Frame Matcher Caseframes represent both a semantic and a syntactic representation of a phrase. The semantic content is given by a 'semantic' predicate and the functional dependencies and meanings of its arguments, and further restrictions by modifiers (if any).</Paragraph> <Paragraph position="5"> The very idea of representing semantic dependencies in form of case frames goes back to the work of Fillmore /1968/, whereas ideas on the additional syntactic and functional structure we use can be traced back to Chomsky's /1981/ Theta-roles and Bresnan's /1982/ functional structures and in the - a syntactic restriction (SYN) - a semantic restriction (SEM) - a filler (VALUE) Caseframes are instantiated from the lexicon and information is added during the analysis of subphrases. To do so there is at least one so-called &quot;meaning&quot; attached to the lexical entry of each verb, noun and adjective. A meaning consists of a pointer to a caseframe plus any modifiers to be applied to the caseframe at the time of instantiation. The instantiation process creates new edges in the chart, representing these partially filled caseframes. The Case Frame Matcher (CFM) works on that chart, which is passed on to it by the ATN. 
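The caseframe organization described above can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation of the parser: the names Slot, Caseframe, FRAMES, LEXICON and instantiate are hypothetical, the representation of a meaning as a dictionary is assumed, and the BENOETIG entry merely mirrors the caseframe given in the annotated example of this paper.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    syn: str              # syntactic restriction (SYN), e.g. "SUBJ"
    sem: str              # semantic restriction (SEM), e.g. "MATERIAL"
    value: object = None  # filler (VALUE); None while the slot is empty


@dataclass
class Caseframe:
    predicate: str
    slots: list = field(default_factory=list)


# Hypothetical caseframe templates, keyed by predicate name.
FRAMES = {
    "BENOETIG": [("SUBJ", "ORGANIZATIONAL_UNIT"),
                 ("DOBJ", "MATERIAL"),
                 ("PPOBJ(FUER)", "PURPOSE")],
}

# Hypothetical lexicon: each entry lists its meanings; a meaning points to a
# caseframe template and may carry modifiers applied at instantiation time.
LEXICON = {
    "benoetigen": [{"frame": "BENOETIG", "modifiers": []}],
}


def instantiate(lemma):
    """Return one fresh, unfilled caseframe per meaning of the lemma."""
    frames = []
    for meaning in LEXICON[lemma]:
        name = meaning["frame"]
        frame = Caseframe(name, [Slot(syn, sem) for syn, sem in FRAMES[name]])
        for modify in meaning["modifiers"]:
            modify(frame)  # each modifier adjusts the new instance in place
        frames.append(frame)
    return frames
```

Under these assumptions, instantiate(&quot;benoetigen&quot;) yields a single caseframe whose three slots carry their SYN and SEM restrictions and an empty VALUE. Every call builds new Slot objects, so filling one instance leaves the lexicon templates and other instances untouched, which matches the idea of partially filled caseframes living as separate edges in the chart.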
This chart consists only of those caseframes relevant to the CFM to construct the new caseframe. Other parts, like the morphological chart or already constructed caseframes outside the scope of the phrase actually considered, remain invisible to it. One or more of the caseframes in the chart passed to the CFM are marked as prospective heads, and the output of the CFM is a new caseframe (or more than one in case of ambiguity) spanning the whole chart with several slots filled. VALENCY slots may be filled if: - syntactic restrictions are met, - semantic restrictions are met, and - other restrictions stemming from the category of the head (e.g. adjacency) are met.</Paragraph> <Paragraph position="6"> The syntactic restrictions are met if the features of the SYN-slot and SYNTAX of the filler caseframe can be unified. The restrictions given are usually on category, case, preposition, etc. But they need not be given explicitly in all cases. One can make use of a number of structural cases like SUBJ (subject) and DOBJ (direct object). Transformations can apply to these cases under certain circumstances and e.g. transform DOBJ into SUBJ in case of passive. The realization of the structural case is evaluated at the time of slot filling, depending on the category of the head.</Paragraph> <Paragraph position="7"> Only if a restriction is stated explicitly is it taken as it stands. But structural cases like e.g. 
SUBJ get different interpretations: for an S (sentence) a nominative NP with number agreement with the head is sought, for an AP SUBJ has to be the head of the governing NP, agreeing in case, gender and number, and for an NP SUBJ is realized as a genitive NP or a PP with the preposition 'von'.</Paragraph> <Paragraph position="8"> This way great flexibility is gained and it is possible to reduce the lexicon and the meanings stored therein to the essentials. It is even possible to process nominalizations using the meaning of the corresponding verb.</Paragraph> <Paragraph position="9"> The semantic restrictions to be met are given by a hierarchy of predicates. SEM and the predicate of the filler caseframe must be compatible to allow slot filling. Similar considerations apply to the construction of modifiers: syntactic and semantic compatibility must be given.</Paragraph> </Section> <Section position="5" start_page="678" end_page="679" type="metho"> <SectionTitle> 5. Interaction </SectionTitle> <Paragraph position="0"> Generally speaking, the topological regularities of phrases are handled by the ATN, whereas free word order constituents are taken care of by the unification process. This unification process works on a local chart created by the ATN, comprising only those parts of the sentence relevant to it. Thus various island phenomena fall 
out from the conception of the parser. Flow of control between the ATN and the other components is organized in a way proposed by Boguraev /1979/. The ATN starts processing a sentence in the usual way. After recognizing a phrase boundary by reaching a POP arc, control is given either directly to the CFM or to the unification process. The process evoked serves as a test for the POP arc.</Paragraph> <Paragraph position="2"> In constituents (with strict word order) the CFM is invoked directly and tries to build up a caseframe (or more than one in case of ambiguity). The result is returned to the ATN which makes use of it during further processing. In structures with free constituent order (clauses) the ATN acts solely as a collector of constituents.</Paragraph> <Paragraph position="3"> Constituent caseframes are merely stored in a local chart and attachment is postponed. The only constituent recognized topologically is the head, which always comes in the last position. This chart of constituents is then given to the unification process when the POP arc is reached. 
In addition to relying heavily on the CFM, the unificator also has various strategies at its disposal in order to take into consideration restrictions on adjacency and category that depend on the category of the phrase being processed. This way possible syntactic ambiguity is reduced and almost no backtracking is needed inside the ATN. Generally, information passed to the CFM is collected while traversing the subnet: head caseframes are instantiated, arguments and modifiers are collected by pushing the appropriate subnets, and morphological and/or syntactic clues trigger various transformations on the caseframes. As an example we mention the passive transformation: if evidence for passive is gathered while analyzing the verbal complex (for S) or a participle (for APs), this information is passed on to the CFM. The CFM then applies the passive transformation to the relevant slots of the head caseframe before the slot filling takes place. These transformations are one way to take general syntactic information away from the lexicon (the caseframes) to reduce redundancy /Hayes et al. 1985/.</Paragraph> </Section> <Section position="6" start_page="679" end_page="679" type="metho"> <SectionTitle> 6. An Annotated Example </SectionTitle> <Paragraph position="0"> To demonstrate how the system works, we will conclude the paper by giving an annotated example of a parse. 
For the sake of clarity some of the details shall be simplified, but all of the essentials will be properly described.</Paragraph> <Paragraph position="1"> We have chosen the following example sentence: &quot;Welche von unseren Abteilungen in Wien bezieht fuer die Produktion benoetigte Stoffe von Firmen aus dem Ausland?&quot; (&quot;Which of our Viennese departments gets materials necessary for production purposes from abroad?&quot;) Please note that the free translation does not capture the grammatical subtleties involved in the original sentence; especially the adjective phrase &quot;fuer die Produktion benoetigte Stoffe&quot; includes a passivization that is usually not expressed this way in English.</Paragraph> <Paragraph position="3"> The words are first processed morphologically and a chart is returned, rendering a canonical form for each of the words together with word class and syntactic information (e.g. case markers). At this level, some ambiguities arise, e.g. that of &quot;welche&quot;, which might be an interrogative pronoun or a relative one, and &quot;die&quot;, which may be an article or a relative pronoun. (Fig. 2: Morphological chart of the example sentence.) There is a simple global control structure which works on this morphological chart. Its main task is to transfer control to ATN networks for phrase-like constituents and to the unificator for clause-like constituents. The control structure starts by transferring control to the PP/NP-ATN. The chart entry for &quot;Ausland&quot; is treated first (remember the right-to-left direction of processing), it is found to be a noun, and the next edge, DET, is processed. The third word, &quot;aus&quot;, finishes the PP/NP. Control is transferred to the caseframe matcher (CFM). 
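The right-to-left pass over the chart just described might be sketched as follows. This is a simplified illustration in Python, not the ATN of the actual system: the function scan_pp_np, the category labels N, DET and PREP, and the unambiguous (word, category) chart entries are assumptions made for the example, whereas the real chart may contain ambiguous entries.

```python
def scan_pp_np(entries):
    """Consume one PP/NP from the right end of entries.

    entries is a list of (word, category) pairs in surface order. The
    noun head is read first, then an optional determiner, then an
    optional preposition, mirroring the right-to-left direction of
    processing. Returns None (and consumes nothing) if the rightmost
    entry is not a noun.
    """
    if not entries or entries[-1][1] != "N":
        return None  # in this sketch a PP/NP must end in its noun head
    phrase = {"head": entries.pop()[0], "det": None, "prep": None}
    if entries and entries[-1][1] == "DET":
        phrase["det"] = entries.pop()[0]
    if entries and entries[-1][1] == "PREP":
        # the preposition closes the phrase; it is kept as a case marker
        phrase["prep"] = entries.pop()[0]
    return phrase


chart = [("von", "PREP"), ("Firmen", "N"),
         ("aus", "PREP"), ("dem", "DET"), ("Ausland", "N")]
first = scan_pp_np(chart)   # consumes "Ausland", "dem", "aus"
second = scan_pp_np(chart)  # consumes "Firmen", "von"
```

In this sketch each call corresponds to one traversal of the PP/NP net ending in a POP; the returned parts are what would be handed to the CFM for caseframe instantiation and unification. Attachment between the two phrases is deliberately not attempted here, in line with the strategy of collecting constituents first.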
The caseframe for the head, &quot;Ausland&quot;, becomes instantiated, and the features of the other components are unified with it, especially the feature of dative, which is derived from the determiner.</Paragraph> <Paragraph position="4"> After completion of this caseframe, control is transferred back to the PP/NP net which processes &quot;von Firmen&quot; in a similar way. The CFM is called again, constructing another caseframe. According to our strategy, PP attachment will not be performed at this step; instead all the constituents will be collected first. The PP/NP ATN gets its next chance. It treats the chart entry for &quot;Stoffe&quot;, which makes a perfectly suitable head for a more complex constituent. We start to anticipate this when the next word, &quot;benoetigte&quot; (&quot;necessary&quot;, albeit not an adjective but a PPP in German), is processed. In general, inflected PPPs trigger a PUSH AP, and so does this one. (Uninflected PPPs form part of the verb complex.) Next, a PUSH PP/NP is performed which will lead to a constituent embedded in the AP. But let us see this in detail. The PP is processed similarly to the others before, the head &quot;Produktion&quot; becoming instantiated and the caseframe filled after the entry for &quot;fuer&quot; has been processed. This finishes the AP, since the verb, &quot;bezieht&quot;, definitely cannot be part of an AP. As you may remember, APs trigger the unification component which in turn calls the CFM to handle the simpler tasks. 
Thus, the head of the AP, &quot;benoetigte&quot;, becomes instantiated. The associated caseframe is presented below:</Paragraph> </Section> <Section position="7" start_page="679" end_page="679" type="metho"> <SectionTitle> (BENOETIG (SYN SUBJ) (SEM ORGANIZATIONAL_UNIT) (SYN DOBJ) (SEM MATERIAL) (SYN PPOBJ (FUER)) (SEM PURPOSE)) </SectionTitle> <Paragraph position="0"> Before the caseframe is filled, a passive transformation is applied, due to the fact that the example sentence contained the verb &quot;benoetigen&quot; in its PPP form. This transformation simply changes SUBJ to of this transformation will not turn out in this step, but in the next one, when the PP/NP with the head &quot;Stoffe&quot; will have been finished. But let us stick to the correct order. The caseframe of &quot;benoetigen&quot; has been instantiated and transformed, and it is about to be filled. Normally, the unificator will now come into its own, having to decide on proper attachments. In this case, there is only one constituent left at this level, and it fits the PURPOSE slot, so it is placed there. The AP has now been finished, and POP PP/NP is the next edge to be taken. Let us take a little digression. Suppose the PP/NP &quot;fuer die Produktion&quot; had not fit into a slot of the PPP. If we had taken &quot;gefaerbte&quot; (&quot;dyed&quot;) instead of &quot;benoetigte&quot; this would do. In this case we would not get the reading &quot;materials dyed for production purposes&quot; but instead two components, &quot;dyed materials&quot; and &quot;for production purposes&quot;. The sophisticated reader could argue that the first reading might also be correct. The argument here is that the caseframes in our system are constructed in a way to fit the</Paragraph> </Section> <Section position="8" start_page="679" end_page="681" type="metho"> <SectionTitle> PP/NP I </SectionTitle> <Paragraph position="0"> 
</Paragraph> <Paragraph position="1"> needs of the domain modelled. In our domain, this reading would not be appropriate, so we did not provide a caseframe for it, thus excluding a theoretical ambiguity where in the practical application there is none. As the slot filling fails, the AP-ATN will</Paragraph> <Paragraph position="2"> backtrack. We get an AP consisting of just one single word (&quot;gefaerbte&quot;) filling a slot in &quot;Stoffe&quot;, making up one PP/NP, and another PP/NP, namely &quot;fuer die Produktion&quot;. These two PP/NPs will be collected at this stage of processing and only attached when all of the sentence has been parsed.</Paragraph> <Paragraph position="3"> We will stop our digression here and come back to the original example. Remember, the AP has just been finished and the PP/NP with the head &quot;Stoffe&quot; is POPped. This means a transfer of control to the CFM (in PP/NPs the CFM is called directly, whereas in an AP or S the unificator is called first in order to find correct attachments. Afterwards, the unificator in turn calls the CFM to realize the selected attachments). The AP is integrated into the PP/NP caseframe as a modifier predicate in the MOD slot. The SUBJ slot of the subordinated caseframe (the one of &quot;benoetigen&quot;) is still unfilled. 
For syntactic reasons, its filler must be the head of the superordinated PP/NP &quot;Stoffe&quot;. The semantic restriction of the SUBJ slot is MATERIAL, which is compatible with the noun &quot;Stoffe&quot;, so the slot may be filled (note that SUBJ is the transformed syntactic restriction which had been DOBJ before the passive transformation had taken place). Thus, a third constituent has been added to the pool of collected constituents. The global control structure continues by processing the next entry, the representation of the word &quot;bezieht&quot;, which is a finite verb and has to be at the second position according to German grammar. It is set aside for later processing and a special state is entered, knowing that exactly one constituent has been left over. The PP/NP &quot;in Wien&quot; is processed, and a corresponding caseframe is created.</Paragraph> <Paragraph position="4"> Similarly, a caseframe for &quot;welche von unseren Abteilungen&quot; is created and &quot;in Wien&quot; is attached to it when the unificator applies its knowledge that there cannot be more than one constituent in this position. This way, possible ambiguities, e.g. trying to fill &quot;in Wien&quot; into a slot at sentence level, are avoided. By this time we have finished our way from right to left through the morphological chart and have collected many components (PP/NPs and the predicate) at the sentence level. The global control structure passes control to the unificator which has to find the correct attachment and to perform the slot filling at the sentence level. Caseframe instantiation takes place, building a frame for the verb</Paragraph> </Section> <Section position="9" start_page="681" end_page="681" type="metho"> <SectionTitle> (SEM ORGANIZATIONAL_UNIT)) </SectionTitle> <Paragraph position="0"> Next, all possible attachments are sought. Two conditions have to hold for them: adjacency and semantic compatibility. PP/NP4, e.g., cannot be attached to any other constituent, because it is adjacent only to the main verb. Therefore, this constituent has to fill a slot in &quot;beziehen&quot;. For the remaining PP/NPs there exist different possibilities. Let us denote subordination by the hyphen character. From the adjacency point of view, the possibilities are:</Paragraph> <Paragraph position="2"> slot in the &quot;beziehen&quot; caseframe which matches the syntax of PP/NP1 (preposition &quot;aus&quot;), nor would there be semantic compatibility. 3 is the reading we prefer.</Paragraph> <Paragraph position="3"> As for 4, its acceptability depends on whether we allow a slot in the caseframe for &quot;Stoffe&quot; which could hold an ORGANIZATIONAL_UNIT. If we do, we will get an ambiguity. In that case, the system will offer both solutions, using a heuristic to decide which of the solutions to present first. The heuristic implemented prefers flat syntactic structures. As for the preferred reading, the CFM realizes it by filling PP/NP3 into the DOBJ slot and (PP/NP2 - PP/NP1) into the PPOBJ slot of the caseframe for &quot;beziehen&quot;. PP/NP4 has already been filled in the SUBJ slot, so the parse of the sentence has been completed.</Paragraph> </Section> </Paper>