File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/96/c96-2190_metho.xml
Size: 10,320 bytes
Last Modified: 2025-10-06 14:14:21
<?xml version="1.0" standalone="yes"?> <Paper uid="C96-2190"> <Title>Prepositional Phrase Attachment Through A Hybrid Disambiguation Model</Title> <Section position="4" start_page="0" end_page="1070" type="metho"> <SectionTitle> 2 Using Multiple Information in </SectionTitle> <Paragraph position="0"> Disambiguation Like other work, we use fonr head words to make decision on PP attachment: the main verb v, the head noun (nl) ahead of the preposition (p), and the head noun (n2) of the object of the preposition. In the later discussion, the four head words are referred to as a quadrul)le (v nl p n2).</Paragraph> <Paragraph position="1"> Analyzing the strategies human beings employ in PP attachment disambiguation, we f(mnd that a wide-variety of information supplies important clues for disambiguation. It includes presuppositions, syntactic and lexical cues, collocations, syntactic and semantic restrictions, features of head words, conceptual relationships, and world knowledge. We use clues that are general and reliable 1PFTE stands for Parser for Free Text of English.</Paragraph> <Paragraph position="2"> PFTE system is a versatile parsing system in development which (:overs a wide range of phenomena in lexical, syntactic, semantic dimensions. It is designed as a linguistic tool for at)plications in text understanding, database generation fi'om text and computer-based language learning.</Paragraph> <Paragraph position="3"> so that they make the computation efficient and extensible. The information or clues we use are the following: 1. Syntactic or lexical cues. If nl is same as n2, for exalnple, often nlTPP is a fixed t)hr;~se su(:h as .step by step.</Paragraph> <Paragraph position="4"> 2. Co-oee'wr'rences. The (;o-o(:(:llrrences of triples and pairs in (v nl p n2) colne frmn annotated eorl)ora (Se(:tion 4).</Paragraph> <Paragraph position="5"> 3. Syntactic and semantic features. Features of v or nl n2 sometimes in(licate the &quot;corre(:t&quot; attachment. For examt)le,if v is a movement, p is to and n2 is a t)lace or direction, the PP teuds to be attached to the verb.</Paragraph> <Paragraph position="6"> 4. Conceptual relationships 1)etween v and n2, or between nl and n2. These relationships, which reflect the role-expections of the pre1)osition, sut)l)ly important chics for disambiguation. For example, in the sentence Peter broke the window by a ,stone, we are sure that the PP by a stone is att~u'hed to broke/v by knowing that stone~n2 is an instrument for broke/v.</Paragraph> <Paragraph position="7"> V~fe use co-occurrence informatioi~ in corl)ust)ased (lis;mfl)iguation and other information in rule-b,~sed disambiguation. Later, we will discuss how to ac(tuire above information and use it in disambiguation.</Paragraph> </Section> <Section position="5" start_page="1070" end_page="1070" type="metho"> <SectionTitle> 3 Estimation based on Corpora </SectionTitle> <Paragraph position="0"> In this section, we consider two kinds of PP attachment in our corlms-t)ased al)l)roaeh , nalnely, attachment to verb phrase (VP atta('lmmnt) and to nmm i)hrase (NP attachment). Here, we use two ammtated corpora: EDR English Corpus 2 and Susanne Corpus a to SUpl)ly training data.</Paragraph> <Paragraph position="1"> Both of theln (-(retain tagged syntactic structure for each sentence in thein. That is, each PP in the corl)ora has 1)een attached to an unique l)hrase.</Paragraph> <Paragraph position="2"> RA(v,nl,p,n2), a score fi'om 0 to 1, ix defined as a value of counts of VP attachments divided by the total of occurrences of (v,nl,1),n2) in the training data. 4</Paragraph> <Paragraph position="4"> In (1), the symbol f denotes frequency of a parti('ular tuple in the training data. For exami)le, is an amtotated corpus coml)risiltg about 130,000 words of written American English text.</Paragraph> <Paragraph position="5"> 'lWe assulue that only two kinds of PP atta(:hmerits: VP or NP attachment in the training data. f(vl) I share,apartment,with,friend) is the numl)er -of~.imes the quadruple (share, apartlnent, with, friend) is seelt with a VP attachment. Thus, we could choose a attaefiment actor(ling to RA score= if RA>0.5 choose VP attachment, otherwise choose NP attachment.</Paragraph> <Paragraph position="6"> Most of quadruples in test data are not in the training data, however. We thus turn to collect triples of (vd),nl),(nl,p,n2),(v,nl,l)) and 1)airs of (v,t)),(nl,p),(l),n2) like Collins and Brooks (1995) did, and coinpute RA score by (2) and (3).</Paragraph> <Paragraph position="8"> To avoi(l using very low frequen(:ies, we set two thr('sholds for each one above. For triplecombimttion, the c(mdition is:</Paragraph> <Paragraph position="10"> With the first threshohl in ca,oh case, we can avoid using low frequency tul)les; with the second one in each case, we throw away the RA score which is close to 0.5 ~Ls tlfis wdue is rather unstabh,. null</Paragraph> </Section> <Section position="6" start_page="1070" end_page="1071" type="metho"> <SectionTitle> 4 Conceptual Information and </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="1070" end_page="1071" type="sub_section"> <SectionTitle> Preference Rules </SectionTitle> <Paragraph position="0"> As we use only &quot;relial)le&quot; data from corl)ora to make decision on PP atta('hlnellt based (m RA score, many PPs' attachlnents may be left undetermined due to sparse. (l;tta. We deal these undete.rlnined PPs with a rule-based approach. Here we use preference rules to determine PP attachments 1)y judging features of head words and conceptual relationships among them. Tl,is information comes from a machine-readable dictionary</Paragraph> </Section> <Section position="2" start_page="1071" end_page="1071" type="sub_section"> <SectionTitle> 4.1 Features and Concept Classes </SectionTitle> <Paragraph position="0"> We cluster words (verbs or nouns) ~hi~h have s~une feature or syntactical function into a (:()ncel)t class. For examI)le, we classify verbs into active and passive, and ontologicM cbusses of mental, movement, etc. Similarly, we group nouns into place, time, state, direction, etc.</Paragraph> <Paragraph position="1"> We extract eoncel)t (:lass from concept classification in EDR Concept Dictionary} ~</Paragraph> </Section> <Section position="3" start_page="1071" end_page="1071" type="sub_section"> <SectionTitle> 4.2 Conceptual Relationship </SectionTitle> <Paragraph position="0"> Conceptual relationships between v and n2, or between nl and n2 predict PP attaehnlent quite well in many eases. We use EDR concept dictionary to acquire the concel)tual relationship between two concet)ts. For examt)le, given the two concet)ts of open and key, the dictionary will tell us that there may be a implement relationship 1)etween them, means that key may be act its an instrument for the action open.</Paragraph> </Section> <Section position="4" start_page="1071" end_page="1071" type="sub_section"> <SectionTitle> 4.3 Preference Rules </SectionTitle> <Paragraph position="0"> We introduce 1)reference rules to encode syntactic and lexical clues, as well a~s clues from conceptual information to determine PP attachments. We divide these rules into two categories: a rule whi('tl (:nit be applied to most of 1)rel)ositions is cMled global rule; a rule tying to a particular prel)osition, on the other hand, is called local rule. Four global rules used in our disambiguatioi: module are listed in Table 1.</Paragraph> <Paragraph position="1"> 1. lexical(passivized(v) + PP) AND prep C/ 'by' - > vp_attach(PP) 2. nl : n2 - > vi)_attaeh(nl + PP) 3. (prep # 'of' AND prep # 'for') AND (time(n2) OIl date(n2)) - > Vl)_attaeh(PP) 4. lexicM(Adjeetive + PP) - > adjp_attach(PP) Local rules use (:oncel)tual inforlnation to determine PP attachlnent. In Table 2, we show sample h)cal rules for preposition with.</Paragraph> <Paragraph position="2"> with-rules:</Paragraph> <Paragraph position="4"> On the left hand of each rule, a one-atonl pre(t-Concet)t Dictionm'y consists of al)out 400,000 con: cepts, where, fbr eolleet)t classification, related coneepts are orgmfized in hierm'chieM ar('hitecture and a concept in h)wer level inherits the f~atures from its upper level concepts.</Paragraph> <Paragraph position="5"> icate Oil the left hand presents tt subclass of concept ill the eon(:ept hierarchy (e.g. tilne(n2)), and a two-atom 1)redicate describes the COlWei)t relation between two at(nns (e.g. implennult(v,n2)).</Paragraph> <Paragraph position="6"> Since local rules emph)y the senses of head words (termed as concepts), we shouhl 1)roject each of v, ul and n2 used by rules into one or several coi|cepts which denote(s) &quot;correct&quot; word senses before apl)lying local rules. The process is described in (Wu and Furugori 1995).</Paragraph> </Section> </Section> <Section position="7" start_page="1071" end_page="1071" type="metho"> <SectionTitle> 5 Disambiguation Module </SectionTitle> <Paragraph position="0"> For each sentence with aml)igu<)us PP (both ill syntaeti(' and semantie;d level), PETE system will produ<'e ;t structure with unattached PP(s), and call the disambiguation 1nodule to resolve ambiguous PP(s). The algorithm used in the nn)(hfle is if RA(v,nl,1),n2)<0.5, then choose NP attachment otherwis(, choose VP attachment exit. } Phase 3. (concept-based disalnl)iguation): 1) Project each of v, nl, n2 into its COIteel)t sets. 2) Try the rules related to the prel)osition , if only one rule is applicable, use it to decide the attachment, and then exit.</Paragraph> <Paragraph position="1"> Phase 4. (attachment 1)y default): if f(p) > 0, then { if ~ < 0.5, then choose NP attachment; f(~,) otherwise choose VP attachment} otherwise choose NP attachment.</Paragraph> <Paragraph position="2"> This algorithm differs from the previous one de.scribed ill (Wu and Furugori 1995) in which preference rules were applied 1)efol'e statistical computing. We have changed the order for the following reasons: an experinlent has proven that using the data of qua(lrul)les and triples, as well as tut)les with high occurrences i,s good enough in success rate (Sec Tal)lc 3). and statistic models ha,ve a ground m~themlttical 1)asis.</Paragraph> </Section> class="xml-element"></Paper>