<?xml version="1.0" standalone="yes"?> <Paper uid="C92-2088"> <Title>Lexical Knowledge Acquisition from Bilingual Corpora</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2. Semantic disambiguation </SectionTitle> <Paragraph position="0"> The verb &quot;かける&quot; in the Japanese sentence is a typical polysemous Japanese verb. It has six subentries in a Japanese dictionary of about 70,000 entries, and ten English equivalent verbs (&quot;hang&quot;, &quot;spend&quot;, &quot;play&quot;, etc.) in a Japanese-English dictionary of about 50,000 entries.</Paragraph> <Paragraph position="1"> So it is not easy to associate the surface word &quot;かける&quot; with its exact meaning. However, given the translation example, the corresponding English verb, such as &quot;hang&quot;, helps to identify the meaning of the Japanese verb &quot;かける&quot;.</Paragraph> <Paragraph position="2"> In this paper, we propose a method for resolving the syntactic ambiguities of translation examples in bilingual corpora and a method for acquiring lexical knowledge, such as case frames of verbs and attribute sets of nouns. In our framework, a pair of sentences in the two languages is first syntactically analyzed¹ and translated into feature descriptions, which represent dependency structures of the phrases in the sentences. Although feature descriptions are generated from grammatical knowledge only, they are well suited to representing case frames of verbs. The feature descriptions of the two languages are then compared, or unified, using knowledge about word equivalence from bilingual dictionaries. In this matching process, one word in the English sentence could be equivalent to several words in the translated Japanese (¹ The Japanese morphological analyzer has 14 parts of speech and about 36,000 words. The English dictionary contains about 55,000 words.) 
(The current Japanese and English grammars consist of 85 and 135 DCG rules, respectively.)</Paragraph> <Paragraph position="3"> sentence. Likewise, one word in the Japanese sentence could be equivalent to several words in the translated English sentence. To handle such one-to-many word equivalences in the matching process between the two languages, we introduce, in Chapter 2, a unification algorithm based on sets of compatible pairs of atomic values and feature labels.</Paragraph> <Paragraph position="4"> In Chapter 3, we statistically evaluate the process of syntactic disambiguation. The success ratio of disambiguation is about 63 to 68% for translation examples in a Japanese-English dictionary. At present, we have collected about 50,000 translation examples from a machine-readable Japanese-English dictionary (Kodansha Japanese-English Dictionary [10]) and an English learners' textbook. We have extracted case frames for several verbs as a simple experiment.</Paragraph> <Paragraph position="5"> The results are described in Chapter 4.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Unification of Feature Descriptions of Two Languages </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.1 Unification based on Sets of Compatible Pairs of Features and Values </SectionTitle> <Paragraph position="0"> In our framework of sentence analysis, a sentence in each language is parsed and translated into feature descriptions, which represent dependency structures of the phrases in the sentence. 
In this section, we use and extend Kasper and Rounds' notation of feature description logic (FDL [6]) to describe our unification algorithm for feature descriptions, except that we do not use path equivalence.</Paragraph> <Paragraph position="1"> When unifying feature descriptions of two languages, knowledge about word equivalence taken from bilingual dictionaries is used to decide whether an atomic value of one language is compatible with an atomic value of the other language. The same holds for feature labels. Knowledge about word equivalence from bilingual dictionaries can thus be regarded as knowledge about the compatibility of atomic values and feature labels of feature descriptions.</Paragraph> <Paragraph position="2"> From this standpoint, we introduce a unification algorithm based on sets of compatible pairs of atomic values and feature labels.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> Data Structure </SectionTitle> <Paragraph position="0"> Let $A$ and $L$ be sets of symbols used to denote atomic values and feature labels, respectively. Let $C_A$ and $C_L$ be sets of compatible pairs of atomic values and of feature labels, respectively.</Paragraph> <Paragraph position="1"> That is, $C_A$ is the set of pairs of atomic values $(a_i, a_j)$ ($a_i, a_j \in A$), where $a_i$ and $a_j$ are consistent and unifiable, and $C_L$ is the set of pairs of feature labels $(l_i, l_j)$ ($l_i, l_j \in L$), where $l_i$ and $l_j$ are consistent ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992 582 PROC. OF COLING-92, NANTES, AUG. 
23-28, 1992 and unifiable²,³.</Paragraph> <Paragraph position="2"> The syntax for formulas of the FDL with Sets of Compatible Pairs (FDLC) is given below.</Paragraph> <Paragraph position="3">
NIL, denoting no information;
TOP, denoting inconsistent information;
$a$ where $a \in A$, to describe atomic values;
$(a_i, a_j)$ where $a_i, a_j \in A$ and $(a_i, a_j) \in C_A$, to describe pairs of atomic values;
$l : \phi$ where $l \in L$ and $\phi \in$ FDLC, to describe structures in which the feature labeled by $l$ has a value described by $\phi$;
$(l_i, l_j) : \phi$ where $l_i, l_j \in L$, $(l_i, l_j) \in C_L$ and $\phi \in$ FDLC, to describe structures in which the feature labeled by $(l_i, l_j)$ has a value described by $\phi$;
$\phi \wedge \psi$ where $\phi, \psi \in$ FDLC.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> Unification Algorithm </SectionTitle> <Paragraph position="0"> Because of the compatibility sets, there is not necessarily a unique most general unifier of two feature descriptions. When applying this algorithm to unify feature descriptions of two languages, we collect all possible unified feature descriptions and find the most overlapping unifier by means of a scoring function, which is introduced later. The following definition of UNIFY returns one possible unified feature description; all such descriptions are collected.
Function UNIFY($f$, $g$) returns one possible unified feature description, where $f$ and $g$ are feature descriptions.</Paragraph> <Paragraph position="1">
1. If $f$ = NIL, then return $g$
2. Else if $g$ = NIL, then return $f$
3. Else if $f$ = TOP or $g$ = TOP, then return TOP
4. Else if $f, g \in A \cup C_A$ and $f = g$, then return $f$ ($= g$)
5. Else if $f, g \in A$, if $(f, g) \in C_A$, then return $(f, g)$, else return TOP end.</Paragraph> <Paragraph position="2"> 6. 
Else if $f = l : a_f$ and $g = l : a_g$, and $l \in L \cup C_L$, if ($a_{fg}$ :=) UNIFY($a_f$, $a_g$), then return $l : a_{fg}$, else return TOP end.</Paragraph> <Paragraph position="3">
7. Else if $f = l_f : a_f$ and $g = l_g : a_g$, and $(l_f, l_g) \in C_L$ and ($a_{fg}$ :=) UNIFY($a_f$, $a_g$), then return $(l_f, l_g) : a_{fg}$
8. Else if $f = f_1 \wedge f_2$ and ($\langle h, f_r, g_r \rangle$ :=) UNIFY-CONJ($f$, $g$) and ($h_r$ :=) UNIFY($f_r$, $g_r$), then return $h \wedge h_r$
9. Else if $g = g_1 \wedge g_2$, then return UNIFY($g$, $f$)
10. Else return $f \wedge g$ end.
² These compatibility sets do not necessarily define equivalence relations on atomic values and feature labels, i.e., they do not satisfy the transitive and symmetric laws. They are reflexive, and $(a, a)$ and $(l, l)$ are identified with $a$ and $l$.
³ In fact, in the case of the unification of feature descriptions of two languages, $a_i$ of $(a_i, a_j)$ ($\in C_A$) is an atomic value of one language and $a_j$ is an atomic value of the other language. This is also the case with $l_i$ and $l_j$ of $(l_i, l_j)$ ($\in C_L$).</Paragraph> <Paragraph position="4"> Function UNIFY-CONJ($f$, $g$) returns one possible 3-tuple of feature descriptions $\langle h, f_r, g_r \rangle$, where $f$ and $g$ are feature descriptions, $h$ is a unified feature description, and $f_r$, $g_r$ are the remaining parts of $f$, $g$ that are not used to generate $h$.</Paragraph> <Paragraph position="5">
1. If $f = f_1 \wedge f_2$, ($\langle h, f_r, g_r \rangle$ :=) UNIFY-CONJ($f_1$, $g$) and return $\langle h, f_r \wedge f_2, g_r \rangle$, or ($\langle h, f_r, g_r \rangle$ :=) UNIFY-CONJ($f_2$, $g$) and return $\langle h, f_1 \wedge f_r, g_r \rangle$
2. Else if $g = g_1 \wedge g_2$ and ($\langle h, g_r, f_r \rangle$ :=) UNIFY-CONJ($g$, $f$), then return $\langle h, f_r, g_r \rangle$
3. Else ($h$ :=) UNIFY($f$, $g$) and return $\langle h$, NIL, NIL$\rangle$ end.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.2 Unification of Feature Descriptions </SectionTitle> <Paragraph position="0"/> <Paragraph position="2"> Set of Compatible Pairs of Atomic Values. Knowledge about word equivalence is extracted from bilingual dictionaries in order to construct $C_A$. 
First, for each word in the English sentence, equivalent Japanese words are extracted from English-Japanese dictionaries, and for each word in the Japanese sentence, equivalent English words are extracted from Japanese-English dictionaries⁴. Using this knowledge, all possible pairs of equivalent content words⁵ that occur in the original sentences are collected, and $C_{AD}$, the set of these equivalent (i.e.,</Paragraph> <Paragraph position="3"> compatible) word pairs, is constructed. Then, for all other content words $W_{NDEng}$ in the English sentence and $W_{NDJap}$ in the Japanese sentence, all possible pairs $(W_{NDEng}, W_{NDJap})$ are collected; these comprise $C_{AND}$. Finally, $C_A$ is defined as $C_{AD} \cup C_{AND}$.</Paragraph> <Paragraph position="4"> In the case of Example 2, $C_{AD}$, $C_{AND}$ and $C_A$ are shown below. $C_{AD}$ and $C_{AND}$ are constructed only for the content words, so in this case $C_{AND}$ is $\emptyset$ (the empty set).</Paragraph> <Paragraph position="6"> Set of Compatible Pairs of Feature Labels. In our framework of unification between two languages, we assume that the set of compatible pairs of feature labels, $C_L$, is constructed from statistical data. That is, each feature label pair $(l_i, l_j)$ in $C_L$ has a probability $p_{ij}$ ($p_{ij} \in (0, 1]$) calculated from statistical data. This $p_{ij}$ represents the probability that the semantic role of feature $l_i$ in a specific feature description of one language is the same as that of feature $l_j$ in a specific feature description of the other language. 
For example, for a specific English-Japanese verb pair (write, 書く), the feature label pair (subj, が) is assumed to have a probability $p_{subj,が}$, and for another English-Japanese verb pair (read, 読む), the feature label pair (subj, が) is assumed to have another probability $q_{subj,が}$.</Paragraph> <Paragraph position="7"> Since we are at the starting point of our project of lexical knowledge acquisition, we initially assign 1 to the probability of each feature label pair, except</Paragraph> <Paragraph position="8"> for pairs that are known, from grammatical knowledge, not to have the same case role. These exceptional pairs are not contained in $C_L$, i.e., their probabilities are 0. In fact, for the purpose of lexical knowledge acquisition, it is sufficient to take the probability to be 1 or 0, because we need credible results for extracting lexical knowledge about the usages of words. (⁵ ... head of a phrase, such as nouns and verbs.)</Paragraph> <Paragraph position="9"> The Most Overlapping Unifier. The scoring function SCORE($h$) measures the validity of a unified feature description $h$. It returns a 2-tuple of real numbers⁶, $(x_1, x_2)$ ($x_1, x_2 \in R$, the set of real numbers), where $x_1$ is the number of word pairs that are contained in the unified feature description and were extracted from bilingual dictionaries, and $x_2$ is the number of word pairs that are contained in the unified feature description but were not extracted from bilingual dictionaries. More precisely, $x_1$ is the number of word pairs $(W_{DEng}, W_{DJap})$ in the unified feature description that are elements of $C_{AD}$, and $x_2$ is the number of word pairs $(W_{NDEng}, W_{NDJap})$ in the unified feature description that are elements of $C_{AND}$.</Paragraph> <Paragraph position="10"> The order among scores is defined as follows: $(x_1, x_2)$ is greater than $(y_1, y_2)$ iff $x_1 > y_1$, or $x_1 = y_1$ and $x_2 > y_2$. The most overlapping unifiers are the ones with the greatest score. 
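The ordering just defined is lexicographic, which coincides with the way Python compares tuples. The following is a minimal sketch only, assuming that every feature label pair has probability 1 (so the score reduces to two counts) and that a unified feature description is represented simply by the list of word pairs it contains. The names score and most_overlapping, the flat-list representation, and the romanized word pairs are illustrative assumptions, not part of the original algorithm.

```python
from typing import FrozenSet, Iterable, List, Tuple

WordPair = Tuple[str, str]  # (English word, Japanese word)
Score = Tuple[int, int]     # (x1, x2)


def score(pairs: Iterable[WordPair],
          c_ad: FrozenSet[WordPair],
          c_and: FrozenSet[WordPair]) -> Score:
    """Count the word pairs found in C_AD (x1) and in C_AND (x2).

    Assumption: all feature label pairs have probability 1, so both
    components are plain integer counts.
    """
    pairs = list(pairs)
    x1 = sum(1 for p in pairs if p in c_ad)
    x2 = sum(1 for p in pairs if p in c_and)
    return (x1, x2)


def most_overlapping(candidates: List[List[WordPair]],
                     c_ad: FrozenSet[WordPair],
                     c_and: FrozenSet[WordPair]) -> List[List[WordPair]]:
    """Return every candidate whose score is the greatest."""
    if not candidates:
        return []
    scored = [(score(c, c_ad, c_and), c) for c in candidates]
    # Python tuples compare lexicographically: x1 first, then x2.
    best = max(s for s, _ in scored)
    return [c for s, c in scored if s == best]


# Hypothetical compatibility sets and candidate unifiers (illustration only).
C_AD = frozenset({("write", "kaku"), ("letter", "tegami"), ("pencil", "enpitsu")})
C_AND = frozenset({("desk", "heya")})

cand_a = [("write", "kaku"), ("letter", "tegami")]
cand_b = [("write", "kaku"), ("letter", "tegami"), ("pencil", "enpitsu")]
cand_c = [("write", "kaku"), ("desk", "heya")]

best_candidates = most_overlapping([cand_a, cand_b, cand_c], C_AD, C_AND)
```

Because ties on the first component are broken by the second, a candidate that contains more dictionary-attested pairs always wins, whatever the number of unattested pairs.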
The complete definition of the scoring function is given below.</Paragraph> <Paragraph position="11"> Function SCORE($h$) returns $(x_1, x_2)$ ($x_1, x_2 \in R$, the set of real numbers), where $h$ is a unified feature description.</Paragraph> <Paragraph position="12">
1. If $h \in C_{AD}$, then return $(1, 0)$
2. Else if $h \in C_{AND}$, then return $(0, 1)$
3. Else if $h = l : a$ where $l \in L \cup C_L$ and $a \in A \cup C_A$ and SCORE($a$) = $(x_1, x_2)$, then return (SCOREL($l$) $\times x_1$, SCOREL($l$) $\times x_2$)
4. Else if $h = h_1 \wedge h_2$ where $h_1, h_2 \in$ FDLC and SCORE($h_1$) = $(x_{11}, x_{12})$ and SCORE($h_2$) = $(x_{21}, x_{22})$, then return $(x_{11} + x_{21}, x_{12} + x_{22})$
5. Else return $(0, 0)$ end.</Paragraph> <Paragraph position="13"> Function SCOREL($l$) returns the probability of $l$, where $l \in L \cup C_L$.
1. If $l \in L$, then return 1
2. If $l \in C_L$, then return the probability of $l$
⁶ Since the probability of a feature label pair is 1 or 0, $x_1$ and $x_2$ are integers at present.</Paragraph> <Paragraph position="14"/> </Section> <Section position="5" start_page="0" end_page="0" type="sub_section"> <SectionTitle> Example </SectionTitle> <Paragraph position="0"> The results of unification and scoring for Example 2 are shown below.</Paragraph> <Paragraph position="2"> The prepositional phrase &quot;with a pencil&quot; modifies the verb &quot;wrote&quot; in the upper feature description.</Paragraph> <Paragraph position="3"> The score of the upper feature description is greater than that of the lower one. 
So in this case, the upper one is regarded as the correct case frame example for the pair (write, 書く).</Paragraph> </Section> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 Syntactic Disambiguation: </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> Experiment and Evaluation </SectionTitle> <Paragraph position="0"> In order to evaluate how well the syntactic ambiguities of translation examples are resolved, we conducted an experiment in syntactic disambiguation using 189 translation examples extracted from a Japanese-English dictionary. First, each sentence of a translation example is syntactically analyzed and translated into feature descriptions. For 44 translation examples, syntactic analysis of the Japanese or the English sentence failed. For those that were successfully analyzed, the average number of feature descriptions generated from one sentence is 4.4 for Japanese and 17.1 for English. Second, these feature descriptions are unified.</Paragraph> <Paragraph position="1"> After this process of syntactic disambiguation, a unique case frame of the unified Japanese-English verb pair is acquired from 86 translation examples.</Paragraph> <Paragraph position="2"> From this result, the success ratio of acquiring unified case frames of verbs, (the number of translation examples from which a unique unified case frame of verbs is acquired) / (the number of translation examples in which each sentence is successfully analyzed), is 86/145 = 59.3%. 
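The denominator 145 is the 189 translation examples minus the 44 for which syntactic analysis failed. A small check of this arithmetic follows; the variable and function names are chosen here purely for illustration.

```python
TOTAL_EXAMPLES = 189     # translation examples used in the experiment
FAILED_ANALYSIS = 44     # examples where analysis of one sentence failed
UNIQUE_CASE_FRAME = 86   # examples yielding a unique unified case frame

# Examples in which both sentences were successfully analyzed.
analyzed = TOTAL_EXAMPLES - FAILED_ANALYSIS


def ratio_percent(numerator: int, denominator: int) -> float:
    """Return numerator / denominator as a percentage, rounded to 0.1."""
    return round(100.0 * numerator / denominator, 1)


acquisition_ratio = ratio_percent(UNIQUE_CASE_FRAME, analyzed)
print(analyzed, acquisition_ratio)  # prints: 145 59.3
```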
The success ratio of syntactic disambiguation, (the number of sentences for which a unique case frame of the verb is acquired from more than one feature description) / (the number of sentences for which more than one feature description is originally generated), is 70/103 = 68.0% for Japanese and 84/133 = 63.2% for English.</Paragraph> </Section> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 Lexical Knowledge Acquisition of Verbs </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.1 Acquiring Case Frames of Verbs </SectionTitle> <Paragraph position="0"> As described in 2.2, a feature description unified between English and Japanese looks as follows.</Paragraph> <Paragraph position="1">
pred : (write, 書く)
tense : past
(subj, が) : [ pred : (I, 私) ]
(obj, を) : [ pred : (letter, 手紙) ]
(with, で) : [ pred : (pencil, 鉛筆), spec : a ]
This feature description tells us that the verbal concept represented by the pair of the English verb &quot;write&quot; and the Japanese verb &quot;書く&quot; has at least three cases, which are marked by syntactic information and surface function words such as (subj, が), (obj, を) and (with, で). It also tells us that each case takes a certain nominal concept represented by a pair of English and Japanese words, such as (I, 私), (letter, 手紙) and (pencil, 鉛筆). Once a large amount of this kind of data has been collected, statistical data about case frames of verbs can be extracted, making use of a thesaurus of nominal concepts⁷. In the remainder of this section, we illustrate a general procedure for acquiring case frames of verbs. Let us start with a large collection of unified feature descriptions like the one above for a specific Japanese verb $V_J$. Suppose that we want to obtain the possible case frames of this verb. 
By a case frame, we mean something like a feature description for this verb, consisting of surface cases, each of which is marked by a postpositional particle $p_J$ and some specific semantic categories taken from a thesaurus such as BGH. Usually, a verb has several distinct case frames. However, it is not easy to extract those case frames automatically from the collected unified feature descriptions alone.</Paragraph> <Paragraph position="2"> So the system uses heuristics to find critical points that distinguish possible case frames for a verb, and then asks the human instructor whether the distinctions between case frames are correct. These heuristics and human interactions are summarized as follows.</Paragraph> <Paragraph position="3"> ⁷ ... layered abstraction hierarchy, and more than 60,000 words are assigned at the leaves. At the present stage, it is not certain whether this thesaurus is reliable enough for our initial research target of acquiring case frames of verbs. It is, however, the most precise and broad-coverage Japanese thesaurus currently obtainable for us.</Paragraph> <Paragraph position="4"> First, collect the nouns marked by $p_J$ in a feature description of the verb $V_J$ from the set of unified feature descriptions. Then mark each collected noun in the thesaurus. If the most specific common layer of the marked nouns is low enough, we assume that the case marked by $p_J$ takes a noun of the semantic category that corresponds to that layer. But if the most specific common layer is higher than a predetermined layer⁸, the information provided by that layer is too general to serve as the semantic category of the case marked by $p_J$. For instance, it is quite rare that both an animate concept and an abstract concept can be the subject of a certain verb. 
Such a case strongly suggests that the verb has at least two distinct conceptual meanings or two distinct case frames.</Paragraph> <Paragraph position="5"> It then becomes necessary to classify the marked nouns in the thesaurus.</Paragraph> </Section> </Section> <Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> 2. Bilingual Intersection of Concepts </SectionTitle> <Paragraph position="0"> Some of the heuristics come from the advantages of the bilingual intersection of concepts, which we have already shown in Chapter 1 as semantic disambiguation. For a Japanese verb $V_J$ and its case marked by a postpositional particle $p_J$, suppose that unified feature descriptions such as [ pred : ($V_{E1}$, $V_J$), ($l_{E1}$, $p_J$) : ($N_{E1}$, $N_{J1}$) ] and [ pred : ($V_{E2}$, $V_J$), ($l_{E2}$, $p_J$) : ($N_{E2}$, $N_{J2}$) ] are obtained. Both of these feature descriptions have a feature label $p_J$ for $V_J$. However, if $V_{E1}$ and $V_{E2}$ are different verbs, or $l_{E1}$ and $l_{E2}$ are different feature labels, the two feature descriptions may be classified into different case frames of the verb $V_J$.</Paragraph> </Section> </Paper>