<?xml version="1.0" standalone="yes"?> <Paper uid="C94-1006"> <Title>Two Methods for Learning ALT-J/E Translation Rules from Examples and a Semantic Hierarchy Hussein Almuallim Info. and Computer Science Dept. King Fahd University of Petroleum and Minerals Dhahran 31261, Saudi Arabia</Title> <Section position="2" start_page="0" end_page="0" type="abstr"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> A critical issue in AI research is to overcome the knowledge acquisition bottleneck in knowledge-based systems. As a knowledge base is expanded, adding more knowledge and fixing previous erroneous knowledge become increasingly costly. Moreover, maintaining the integrity of large knowledge bases has proven to be a very challenging task.</Paragraph> <Paragraph position="1"> A widely proposed approach to deal with the knowledge acquisition bottleneck is to employ some learning mechanism to extract the desired knowledge automatically or semi-automatically from actual cases or examples [Buchanan &amp; Wilkins 1993].</Paragraph> <Paragraph position="2"> The validity of this approach is becoming more evident as various machine-learning-based knowledge acquisition tools for real-world domains are being reported [Kim &amp; Moldovan 1993, Porter et al. 1990, Sato 1991a, Sato 1991b, Utsuro et al. 1992, Wilkins 1990].</Paragraph> <Paragraph position="3"> ALT-J/E, which is an experimental Japanese-English translation system developed at Nippon Telegraph and Telephone Corporation (NTT), is one example of a large knowledge-based system in which solutions to the knowledge acquisition bottleneck are definitely needed. One major component of this system is its huge collection of translation rules. Each of these rules associates a Japanese sentence pattern with an appropriate English pattern. 
To translate a Japanese sentence into English, ALT-J/E looks for the rule whose Japanese pattern matches the sentence best, and then uses the English pattern of that rule for translation.</Paragraph> <Paragraph position="4"> So far, ALT-J/E translation rules have been composed manually by extensively trained human experts. To qualify for this job, an expert must not only master both English and Japanese but also be very familiar with various components of the system. Each time the rules are expanded or altered, the new set of rules must then be &quot;debugged&quot; using a collection of test cases. Usually, several iterations are needed to arrive at translation rules of acceptable quality.</Paragraph> <Paragraph position="5"> Creating new translation rules as well as refining existing ones has proven to be extremely difficult and time-consuming because these tasks require considering a huge space of possible combinations (rules in ALT-J/E are expressed in terms of as many as 3000 &quot;semantic categories&quot;). The high costs involved make the manual creation of ALT-J/E's translation rules impractical. Indeed, in spite of the vast amount of resources spent on building the current rules of ALT-J/E, faults in these rules are still detected from time to time, making system maintenance a continuous requirement.</Paragraph> <Paragraph position="6"> The aim of this work is to make ALT-J/E's translation rules less costly and more reliable through the use of inductive machine learning techniques. Careful examination of the manual process which has been followed so far by ALT-J/E's experts for building translation rules reveals that most of the effort is spent on figuring out the condition part of the rules (that is, the Japanese patterns). 
Therefore, we propose the use of inductive machine learning algorithms to learn these conditions from examples of Japanese sentences and their English translations. Under this machine learning approach, the user is relieved from exploring the huge space of alternatives s/he has to consider when constructing translation rules manually from scratch, a job which only extensively trained experts can perform. The task is now turned into a search for some reasonable rules that explain the given training examples, where the search is handled automatically by a learning algorithm. This not only saves the user's time but also makes it unnecessary for the user to be an expert in the ALT-J/E system. Moreover, this approach significantly reduces the &quot;subjectivity&quot; of the rules since the intervention of human experts is minimized. This is particularly important because the immense number of translation rules (currently over 10,000) requires employing a team of experts over an extended period of time.</Paragraph> <Paragraph position="7"> Two learning methods are investigated in this paper. Experiments show that the rules learned by these methods are very close to the rules manually composed by human experts. In most cases, given a reasonable number of training examples, the employed methods are able to find rules that are more than 90% accurate when compared to the manually composed rules.</Paragraph> <Paragraph position="8"> The rest of this document is organized as follows.</Paragraph> <Paragraph position="9"> We begin in Section 2 with a brief overview of the ALT-J/E Japanese-English translation system. In Section 3, we discuss some of the problems that arise when the translation rules of ALT-J/E are composed manually by human experts. Then, we propose in Section 4 an alternative approach based on machine learning techniques. 
In Section 5, we describe the inductive learning methods used, followed by an experimental evaluation of these methods in Section 6. Finally, concluding remarks are stated in Section 7.</Paragraph> </Section> </Paper>