File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/94/c94-2105_intro.xml

Size: 5,622 bytes

Last Modified: 2025-10-06 14:05:40

<?xml version="1.0" standalone="yes"?>
<Paper uid="C94-2105">
  <Title>Semantics WORD SENSE ACQUISITION FOR MULTILINGUAL TEXT INTERPRETATION *</Title>
  <Section position="2" start_page="0" end_page="665" type="intro">
    <SectionTitle>
1. INTRODUCTION
</SectionTitle>
    <Paragraph position="0"> Text interpretation research has recently come to focus on data extraction, the problem of producing structured information from free text, usually to populate a database. Once the key information has been extracted, it can be used to help analyze the contents of large volumes of texts, detect trends, and retrieve selected information. Data extraction is at the center of the problem of managing large volumes of text.</Paragraph>
    <Paragraph position="1"> Our group has led data extraction work for a number of years, developing new architectures and lexicons for natural language processing and testing these methods in a variety of applications [Jacobs and Rau, 1993; Jacobs, 1990; Jacobs and Rau, 1990]. In the last two years, as part of the U.S. government's ARPA TIPSTER program, we have extended this research to handle broader domains, with higher accuracy, and to process texts in multiple languages [Jacobs et al., 1993].</Paragraph>
    <Paragraph position="2"> The goal of processing texts in a new language is not only to show that the basic algorithms are language-independent, but also to preserve as much knowledge as possible across languages, and, where applicable, across domains. For example, in adapting an English system to handle Japanese texts, it is important that the Japanese system configuration makes use as much as possible of the general knowledge, and even the English vocabulary, that the system has. This maximizes the performance, and minimizes the amount of work, each time the system is applied to a new language.</Paragraph>
    <Paragraph position="3"> SHOGUN is unique in a number of ways, but it is particularly distinguished by the sharing of knowledge resources in different languages. The approach to multilingual interpretation involves two key elements: First, the system includes a core ontology of about 1,000 concepts that support word senses in the core English and Japanese lexicons, which are also identical in structure. Second, our system acquires much of its domain-specific knowledge, including combinations of words and phrases, from corpus data, easing the mapping of word class information into a new language.</Paragraph>
    <Paragraph position="4"> For example, the English verb establish corresponds very closely to the Japanese word setsuritsu (設立).</Paragraph>
    <Paragraph position="5"> In the TIPSTER domain of joint ventures, both establish and setsuritsu are used to describe the creation of companies ("establish a joint venture"), products ("establish a telecommunications and data network"), facilities ("establish a factory"), and other more abstract concepts (e.g. "establish a stronger foothold in Europe"). The TIPSTER task, which requires distinct information for companies, facilities, activities, and products, makes it crucial to distinguish these different word usages regardless of language.</Paragraph>
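    <Paragraph> The idea of two structurally identical lexicons pointing into one shared ontology can be sketched roughly as follows. This is only an illustration under stated assumptions, not SHOGUN's actual representation: the concept names, the Sense class, and the helper functions are all hypothetical, and only the establish/setsuritsu pairing and the four object classes come from the example above.

```python
from dataclasses import dataclass

# Hypothetical core ontology: language-independent concept identifiers.
# These four names are invented for illustration; the real ontology has
# about 1,000 concepts whose names are not given in the text.
CORE_ONTOLOGY = frozenset({
    "create-company",
    "create-product",
    "create-facility",
    "create-abstract",
})


@dataclass(frozen=True)
class Sense:
    """One word sense: a shared concept plus the class of object it applies to."""
    concept: str       # identifier drawn from CORE_ONTOLOGY
    object_class: str  # illustrative label: company, product, facility, abstract


# Both lexicons use the same structure: word -> list of Sense entries.
ENGLISH_LEXICON = {
    "establish": [
        Sense("create-company", "company"),
        Sense("create-product", "product"),
        Sense("create-facility", "facility"),
        Sense("create-abstract", "abstract"),
    ],
}

JAPANESE_LEXICON = {
    "setsuritsu": [
        Sense("create-company", "company"),
        Sense("create-product", "product"),
        Sense("create-facility", "facility"),
        Sense("create-abstract", "abstract"),
    ],
}


def sense_for(word, object_class, lexicon):
    """Return the concept a word denotes for a given object class, or None."""
    for sense in lexicon.get(word, []):
        if sense.object_class == object_class:
            return sense.concept
    return None


def shared_concepts(word_a, lexicon_a, word_b, lexicon_b):
    """Return the ontology concepts that two words have in common."""
    concepts_a = {s.concept for s in lexicon_a.get(word_a, [])}
    concepts_b = {s.concept for s in lexicon_b.get(word_b, [])}
    return concepts_a.intersection(concepts_b)
```

    Because both lexicons resolve to the same concept identifiers, a lookup such as sense_for("establish", "facility", ENGLISH_LEXICON) and its Japanese counterpart return the same concept, so downstream extraction logic would not need to know which language the text was in.</Paragraph>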
    <Paragraph position="6"> SHOGUN's results on the final TIPSTER benchmark compared very favorably to those of other systems [Jacobs et al., 1993]. There are many different ways to view and analyze the many different benchmark statistics, but the area in which SHOGUN's approach was most clearly distinguished was in recall, the percentage of data from each test set that was correctly extracted by the program. On this measure, SHOGUN extracted, on average, 37% more correct information than any other system in any configuration.</Paragraph>
    <Paragraph position="7"> SHOGUN had somewhat lower precision (13% lower on average) than the highest precision system in each configuration, meaning that SHOGUN also produced a somewhat larger amount of incorrect information than other systems. The system, in both languages, often identified information that was not found by any other system, a result that we attribute to having better coverage in its knowledge base than other systems.</Paragraph>
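    <Paragraph> The trade-off between recall and precision described above can be made concrete with a small sketch. It assumes the simplest count-based definitions (recall as correct items over items in the answer key, precision as correct items over items produced); the benchmark's actual scoring procedure may differ. The function names and all counts below are invented for illustration and are not TIPSTER results.

```python
def recall(correct, expected):
    """Fraction of the answer-key items that the system extracted correctly."""
    return correct / expected if expected else 0.0


def precision(correct, produced):
    """Fraction of the items the system produced that were correct."""
    return correct / produced if produced else 0.0


# Invented counts for two hypothetical systems scored against the same
# answer key of 100 items. These are NOT benchmark figures.
EXPECTED = 100

# System A extracts more: higher recall, but more incorrect output.
a_produced, a_correct = 80, 60

# System B extracts less: higher precision, but lower recall.
b_produced, b_correct = 50, 45

a_recall = recall(a_correct, EXPECTED)
a_precision = precision(a_correct, a_produced)
b_recall = recall(b_correct, EXPECTED)
b_precision = precision(b_correct, b_produced)
```

    In this made-up case, System A finds more of the answer key than System B while a larger share of its output is wrong, which is the same qualitative pattern reported for SHOGUN relative to the highest precision systems.</Paragraph>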
    <Paragraph position="8"> *This research was sponsored in part by the Advanced Research Projects Agency (DOD) and other government agencies. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Advanced Research Projects Agency or the U.S. Government. The rest of this paper will describe the problem of multilingual interpretation as it appears in the TIPSTER task, then present our solution, emphasizing knowledge structures and knowledge acquisition.</Paragraph>
    <Paragraph position="10"/>
  </Section>
</Paper>