<?xml version="1.0" standalone="yes"?>
<Paper uid="C94-2122">
  <Title>Automatic Recognition of Verbal Polysemy</Title>
  <Section position="3" start_page="0" end_page="762" type="relat">
    <SectionTitle>
2 Related Work
</SectionTitle>
    <Paragraph position="0"> Although there have been several attempts to extract semantically similar words from a given corpus, few studies seriously deal with the problem of polysemy; of these, even fewer are based on real texts.</Paragraph>
    <Paragraph position="1"> The techniques developed by Zernik [Zernik, 1991] and Brown [Brown, 1991] seem to cope with the discrimination of polysemy and to be based on real texts. Zernik used monolingual texts which consist of about 1 million words tagged by part-of-speech. His method associates each word sense of a polysemous word with a set of its co-occurring words. If a word has several senses, then the word is associated with several different sets of co-occurring words, each of which corresponds to one of the senses of the word. The limitation of Zernik's method, however, is that it relies solely on human intuition for identifying different senses of a word, i.e. the human editor has to determine, by her/his intuition, how many senses a word has, and then identify the sets of co-occurring words (signatures) that correspond to the different senses.</Paragraph>
    <Paragraph position="2"> Brown used bilingual texts, which consist of 12 million words. The results of Brown's technique, when applied to a French-English machine translation system, seem to show its effectiveness and validity. However, as he admits, the approach is limited because it can assign at most two senses to a word. More seriously, polysemy is defined in terms of translation, i.e. a word is recognised as polysemous only when it is translated into two different words in a target language.</Paragraph>
    <Paragraph position="3"> The approach can be used only when a large parallel corpus is available. Furthermore, individual senses thus identified do not necessarily constitute single semantic units in the monolingual domain to which plausible semantic properties (i.e. semantic restrictions, collocations, etc.) can be associated.</Paragraph>
    <Paragraph position="4"> The defects of these two methods show that it is crucial to have an appropriate definition of polysemy in terms of distributional behaviours of words in monolingual texts. The approach proposed in this paper focuses on this problem. Like Brown's approach, our approach adopts a relativistic view of polysemy. That is, a word is recognised as polysemous in terms of other related words. However, while Brown's approach identifies polysemous words in terms of related words of another language, we use semantically similar words of the same language to identify polysemous words.</Paragraph>
    <Paragraph position="5"> Whether a word is polysemous or not depends on whether a set of other, semantically similar words exists whose distributional behaviours correspond to a subset of the distributional behaviour of the word.</Paragraph>
    <Paragraph position="6"> Because the distributional behaviour of a word is characterised by its co-occurring words, the process of identifying such subsets essentially corresponds to the process performed manually by the human editor in Zernik's approach.</Paragraph>
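The subset criterion described above can be illustrated with a minimal sketch. It is not the clustering algorithm of this paper: it assumes, purely for illustration, that the distributional behaviour of a word is a plain set of co-occurring words, and that a word looks polysemous when two semantically similar words each share a non-empty part of its co-occurrence set and those two parts do not overlap. The names COOC, shared_contexts and looks_polysemous, and all of the toy data, are hypothetical.

```python
# Toy co-occurrence sets (invented data, not taken from any corpus).
COOC = {
    "take": {"medicine", "pill", "bus", "train"},
    "swallow": {"medicine", "pill", "food"},
    "ride": {"bus", "train", "horse"},
    "board": {"bus", "train", "ship"},
}


def shared_contexts(target, cooc, neighbours):
    """For each neighbour, return the part of the target's co-occurrence set it shares."""
    target_set = cooc[target]
    return {n: target_set.intersection(cooc[n]) for n in neighbours}


def looks_polysemous(target, cooc, neighbours):
    """Assumed criterion: two neighbours cover disjoint, non-empty subsets of the target."""
    subsets = [s for s in shared_contexts(target, cooc, neighbours).values() if s]
    for i in range(len(subsets)):
        for j in range(i + 1, len(subsets)):
            if subsets[i].isdisjoint(subsets[j]):
                return True
    return False


if __name__ == "__main__":
    print(shared_contexts("take", COOC, ["swallow", "ride"]))
    print(looks_polysemous("take", COOC, ["swallow", "ride"]))
    print(looks_polysemous("take", COOC, ["ride", "board"]))
```

In this toy setting the disjoint subsets play the role of the signatures that the human editor selects by hand in Zernik's approach; here they fall out of comparing the word with its semantically similar words in the same language.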
    <Paragraph position="7"> The experiments in this paper use a corpus annotated only by part-of-speech but not structurally annotated. However, the clustering algorithm, which automatically recognises polysemous words, only assumes that words are semantically characterised by a vector in an n-dimensional space so that it can be applied to any data satisfying this condition.</Paragraph>
  </Section>
</Paper>