<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-1609">
  <Title>An Unsupervised Approach for Bootstrapping Arabic Sense Tagging</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 Approach
</SectionTitle>
    <Paragraph position="0"> SALAAM exploits parallel corpora for sense annotation. The key intuition behind SALAAM is that when words in one language, L1, are translated into the same word in a second language, L2, those L1 words are semantically similar. For example, when the English (L1) words bank, brokerage and mortgage-lender all translate into the Arabic (L2) word bnk in a parallel corpus,1 and bank is polysemous, SALAAM discovers that the intended sense of the English word bank is the financial institution sense, not the geological formation sense, because bank is grouped with brokerage and mortgage-lender. Two fundamental observations are at the core of SALAAM:</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
Translation Distinction Observation (TDO)
</SectionTitle>
    <Paragraph position="0"> Senses of ambiguous words in one language are often translated into distinct words in a second language.</Paragraph>
    <Paragraph position="1"> To exemplify TDO, consider a sentence such as I walked by the bank, where the word bank is ambiguous between several senses. Depending on the surrounding context, a translator may render bank either as Dfp, corresponding to the GEOLOGICAL FORMATION sense, or as bnk, corresponding to the FINANCIAL INSTITUTION sense.</Paragraph>
    <Paragraph position="2"> Essentially, translation has distinctly differentiated two of the possible senses of bank.
Foregrounding Observation (FGO): If two or more words are translated into the same word in a second language, then they often share some element of meaning.</Paragraph>
    <Paragraph position="3"> FGO may be expressed in quantifiable terms as follows: if several words $\{w_1, w_2, \ldots, w_n\}$ in L1 are translated into the same word form in L2, then $\{w_1, w_2, \ldots, w_n\}$ share some element of meaning, which brings the corresponding relevant senses of each of these words to the foreground. For example, if the Arabic word Dfp translates in some instances in a corpus to shore and in other instances to bank, then shore and bank share some meaning component that is highlighted by the fact that the translator chooses the same Arabic word for their translation. The word Dfp, in this case, refers to the concept of LAND BY WATER SIDE, thereby making the corresponding senses of the English words more salient. It is important to note that the foregrounded senses of bank and shore are not necessarily identical, but they are quantifiably the closest senses to one another among the various senses of both words.
Footnote 1: We use the Buckwalter transliteration scheme for the Arabic words in this paper. http://www.ldc.org/aramorph</Paragraph>
    <Paragraph position="4"> Given observations TDO and FGO, the crux of the SALAAM approach is to exploit, in quantifiable terms, the translator's implicit knowledge of sense representation across languages, in effect reverse engineering a relevant part of the translation process.</Paragraph>
    <Paragraph position="5"> SALAAM's algorithm is as follows:
• SALAAM expects a word aligned parallel corpus as input;
• L1 words that translate into the same L2 word are grouped into clusters;
• SALAAM identifies the appropriate senses for the words in those clusters based on the word senses' proximity in WordNet. The word sense proximity is measured in information theoretic terms based on an algorithm by Resnik (1999);
• A sense selection criterion is applied to choose the appropriate sense label or set of sense labels for each word in the cluster;
• The chosen sense tags for the words in the cluster are propagated back to their respective contexts in the parallel text. Simultaneously, SALAAM projects the propagated sense tags for L1 words onto their corresponding L2 translations.</Paragraph>
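    <Paragraph> The clustering and sense selection steps above can be sketched as follows. This is a minimal illustration and not SALAAM's implementation: the aligned pairs, the toy taxonomy, the concept probabilities and the selection rule (pick the sense with the highest summed similarity to the other cluster members) are all assumptions made for the example. The similarity function follows the general idea of Resnik's measure, namely the information content of the most informative concept that subsumes both senses.

```python
import math
from collections import defaultdict

# Hypothetical word-aligned pairs: (L1 English word, L2 Arabic word in Buckwalter).
aligned_pairs = [
    ("bank", "bnk"), ("brokerage", "bnk"),
    ("bank", "Dfp"), ("shore", "Dfp"),
]

def cluster_by_translation(pairs):
    """Group L1 words that share the same L2 translation."""
    clusters = defaultdict(set)
    for l1_word, l2_word in pairs:
        clusters[l2_word].add(l1_word)
    return dict(clusters)

# Toy taxonomy (sense -> hypernym) with made-up concept probabilities.
PARENT = {
    "bank#finance": "institution", "brokerage#firm": "institution",
    "bank#river": "land_by_water", "shore#coast": "land_by_water",
    "institution": "entity", "land_by_water": "entity", "entity": None,
}
PROB = {
    "entity": 1.0, "institution": 0.05, "land_by_water": 0.02,
    "bank#finance": 0.01, "brokerage#firm": 0.005,
    "bank#river": 0.004, "shore#coast": 0.006,
}
SENSES = {
    "bank": ["bank#finance", "bank#river"],
    "brokerage": ["brokerage#firm"],
    "shore": ["shore#coast"],
}

def ancestors(sense):
    """Return the sense itself and all of its hypernyms up to the root."""
    chain = []
    while sense is not None:
        chain.append(sense)
        sense = PARENT[sense]
    return chain

def resnik_similarity(s1, s2):
    """Information content of the most informative common subsumer."""
    common = set(ancestors(s1)).intersection(ancestors(s2))
    return max(-math.log(PROB[c]) for c in common)

def select_senses(cluster):
    """Assumed criterion: maximise summed similarity to the other words' senses."""
    chosen = {}
    for word in cluster:
        others = [s for w in cluster if w != word for s in SENSES[w]]
        chosen[word] = max(
            SENSES[word],
            key=lambda s: sum(resnik_similarity(s, o) for o in others),
        )
    return chosen

clusters = cluster_by_translation(aligned_pairs)
bnk_senses = select_senses(clusters["bnk"])
dfp_senses = select_senses(clusters["Dfp"])
```

In this toy setting the same English word, bank, receives a different sense in each cluster, because its neighbours in the cluster differ.</Paragraph>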
    <Paragraph position="6"> The focus of this paper is on the last point in the SALAAM algorithm, namely, the phase that projects senses onto the L2 words in context. In this case, the L2 words are Arabic and the sense inventory is the English WordNet taxonomy. Using SALAAM, we annotate Arabic words with their meaning definitions from the English WordNet taxonomy. We justify the usage of an English inventory on both empirical and theoretical grounds. Empirically, there are no automated sense inventories for Arabic; furthermore, to our knowledge, the existing machine-readable dictionaries (MRDs) for Arabic are mostly root based, which introduces another layer of ambiguity into Arabic processing, since Modern Standard Arabic text is rendered in a surface form relatively removed from the underlying root form. Theoretically, we subscribe to the premise that people share basic conceptual notions which are a consequence of shared human experience and perception regardless of their respective languages. This premise is supported by the fact that we have translations in the first place. Accordingly, basing the sense tagging of L2 words on the corresponding L1 sense tags captures this very idea of shared meaning across languages and exploits it as a bridge to explicitly define and bootstrap sense tagging in L2, Arabic.</Paragraph>
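    <Paragraph> The projection phase can be pictured with a short sketch. The function name, the tag label today#1 and the alignment links below are hypothetical and written by hand for illustration; the token sequences reuse the transliterated sentence and English gloss quoted in Section 3.5. In the paper's setting the links would come from an automatic token aligner.

```python
def project_sense_tags(l1_tags, alignment, l2_tokens):
    """Copy each L1 sense tag onto the L2 token it is aligned with.

    l1_tags:   dict mapping an L1 token index to its sense tag
    alignment: list of (l1_index, l2_index) links
    l2_tokens: list of L2 tokens
    """
    projected = {}
    for l1_index, l2_index in alignment:
        if l1_index in l1_tags:
            projected[l2_index] = (l2_tokens[l2_index], l1_tags[l1_index])
    return projected

english = ["The", "others", "live", "today", "in", "a", "different", "place"]
arabic = ["yEy$", "AlAxrwn", "Alywm", "fy", "mkAn", "Axr"]

# Hand-written, illustrative alignment links (English index, Arabic index).
alignment = [(1, 1), (2, 0), (3, 2), (4, 3), (6, 5), (7, 4)]

# Only the noun "today" carries a sense tag in this toy example.
english_tags = {3: "today#1"}

projected = project_sense_tags(english_tags, alignment, arabic)
```

A wrong link in the alignment would attach the tag to the wrong Arabic token, which is the situation the MISALIGNMENT category in the evaluation is meant to capture.</Paragraph>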
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Evaluation
</SectionTitle>
    <Paragraph position="0"> Formally evaluating SALAAM for Arabic WSD requires several intermediary steps.</Paragraph>
    <Paragraph position="1"> SALAAM requires a token aligned parallel corpus as input and a sense inventory for one of the languages of the parallel corpus. For evaluation purposes, we need a manually annotated gold standard set.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Gold Standard Set
</SectionTitle>
      <Paragraph position="0"> As mentioned above, there are no systems that perform Arabic WSD; therefore, no Arabic gold standard sets exist as such, and one needs to be created. Since SALAAM depends on parallel corpora, an English gold standard with sense tags projected onto the corresponding Arabic words would serve as a good start. A desirable gold standard would be generic, covering several domains, and would exist in translation to Arabic. Finding an appropriate English gold standard that satisfies both attributes is a challenge. One option is to create a gold standard based on an existing parallel corpus such as the Quran, the Bible or the UN proceedings. Such corpora are single domain corpora and/or their language is stylistic and distant from everyday Arabic; moreover, the cost of creating a manual gold standard is daunting. The second option is to find an existing English gold standard that is diverse in its domain coverage and is clearly documented. Fortunately, the SENSEVAL2 exercises afford such sets.2 SENSEVAL is a series of community-wide exercises that create a platform for researchers to evaluate their WSD systems on a myriad of languages using different techniques by constantly defining consistent standards and robust measures for WSD.</Paragraph>
      <Paragraph position="1"> Accordingly, the gold standard set used here is the set of 671 Arabic words corresponding to the correctly sense annotated English nouns from the SENSEVAL2 English All Words task. SALAAM achieved a precision of 64.5% and a recall of 53% on the English test set for that task. SALAAM ranks as the best unsupervised system when compared to state-of-the-art WSD systems on the same English task. The English All Words task requires the WSD system to sense tag every content word in an English language text.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Token Aligned Parallel Corpora
</SectionTitle>
      <Paragraph position="0"> The gold standard set corresponds to the test set in an unsupervised setting. Therefore, the test set corpus is the SENSEVAL2 English All Words test corpus, which comprises three articles from the Wall Street Journal discussing religious practice, medicine and education. The test corpus does not exist in Arabic. Due to the high expense of manually creating a parallel corpus, i.e. using human translators, we opt for automatic translation systems in a fashion similar to Diab (2000). To our knowledge, there exist two off-the-shelf English-Arabic Machine Translation (MT) systems: Tarjim and AlMisbar.3 We use both MT systems to translate the test corpus into Arabic. We merge the outputs of both in an attempt to achieve more variability in translation, as an approximation to human quality translation. The merging process is based on the assumption that the MT systems rely on different sources of knowledge, different dictionaries at the least, in their translation process.</Paragraph>
      <Paragraph position="1"> Fortunately, the MT systems produce sentence aligned parallel corpora.4 However, SALAAM expects token aligned parallel corpora. There are several token alignment programs available. We use the GIZA++ package, which is based on the IBM Statistical MT models.5 Like most stochastic NLP applications, GIZA++ requires large amounts of data to produce reliable quality alignments. The test corpus is small, comprising only 242 lines; consequently, we augment the test corpus with several other corpora. The augmenting corpora need to have similar attributes to the test corpus in genre and style. The chosen corpora and their relative sizes are listed in Table 1. The three augmenting corpora, BC-SV1, SV2LS and WSJ, are translated into Arabic using both MT systems, AlMisbar and Tarjim. All the Arabic corpora are transliterated using the Buckwalter transliteration scheme and then tokenized. The corpora are finally token aligned using GIZA++. Figure 1 illustrates the first sentence of the SV2AW English test corpus with its translation into Arabic using the AlMisbar MT system, followed by its transliteration and tokenization, respectively.6 The English sentence is: "The art of change-ringing is peculiar to the English, and, like most English peculiarities, unintelligible to the rest of the world."</Paragraph>
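      <Paragraph> The transliteration and tokenization steps can be illustrated with a small sketch. The table below is only a partial subset of the Buckwalter mapping (plain letters, without hamza forms or diacritics), and whitespace splitting is an assumed stand-in for the tokenizer, which the text does not describe. The sample words are ones whose transliterations appear in this paper (bnk, Dfp, sqf, Hflp $Ay).

```python
# Partial Arabic-to-Buckwalter table: plain letters only.
# Hamza-carrying forms and diacritics are deliberately left out of this sketch.
BUCKWALTER = {
    "ا": "A", "ب": "b", "ت": "t", "ث": "v", "ج": "j", "ح": "H", "خ": "x",
    "د": "d", "ذ": "*", "ر": "r", "ز": "z", "س": "s", "ش": "$", "ص": "S",
    "ض": "D", "ط": "T", "ظ": "Z", "ع": "E", "غ": "g", "ف": "f", "ق": "q",
    "ك": "k", "ل": "l", "م": "m", "ن": "n", "ه": "h", "و": "w", "ي": "y",
    "ة": "p", "ى": "Y",
}

def transliterate(text):
    """Map each Arabic letter to its Buckwalter symbol; pass other characters through."""
    return "".join(BUCKWALTER.get(ch, ch) for ch in text)

def tokenize(text):
    """Assumed tokenizer: split on whitespace only."""
    return text.split()

tokens = tokenize(transliterate("حفلة شاي"))
```

Because the mapping is one character to one character, the transliteration can be reversed without loss, which is convenient when the aligned output has to be shown to annotators in Arabic script.</Paragraph>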
      <Paragraph position="3"/>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.3 Sense Inventory
</SectionTitle>
      <Paragraph position="0"> The gold standard set is annotated using the WordNet taxonomy for English, WN1.7pre. Like previous WordNet editions (Fellbaum, 1998), WN1.7pre is a computational semantic lexicon for English. It is rapidly becoming the community standard lexical resource for English since it is freely available for academic research. It is an enumerative lexicon in a Quillian style semantic network that combines the knowledge found in traditional dictionaries (Quillian, 1968). Words are represented as concepts, referred to as synsets, that are connected via different relations such as synonymy, meronymy, antonymy, etc. For example, the word bank has 10 synsets in WN1.7pre, corresponding to 10 different senses. The concepts are organized taxonomically in a hierarchical structure, with the more abstract or broader concepts at the top of the tree and the more specific concepts toward the bottom. For instance, the concept FOOD is a hypernym of the concept FRUIT.</Paragraph>
      <Paragraph position="1"> Similar to previous WordNet taxonomies, WN1.7pre comprises four databases for the four major parts of speech in the English language: nouns, verbs, adjectives, and adverbs. The nouns database consists of 69K concepts and has a depth of 15 nodes. It is the richest of the four databases. The majority of its concepts are connected via the IS-A identity relation. The focus of this paper is exclusively on nouns.7</Paragraph>
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.4 Experiment and Metrics
</SectionTitle>
      <Paragraph position="0"> We conducted two experiments.</Paragraph>
      <Paragraph position="1">  In the first experiment, a native speaker of Arabic with near native proficiency in English is asked to pick the appropriate meaning definition of an Arabic word, given the Arabic context sentence in which it appears in the corpus, from the list of WN1.7pre definitions. The annotator is allowed to pick more than one definition for each item. Alternatively, the annotator may choose NONE, where none of the definitions is appropriate for the Arabic word given the Arabic context sentence, or MISALIGNMENT, where the Arabic word is not a translation of the English word whose meaning definitions appear in the list, or is simply misaligned. The results from this experiment are illustrated in Table 2.</Paragraph>
      <Paragraph position="2">  SALAAM automatic annotations.</Paragraph>
      <Paragraph position="3"> It is worth noting the high agreement rate between the annotator and the SALAAM annotations, which exceeds [...].
Footnote 7: SALAAM, however, has no inherent restriction on part of speech.</Paragraph>
      <Paragraph position="4"> The only case assigned to the "NONE" category is the word bit, which is translated into Arabic as the past tense of to bite. It should have been translated as the Arabic word meaning a morsel/piece.</Paragraph>
      <Paragraph position="5">  In the second experiment, the Arabic words annotated with English WN1.7pre tags are judged on a five-point scale by three native speakers of Arabic with near native proficiency in English. The experiment is run as a web form. The raters are asked to judge the accuracy of the chosen sense definition from a list of definitions associated with the translation of the Arabic word. The Arabic words are given to the raters in their respective context sentences. The task of the rater is therefore to judge the appropriateness of the chosen English sense definition for the Arabic word given its context. The rater is required to pick a rating from a drop-down menu for each of the data items. The five-point scale is as follows:
• Accurate: This choice indicates that the chosen sense definition is an appropriate meaning definition of the Arabic word.</Paragraph>
      <Paragraph position="6"> • Approximate: This choice indicates that the chosen sense definition is a good meaning definition for the Arabic word given the context, yet a more appropriate sense definition exists on the list of possible definitions.</Paragraph>
      <Paragraph position="7"> • Misalignment: This choice indicates that the Arabic word is not a translation of the English word, due either to a misalignment or to the word being left in English in the Arabic sentence.</Paragraph>
      <Paragraph position="8"> The latter occurs when the English word was not translated by either of the Arabic MT systems.</Paragraph>
      <Paragraph position="9"> • None: This choice indicates that none of the sense definitions listed is an appropriate sense definition for the Arabic word.</Paragraph>
      <Paragraph position="10"> • Wrong: This choice indicates that the chosen sense definition is an incorrect meaning definition for the Arabic word given its context.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3.5 Results
</SectionTitle>
    <Paragraph position="0"> Table 3 illustrates the obtained results from the three raters.</Paragraph>
    <Paragraph position="1"> The inter-rater agreement is high, at 96%. On average, the raters deemed more than 90% of the data items to be accurately tagged by SALAAM. The most variation was in assessing the APPROXIMATE category, with Rater 1 (R1) rating 19 items as APPROXIMATE, R2 rating 10 items as APPROXIMATE and R3 rating 14 items as APPROXIMATE.</Paragraph>
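    <Paragraph> The text does not say how the inter-rater agreement figure is computed. One common and simple choice, shown here purely as an assumption, is the average pairwise percent agreement: for every pair of raters, count the share of items on which they chose the same label, then average over the pairs. The ratings below are invented toy data, not the paper's judgments.

```python
from itertools import combinations

def pairwise_agreement(ratings):
    """Average, over all rater pairs, of the fraction of items labelled identically."""
    rater_names = sorted(ratings)
    scores = []
    for first, second in combinations(rater_names, 2):
        same = sum(
            1 for a, b in zip(ratings[first], ratings[second]) if a == b
        )
        scores.append(same / len(ratings[first]))
    return sum(scores) / len(scores)

# Invented toy ratings for five items (not taken from the paper).
toy_ratings = {
    "R1": ["ACCURATE", "ACCURATE", "APPROXIMATE", "WRONG", "ACCURATE"],
    "R2": ["ACCURATE", "ACCURATE", "ACCURATE", "WRONG", "ACCURATE"],
    "R3": ["ACCURATE", "ACCURATE", "APPROXIMATE", "WRONG", "ACCURATE"],
}

agreement = pairwise_agreement(toy_ratings)
```

Percent agreement does not correct for agreement expected by chance; a chance-corrected statistic would be an alternative, but nothing in the text indicates which was used.</Paragraph>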
    <Paragraph position="2"> An example of a data item that is deemed APPROXIMATE by the three raters is the word AltjmE in the following sentence,</Paragraph>
    <Paragraph position="4"> transliterated as tdq frqp jdydp klyA kl ywm fy twrnjtwn AlEZymp, edp mn AED' AltjmE, which means "In Great Torington, a brand new band plays every day comprising members of the congregation". The word AltjmE is a translation of congregation, which has the following sense definitions in WN1.7pre:
• congregation: an assemblage of people or animals or things collected together; "a congregation of children pleaded for his autograph"; "a great congregation of birds flew over"
• congregation, fold, faithful: a group of people who adhere to a common faith and habitually attend a given church
• congregation, congregating: the act of congregating
SALAAM favors the last meaning definition for congregation.</Paragraph>
    <Paragraph position="5"> An example of a MISALIGNMENT is illustrated in the following sentence:</Paragraph>
    <Paragraph position="7"> transliterated as Alqwlwn wAlr'p wsrTAn Alvdy Akvr AlA$kAl AlqAtlp llmrD...</Paragraph>
    <Paragraph position="8"> which is a translation of "Cancer of the Colon, Breast and Lungs are the most deadly forms of the disease...". The words srTAn, meaning cancer, and lungs were aligned, leading to the Arabic word being tagged with the sense tag for the English word lungs. Finally, the following is an example of a WRONG data item as deemed by the three raters: the definition chosen for the word Alywm in the following sentence,</Paragraph>
    <Paragraph position="10"> transliterated as yEy$ AlAxrwn Alywm fy mkAn Axr..., which means "The others live today in a different place...", where the word equivalent to today is the target word, with the following sense definitions:
• today: the day that includes the present moment (as opposed to yesterday or tomorrow); "Today is beautiful"; "did you see today's newspaper?"
• today: the present time or age; "the world of today"; "today we have computers"
SALAAM chooses the first meaning definition while the raters seem to favor the second. None of the raters found data items that had no corresponding meaning definition in the given list of English meaning definitions. It is interesting to note that the single item assigned to the "NONE" category in experiment 1 was considered a misalignment by the three raters.</Paragraph>
    <Paragraph position="11"> If we calculate the average precision of the evaluated sense tagged Arabic words against the total of 1071 tagged English nouns in this test set, we obtain an absolute precision of 56.9% for Arabic sense tagging. It is worth noting that the average precision on the SENSEVAL2 English All Words task for any of the unsupervised systems is in the lower 50% range.</Paragraph>
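    <Paragraph> As a rough consistency check on these figures, one can work backwards from the reported precision. The sketch below assumes that the precision is the average number of items judged accurate divided by the 1071 tagged English nouns; the text does not spell out the numerator, so the implied count is an inference, not a reported result. Under that assumption the implied count is about 609 items, or roughly 91% of the 671-word gold standard set, which is in line with the statement that more than 90% of the items were judged accurate.

```python
# Figures quoted in the text.
TOTAL_ENGLISH_NOUNS = 1071    # tagged English nouns in the test set
GOLD_SET_SIZE = 671           # Arabic words in the gold standard set (Section 3.1)
REPORTED_PRECISION = 0.569    # absolute precision for Arabic sense tagging

def implied_correct_items(precision, total):
    """Number of correct items implied by a precision measured over `total` items."""
    return precision * total

correct = implied_correct_items(REPORTED_PRECISION, TOTAL_ENGLISH_NOUNS)
share_of_gold_set = correct / GOLD_SET_SIZE

print(f"implied correct items: {correct:.1f}")
print(f"share of the gold standard set: {share_of_gold_set:.1%}")
```
</Paragraph>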
  </Section>
  <Section position="7" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 General Discussion
</SectionTitle>
    <Paragraph position="0"> It is worth noting the high agreement level between the rating judgments of the three raters in experiment 2 and the human manual annotations of experiment 1. The obtained results are very encouraging, but they rest on the implicit assumption that the English WordNet taxonomy is sufficient for meaning representation of the Arabic words used in this text. In this section, we discuss the quality of WN1.7pre as an appropriate sense inventory for the Arabic task.</Paragraph>
    <Paragraph position="1"> With that intent in mind, we evaluate the 600 word instances of Arabic that are deemed correctly tagged using the English WN1.7pre.8 We investigate three different aspects of the Arabic-English correspondence: Arabic and English words are equivalent; Arabic words correspond to specific English senses; and English words do not sufficiently correspond to all possible senses of the Arabic word.</Paragraph>
    <Paragraph position="2"> The three aspects are discussed in detail below.</Paragraph>
    <Paragraph position="3"> • Arabic and English words are equivalent
We observe that a majority of the ambiguous words in Arabic are also ambiguous in English in this test set; they preserve ambiguity in the same manner. In Arabic, 422 word tokens, corresponding to 190 word types, are at the closest granularity level with their English correspondent.9 For instance, all the senses of care apply to its Arabic translation EnAyA; the sense definitions are listed as follows:
- care, attention, aid, tending: the work of caring for or attending to someone or something; "no medical care was required"; "the old car needed constant attention"
- caution, precaution, care, forethought: judiciousness in avoiding harm or danger; "he exercised caution in opening the door"; "he handled the vase with care"</Paragraph>
    <Paragraph position="4"> - concern, care, fear: an anxious feeling; "care had aged him"; "they hushed it up out of fear of public reaction"
- care: a cause for feeling concern; "his major care was the illness of his wife"
- care, charge, tutelage, guardianship: attention and management implying responsibility for safety; "he is under the care of a physician"
- care, maintenance, upkeep: activity involved in maintaining something in good working order; "he wrote the manual on car care"
It is worth noting that the cases where ambiguity is preserved in English and Arabic are all cases where the polysemous word exhibits regular polysemy and/or metonymy. The instances where homonymy is preserved are borrowings from English. Metonymy is more pragmatic than regular polysemy (Cruse, 1986); for example, tea in English has the following metonymic sense from WN1.7pre:
- a reception or party at which tea is served; "we met at the Dean's tea for newcomers"
This sense of tea does not have a correspondent in the Arabic $Ay. Yet, the English lamb has the metonymic sense of MEAT, which exists in Arabic. Researchers building EuroWordNet have been able to devise a number of consistent metonymic relations that hold cross-linguistically, such as fabric/material, animal/food and building/organization (Vossen et al., 1999; Peters and Wilks, 2001). In general, these defined classes seem to hold in Arabic; however, the specific case of tea and party does not exist. In Arabic, the English sense would be expressed as a compound, tea party or Hflp $Ay.
• Arabic word equivalent to specific English sense(s)
In this evaluation set, there are 138 instances where the Arabic word is equivalent to a subsense(s) of the corresponding English word. The 138 instances correspond to 87 word types.
An example is illustrated by the noun ceiling in English.</Paragraph>
    <Paragraph position="5"> - ceiling: the overhead upper surface of a room; "he hated painting the ceiling"
- ceiling: (meteorology) altitude of the lowest layer of clouds
- ceiling, cap: an upper limit on what is allowed; "they established a cap for prices"
- ceiling: maximum altitude at which a plane can fly (under specified conditions)
The sense tag assigned by SALAAM to ceiling in English is the first sense, which is correct for the Arabic translation sqf. Yet, the other three senses are not correct translations for the Arabic word. For instance, the second sense definition would be translated as ArtfAE and the last sense definition would be rendered in Arabic as Elw. This phenomenon of Arabic words corresponding to specific English senses and not others is particularly dominant where the English word is homonymic. By definition, homonymy is when two independent concepts share the same orthographic form, in most cases by historical accident. Homonymy is typically preserved between languages that share common origins or in cases of cross-linguistic borrowings. Owing to the family distance between English and Arabic, polysemous words in Arabic rarely preserve homonymy.</Paragraph>
    <Paragraph position="6"> • English word equivalent to specific Arabic sense
In total, 40 instances, corresponding to 20 word types in Arabic, are manually classified as more generic concepts than their English counterparts. For these cases, the Arabic word is more polysemous than the English word. For example, the English noun experience possesses three senses in WN1.7pre, as listed below.
- experience: the accumulation of knowledge or skill that results from direct participation in events or activities; "a man of experience"; "experience is the best teacher"
- experience: the content of direct observation or participation in an event; "he had a religious experience"; "he recalled the experience vividly"
- experience: an event as apprehended; "a surprising experience"; "that painful experience certainly got our attention"
All three senses are appropriate meanings of the equivalent Arabic word tjrbp, but they do not include the SCIENTIFIC EXPERIMENT sense covered by the Arabic word.
From the above points, we find that 63.9% of the ambiguous Arabic word types evaluated are conceptually equivalent to their ambiguous English translations. This finding is consistent with the observation of the EuroWordNet builders: Vossen, Peters, and Gonzalo (1999) find that approximately 44-55% of ambiguous words in Spanish, Dutch and Italian have relatively high overlaps in concept and in the sense packaging of polysemous words. 29.3% of the ambiguous Arabic words correspond to specific senses of their English translations and 6.7% of the Arabic words are more generic than their English correspondents.</Paragraph>
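    <Paragraph> The three percentages can be recomputed from the token and type counts given in this section. The short sketch below only rearranges numbers quoted in the text; the dictionary keys are labels chosen for this example. The token counts sum to the 600 evaluated instances, and the shares are taken over the 297 word types. The first share comes out at about 63.97%, which the text gives as 63.9%.

```python
# (word tokens, word types) per correspondence class, as quoted in the text.
COUNTS = {
    "equivalent": (422, 190),
    "arabic_matches_specific_english_senses": (138, 87),
    "arabic_more_generic_than_english": (40, 20),
}

total_tokens = sum(tokens for tokens, _ in COUNTS.values())
total_types = sum(types for _, types in COUNTS.values())

# Share of word types falling into each class, in percent.
type_shares = {
    label: 100.0 * types / total_types
    for label, (_, types) in COUNTS.items()
}

for label, share in type_shares.items():
    print(f"{label}: {share:.2f}% of {total_types} word types")
```
</Paragraph>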
  </Section>
</Paper>