<?xml version="1.0" standalone="yes"?> <Paper uid="C04-1053"> <Title>Evaluating Cross-Language Annotation Transfer in the MultiSemCor Corpus</Title> <Section position="5" start_page="1" end_page="1" type="metho"> <SectionTitle> 3 Quality Issues </SectionTitle> <Paragraph position="0"> The MultiSemCor project raises a number of theoretical and practical issues. For instance: is translational language fully representative of the general use of language in the same way as original language is? To what extent are the lexica of different languages comparable? These theoretical issues have already been presented in (Pianta and Bentivogli, 2003) and will not be discussed here. In the following, we address the issue of the quality of the annotation resulting from the application of the methodology.</Paragraph> <Paragraph position="1"> As opposed to automatic word sense disambiguation tasks, the MultiSemCor project specifically aims at producing manual-quality annotated data. Therefore, a potential risk which needs to be faced is the possible degradation of the Italian annotation quality through the various steps of the annotation transfer procedure. A number of factors must be taken into account. First, annotation errors can be found in the original English texts. Then, the word aligner may align words incorrectly, and finally the transfer of the semantic annotations may not be applicable to certain translation pairs.</Paragraph> <Paragraph position="2"> SemCor quality. The English SemCor corpus has been manually annotated. However, some annotation errors can be found in the texts (see Fellbaum et al., 1998, for SemCor taggers' confidence ratings). As an example, the word pocket in the sentence &quot;He put his hands on his pockets&quot; was incorrectly tagged with the WordNet synset {pouch, sac, sack, pocket -- an enclosed space} instead of the correct one {pocket -- a small pouch in a garment for carrying small articles}.</Paragraph> <Paragraph position="3"> Word alignment quality. The feasibility of the entire MultiSemCor project heavily depends on the availability of an English/Italian word aligner with very good performance in terms of recall and, more importantly, precision.</Paragraph> <Paragraph position="4"> Transfer quality. Even when both the original English annotations and the word alignment are correct, a number of cases still remain for which the transfer of the annotation is not applicable. An annotation is not transferable from the source language to the target when the translation equivalent does not preserve the lexical meaning of the source expression. In these cases, if the alignment process puts the two expressions in correspondence, then the transfer of the sense annotation from the source to the target language is not correct.</Paragraph> <Paragraph position="5"> The first main cause of incorrect transfer consists of translation equivalents which are not cross-language synonyms of the source language words. For example, in a sentence of the corpus the English word meaning is translated with the Italian word motivo (reason, grounds) which is suitable in that specific context but is not a synonymic translation of the English word. In this case, if the two words are aligned, the transfer of the sense annotation from English is not correct as the English sense annotation is not suitable for the Italian word. A specific case of non-synonymous translation occurs when a translation equivalent does not belong to the same lexical category as the source word.
For example, the English verb to coexist in the sentence &quot;the possibility for man to coexist with animals&quot; has been translated with the Italian noun coesistenza (coexistence) in &quot;le possibilità di coesistenza tra gli uomini e gli animali&quot;. Even though the translation is suitable in that context, the English sense of the verb cannot be transferred to the Italian noun. Sometimes, non-synonymous translations are due to errors in the Italian translation, as in pull translated as spingere (push).</Paragraph> <Paragraph position="6"> A second case which poses a challenge to the sense annotation transfer is phrasal correspondence, occurring when a target phrase has, as a whole, the same meaning as the corresponding source phrase, but the single words of the phrase are not cross-language synonyms of their corresponding source words. For example, the expression a dreamer sees has been translated as una persona sogna (a person dreams). The Italian translation maintains the synonymy at the phrase level but the single component words do not.</Paragraph> <Paragraph position="7"> Therefore, if the single words were aligned, any transfer from English to Italian would be incorrect. Another example of phrasal correspondence, in which the semantic equivalence between words in the source and target phrase is even fuzzier, is given by the English phrase the days would get shorter and shorter translated as imminente fine dei tempi (imminent end of times).</Paragraph> <Paragraph position="8"> Another controversial cause of possible incorrect transfer is the case in which the translation equivalent is indeed a cross-language synonym of the source expression but it is not a lexical unit. This usually happens with lexical gaps, i.e. when a language expresses a concept with a lexical unit whereas the other language expresses the same concept with a free combination of words, as for instance the English word successfully which can only be translated with the Italian free combination of words con successo (with success). However, it can also be the result of a choice made by the translator who decides to use a free combination of words instead of a possible lexical unit, as in empirically translated as in modo empirico (in an empirical manner) instead of empiricamente. In these cases the problem arises because, in principle, if the target expression is not a lexical unit, it cannot be annotated as a whole. Rather, each component of the free combination of words should be annotated with its respective sense.</Paragraph> <Paragraph position="9"> In the next Section we will address these quality issues in order to assess the extent to which they affect the cross-language annotation transfer methodology.</Paragraph> </Section> <Section position="6" start_page="1" end_page="1" type="metho"> <SectionTitle> 4 Evaluation of the Annotation Transfer </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="1" end_page="1" type="sub_section"> <SectionTitle> Methodology </SectionTitle> <Paragraph position="0"> A number of experiments have been carried out in order to test the various steps involved in the annotation transfer methodology.
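For concreteness, the transfer step under test can be sketched as follows. This is a minimal illustration under our own assumptions about the data structures (tokens carrying SemCor-style sense tags, alignment links as index pairs, and a synonymy check against a multilingual wordnet); it is not the actual MultiSemCor implementation, and the names used are hypothetical.

# Minimal sketch of the sense-transfer step; data structures and the synonymy
# check are our own assumptions, not the MultiSemCor implementation.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Token:
    lemma: str
    pos: str                      # lexical category, e.g. "n" or "v"
    sense: Optional[str] = None   # synset id, if semantically annotated

def transfer_senses(english, italian, alignment, italian_synsets):
    """english, italian: lists of Token; alignment: set of (i, j) index pairs;
    italian_synsets: maps a synset id to the set of Italian lemmas it contains
    (e.g. as listed in a multilingual wordnet). Returns the number of transfers."""
    transferred = 0
    for i, j in alignment:
        src, tgt = english[i], italian[j]
        if src.sense is None:      # only content words carry sense tags
            continue
        if src.pos != tgt.pos:     # e.g. verb "coexist" vs. noun "coesistenza"
            continue
        if tgt.lemma in italian_synsets.get(src.sense, set()):
            tgt.sense = src.sense  # cross-language synonym: annotation transferable
            transferred += 1
    return transferred

In this reading, the problematic cases discussed in Section 3 (non-synonymous translations, changes of lexical category, phrasal correspondences) are precisely the aligned pairs for which the final checks should block the transfer.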
More precisely, we evaluated the performance of the word alignment system and the quality of the final annotation of the Italian corpus.</Paragraph> </Section> <Section position="2" start_page="1" end_page="1" type="sub_section"> <SectionTitle> 4.1 Word Alignment </SectionTitle> <Paragraph position="0"> Word alignment is the first crucial step in the methodology applied to build MultiSemCor. The word aligner used in the project is KNOWA (KNOwledge-intensive Word Aligner), an English/Italian word aligner, developed at ITC-irst, which relies mostly on information contained in the Collins bilingual dictionary, available in electronic format. KNOWA also exploits a morphological analyzer and a multiword recognizer for both English and Italian. For a detailed discussion of the characteristics of this tool, see (Pianta and Bentivogli, 2004).</Paragraph> <Paragraph position="1"> Some characteristics of the MultiSemCor scenario make the alignment task easier for KNOWA. First, in SemCor all multiwords included in WordNet are explicitly marked. Thus KNOWA does not need to recognize English multiwords, although it still needs to recognize the Italian ones. Second, within MultiSemCor word alignment is done with the final aim of transferring lexical annotations from English to Italian. Since only content words have word sense annotations in SemCor, it is more important that KNOWA behaves correctly on content words, which are easier to align than functional words.</Paragraph> <Paragraph position="2"> To evaluate the word aligner performance on the MultiSemCor task we created a gold standard composed of three unseen English texts (br-f43, br-l10, br-j53) taken randomly from the SemCor corpus. For each English text both a controlled and a free translation were made. Given the expectation that free translations are less suitable for word alignment, we decided to test KNOWA also on them in order to verify whether the annotation transfer methodology can be applied to already existing parallel corpora.</Paragraph> <Paragraph position="3"> The six resulting pairs of texts were manually aligned following a set of alignment guidelines which have been defined taking into account the work done in similar word alignment projects (Melamed, 2001). Annotators were asked to align different kinds of units (simple words, segments of more than one word, parts of words) and to mark different kinds of semantic correspondence between the aligned units, e.g. full correspondence (synonymic), non-synonymic, changes in lexical category, phrasal correspondence. Inter-annotator agreement was measured with the Dice coefficient proposed in (Véronis and Langlais, 2000) and can be considered satisfactory as it turned out to be 87% for free translations and 92% for controlled translations. As expected, controlled translations produced a better agreement rate between annotators.</Paragraph> <Paragraph position="4"> For assessing the performance of KNOWA, the standard notions of Precision, Recall, and Coverage have been used following (Véronis and Langlais, 2000). See (Och and Ney, 2003) and (Ahrenberg et al., 2000) for different evaluation metrics. The performance of KNOWA applied to the MultiSemCor gold standard in a full-text alignment task is shown in Table 1.
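As a rough illustration of how such figures can be computed, the following sketch assumes that an alignment is represented as a set of (English index, Italian index) links; the counting conventions are simplified and are not necessarily those used in (Véronis and Langlais, 2000), and the function names are ours.

# Set-based alignment evaluation (a simplified sketch; the exact counting
# conventions of the cited papers are not reproduced here).
def precision_recall_coverage(predicted, gold, n_source_tokens):
    """predicted, gold: sets of (i, j) alignment links."""
    correct = predicted.intersection(gold)
    precision = len(correct) / len(predicted) if predicted else 0.0
    recall = len(correct) / len(gold) if gold else 0.0
    # coverage: proportion of source tokens for which the aligner proposes a link
    aligned_sources = {i for i, _ in predicted}
    coverage = len(aligned_sources) / n_source_tokens if n_source_tokens else 0.0
    return precision, recall, coverage

def dice_agreement(links_a, links_b):
    """Dice coefficient between the link sets of two annotators."""
    total = len(links_a) + len(links_b)
    return 2 * len(links_a.intersection(links_b)) / total if total else 1.0

Restricting both the predicted and the gold links to those whose English side carries a sense tag gives the content-word evaluation reported next.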
These results, which compare well with those reported in the literature (Véronis, 2000), show that, as expected, controlled translations allow for a better alignment but also that free translations may be satisfactorily aligned.</Paragraph> <Paragraph position="5"> The evaluation of KNOWA with respect to the English content words which have a semantic tag in SemCor is reported in Tables 2 and 3, for both free and controlled translations and broken down</Paragraph> <Paragraph position="7"> We can see that, ignoring function words, the performance of the word aligner improves in both precision and recall.</Paragraph> </Section> <Section position="3" start_page="1" end_page="1" type="sub_section"> <SectionTitle> 4.2 Italian Annotation Quality </SectionTitle> <Paragraph position="0"> As pointed out in Section 3, even in the case of a perfect word alignment, the transfer of the annotations from English to the correctly aligned Italian words can still be a source of errors in the resulting Italian annotations. In order to evaluate the quality of the annotations automatically transferred to Italian, a new gold standard was created starting from SemCor text br-g11. The English text, containing 2,153 tokens and 1,054 semantic annotations, was translated into Italian in a controlled mode. The resulting Italian text is composed of 2,351 tokens, among which 1,085 are content words to be annotated. The English text and its Italian translation were manually aligned and the Italian text was manually semantically annotated taking into account the annotations of the English words. Each time an English annotation was appropriate for the corresponding Italian word, the annotator used it also for Italian. Otherwise, the annotator did not use the original English annotation for the Italian word and looked in WordNet for a suitable annotation.</Paragraph> <Paragraph position="1"> Moreover, when the English annotations were not suitable for annotating the Italian words, the annotator explicitly distinguished between wrong English annotations and English annotations that could not be transferred to the Italian translation equivalents. The errors in the English annotations amount to 24 cases. Non-transferable annotations amount to 155, among which 143 are due to lack of synonymy at the lexical level and 12 to translation equivalents which are not lexical units.</Paragraph> <Paragraph position="2"> The differences between the English and Italian text with respect to the number of tokens and annotations have also been analysed. The Italian text has about 200 tokens and 31 annotated words more than the English text. The difference in the number of tokens is due to various factors. First, there are grammatical characteristics specific to the Italian language, such as a different usage of articles, or a greater usage of reflexive verbs which leads to a higher number of clitics. For example, the English sentence &quot;as cells coalesced&quot; must be translated into Italian as &quot;quando le cellule si unirono&quot;. Second, we have single English words translated into Italian with free combinations of words (e.g. down translated as verso il basso) and multiwords which are recognized in English and not recognized in Italian (e.g. one token for nucleic_acid in the English text and two tokens in the Italian text, one for acido and one for nucleico).
As regards content words to be annotated, we would have expected their number to be the same in both English and Italian.</Paragraph> <Paragraph position="3"> Instead, the difference we found is much smaller than the difference between tokens. This difference is explained by the fact that some English content words have not been annotated. For example, modal and auxiliary verbs (to have, to be, can, may, to have to, etc.) and partitives (some, any) were systematically left unannotated in the English text whereas they have been annotated for Italian.</Paragraph> <Paragraph position="4"> The automatic procedures for word alignment and annotation transfer were run on text br-g11 and evaluated against the gold standard. The total number of transferred senses amounts to 879.</Paragraph> <Paragraph position="5"> Among them, 756 are correct and 123 are incorrect for the Italian words. Table 4 summarizes the results in terms of precision, recall and coverage with respect to both English annotations available (1,054) and Italian words to be annotated (1,085).</Paragraph> <Paragraph position="6"> We can see that the final quality of the Italian annotations is acceptable, the precision amounting to 86.0%. The annotation error rate of 14.0% has been analyzed in order to classify the different factors affecting the transfer methodology. Table 5 reports the data about the composition of the incorrect transfers.</Paragraph> <Paragraph position="7"> Comparing the number of annotation errors in the English source, as marked up during the creation of the gold standard (24), with the number of errors in the Italian annotation due to errors in the original annotation (22), we can see that almost all of the source errors have been transferred, contributing substantially to the overall Italian annotation error rate.</Paragraph> <Paragraph position="8"> As regards word alignment, br-g11 was a relatively easy text as the performance of KNOWA (i.e. 96.5%) is higher than that obtained with the test set (see Table 3).</Paragraph> <Paragraph position="9"> The last source of annotation errors consists of words which have been correctly aligned but whose word sense annotation cannot be transferred. This happens with (i) translation equivalents which are lexical units but are not cross-language synonyms, and (ii) translation equivalents which are cross-language synonyms but are not lexical units. In practice, given the difficulty in deciding what is a lexical unit and what is not, we decided to accept the transfer of a word sense from an English lexical unit to an Italian free combination of words (see for instance occhiali da sole annotated with the sense of sunglasses). Therefore, only the lack of synonymy at the lexical level has been considered an annotation error.</Paragraph> <Paragraph position="10"> The obtained results are encouraging. Among the 143 non-synonymous translations marked in the gold standard, only 70 have been aligned by the word alignment system, showing that KNOWA is well suited to the MultiSemCor task. The reason is that it relies on bilingual dictionaries where non-synonymous translations are quite rare.
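As a sanity check on the figures discussed above, the counts reported in the text can be combined as follows; Table 4 itself is not reproduced here, and the recall values are derived rather than quoted.

# Relating the reported counts to the Table 4 metrics (derivation only).
transferred = 879            # senses automatically transferred to Italian
correct = 756                # of which correct for the Italian word
english_annotations = 1054   # sense annotations available in the English source
italian_to_annotate = 1085   # Italian content words to be annotated

precision = correct / transferred                       # 0.860, i.e. the reported 86.0%
coverage_italian = transferred / italian_to_annotate    # 0.810, i.e. the reported 81.0%
recall_english = correct / english_annotations          # about 0.717 (derived, not quoted)
recall_italian = correct / italian_to_annotate          # about 0.697 (derived, not quoted)

The 123 incorrect transfers likewise correspond to the 14.0% error rate whose composition is reported in Table 5.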
This reliance on a bilingual dictionary can be an advantage over statistics-based word aligners, which can be expected to align a large number of non-synonymous translations, thus introducing more errors into the transfer procedure.</Paragraph> <Paragraph position="11"> A final remark about the evaluation concerns the proportion of non-transferable word senses with respect to errors in the original English annotations. It is sometimes very difficult to distinguish between annotation errors and non-transferable word senses, also because we are not English native speakers. Thus, we preferred to be conservative, marking English annotations as errors only in very clear cases. This approach may have reduced the number of errors in the original English corpus and augmented the number of non-transferable word senses, thus penalizing the transfer procedure itself.</Paragraph> <Paragraph position="12"> Summing up, the cross-language annotation transfer methodology produces an Italian corpus which is tagged with a final precision of 86.0%.</Paragraph> <Paragraph position="13"> After the application of the methodology, 19.0% of the Italian words still need to be annotated (see the annotation coverage of 81.0%). We think that, given the precision and coverage rates obtained from the evaluation, the corpus as it results from the automatic procedure can be profitably used.</Paragraph> <Paragraph position="14"> However, even if a manual revision is envisaged, we think that hand-checking the automatically tagged corpus and manually annotating the remaining 19% is still cost-effective compared with annotating the corpus from scratch.</Paragraph> </Section> </Section> <Section position="7" start_page="1" end_page="1" type="metho"> <SectionTitle> 5 The MultiSemCor Corpus Up to Now </SectionTitle> <Paragraph position="0"> We are currently working on the extensive application of the annotation transfer methodology for the creation of the MultiSemCor corpus. Up to now, MultiSemCor is composed of 29 English texts aligned at the word level with their corresponding Italian translations. Both source and target texts are annotated with POS, lemma, and word sense. More specifically, as regards English we have 55,935 running words among which 29,655 words are semantically annotated (from SemCor). As for Italian, the corpus amounts to 59,726 running words among which 23,095 words are annotated with word senses that have been automatically transferred from English.</Paragraph> <Paragraph position="1"> MultiSemCor can be a useful resource for a variety of tasks, both as a monolingual semantically annotated corpus and as a parallel aligned corpus. As an example, we are already using it to automatically enrich the Italian component of MultiWordNet, the reference lexicon of MultiSemCor. Indeed, out of the 23,095 Italian words automatically sense-tagged, 5,292 are not yet present in MultiWordNet and will be added to it. Moreover, the Italian component of MultiSemCor is being used as a gold standard for the evaluation of Word Sense Disambiguation systems working on Italian. Besides NLP applications, MultiSemCor can also be consulted by humans through a Web interface (Ranieri et al., 2004) which is available at: http://tcc.itc.it/projects/multisemcor.</Paragraph> </Section> </Paper>