<?xml version="1.0" standalone="yes"?> <Paper uid="C04-1053"> <Title>Evaluating Cross-Language Annotation Transfer in the MultiSemCor Corpus</Title> <Section position="4" start_page="1" end_page="1" type="relat"> <SectionTitle> 2.1 Related Work </SectionTitle> <Paragraph position="0"> The idea of obtaining linguistic information about a text in one language by exploiting parallel or comparable texts in another language has been explored in the field of Word Sense Disambiguation (WSD) since the early 1990s, the most representative works being (Brown et al., 1991), (Gale et al., 1992), and (Dagan and Itai, 1994).</Paragraph> <Paragraph position="1"> More recently, Ide et al. (2002) present a method to identify word meanings starting from a multilingual corpus. A by-product of applying this method is that once a word in one language is word-sense tagged, its translation equivalents in the parallel texts are also automatically annotated. Cross-language tagging is the goal of the work by Diab and Resnik (2002), who present a method for word sense tagging both the source and target texts of parallel bilingual corpora with the WordNet sense inventory.</Paragraph> <Paragraph position="2"> Alongside these studies on the projection of semantic information, the NLP community has also explored the possibility of exploiting translation to project more syntax-oriented annotations. Yarowsky et al. (2001) describe a successful method consisting of (i) automatic annotation of English texts, (ii) cross-language projection of annotations onto target language texts, and (iii) induction of noise-robust taggers for the target language. A further step is made by Hwa et al. (2002) and Cabezas et al. (2001), who address the task of acquiring a dependency treebank by bootstrapping from existing linguistic resources for English. 
Finally, Riloff et al. (2002) present a method for rapidly creating Information Extraction (IE) systems for new languages by exploiting existing IE systems via cross-language projection.</Paragraph> <Paragraph position="3"> The results of all the above-mentioned studies show how previous major investments in English annotated corpora and tool development can be effectively leveraged across languages, allowing the development of accurate resources and tools in other languages without comparable human effort.</Paragraph> </Section> </Paper>