<?xml version="1.0" standalone="yes"?> <Paper uid="P02-1044"> <Title>Word Translation Disambiguation Using Bilingual Bootstrapping</Title> Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 343-351. <Section position="3" start_page="1" end_page="1" type="intro"> <SectionTitle> 2 Related Work </SectionTitle> <Paragraph position="0"> The problem of word translation disambiguation (more generally, word sense disambiguation) can be viewed as a classification problem and can be addressed with a supervised learning method. In such a method, for instance, an English sentence containing an ambiguous English word corresponds to an example, and the Chinese translation of the word in that context corresponds to a classification decision (a label).</Paragraph> <Paragraph position="1"> Many methods for word sense disambiguation using supervised learning techniques have been proposed. They include those using Naive Bayes (Gale et al. 1992a), Decision List (Yarowsky 1994), Nearest Neighbor (Ng and Lee 1996), Transformation-Based Learning (Mangu and Brill 1997), Neural Network (Towell and Voorhees 1998), Winnow (Golding and Roth 1999), Boosting (Escudero et al. 2000), and Naive Bayesian Ensemble (Pedersen 2000).</Paragraph> <Paragraph position="2"> In this paper, we take English-Chinese translation as an example; it is relatively easy, however, to extend the discussion to translation between other language pairs.</Paragraph> <Paragraph position="3"> Among these methods, the one using the Naive Bayesian Ensemble (i.e., an ensemble of Naive Bayes classifiers) is reported to perform best for word sense disambiguation on a benchmark data set (Pedersen 2000).</Paragraph> <Paragraph position="4"> The assumption behind these methods is that it is nearly always possible to determine the translation of a word by referring to its context; thus, all of the methods build a classifier (i.e., a classification program) using features that represent context information (e.g., co-occurring words).</Paragraph> <Paragraph position="5"> Since preparing supervised learning data is expensive (in many cases, manual labeling is required), it is desirable to develop a bootstrapping method that starts learning with a small number of classified examples but can still achieve high performance with the help of a large number of unclassified examples, which are inexpensive to obtain.</Paragraph> <Paragraph position="6"> Yarowsky (1995) proposes a method for word sense disambiguation based on Monolingual Bootstrapping. When applied to our current task, his method starts learning with a small number of English sentences that contain an ambiguous English word, each labeled with the correct Chinese translation of the word. It then uses the classified sentences as training data to learn a classifier (e.g., a decision list) and uses the learned classifier to classify some unclassified sentences containing the ambiguous word, adding them to the training data. It also adopts the heuristic of 'one sense per discourse' (Gale et al. 1992b) to classify further unclassified sentences. By repeating these steps, the method can create an accurate classifier for word translation disambiguation.</Paragraph> <Paragraph position="7"> For other related work, see, for example, (Brown et al.
1991; Dagan and Itai 1994; Pedersen and Bruce 1997; Schütze 1998; Kikui 1999; Mihalcea and Moldovan 1999).</Paragraph> </Section> </Paper>