<?xml version="1.0" standalone="yes"?> <Paper uid="P99-1028"> <Title>Resolving Translation Ambiguity and Target Polysemy in Cross-Language Information Retrieval</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1. Introduction </SectionTitle> <Paragraph position="0"> Cross-language information retrieval (CLIR) (Oard and Dorr, 1996; Oard, 1997) deals with the use of queries in one language to access documents in another. Because the source and target languages differ, query translation is usually employed to unify the language of queries and documents. In query translation, translation ambiguity is a basic problem to be resolved. A word in a source query may have more than one sense. Word sense disambiguation identifies the correct sense of each source word, and lexical selection translates it into the corresponding target word. This procedure is similar to the lexical choice operation in a traditional machine translation (MT) system. However, there is a significant difference between the applications of MT and CLIR. In MT, readers interpret the translated results. If the target word has more than one sense, readers can disambiguate its meaning automatically. In CLIR, by contrast, the translated result is sent to a monolingual information retrieval system. Target polysemy then adds extraneous senses and degrades retrieval performance.</Paragraph> <Paragraph position="1"> Several approaches have been proposed for query translation. Dictionary-based approaches exploit machine-readable dictionaries and selection strategies such as select all (Hull and Grefenstette, 1996; Davis, 1997), randomly select N (Ballesteros and Croft, 1996; Kwok, 1997) and select best N (Hayashi, Kikui and Susaki, 1997; Davis, 1997). Corpus-based approaches exploit sentence-aligned corpora (Davis and Dunning, 1996) and document-aligned corpora (Sheridan and Ballerini, 1996). These two approaches are complementary.
A dictionary provides translation candidates, and a corpus provides the context needed to fit the user's intention. Dictionary coverage, alignment performance and corpus domain shift are the major problems of these two approaches. Hybrid approaches (Ballesteros and Croft, 1998; Bian and Chen, 1998; Davis, 1997) integrate both lexical and corpus knowledge.</Paragraph> <Paragraph position="2"> All the above approaches deal with the translation ambiguity problem in query translation. Few address translation ambiguity and target polysemy together. This paper studies the multiplication effects of translation ambiguity and target polysemy in cross-language information retrieval systems, and proposes a new translation method to resolve these problems. Section 2 shows the effects of translation ambiguity and target polysemy in Chinese-English and English-Chinese information retrieval. Section 3 presents several models to resolve the translation ambiguity and target polysemy problems.</Paragraph> <Paragraph position="3"> Section 4 presents the experimental results and compares the performance of the proposed models. Section 5 gives concluding remarks.</Paragraph> </Section> </Paper>