<?xml version="1.0" standalone="yes"?>
<Paper uid="H01-1033">
  <Title>Improved Cross-Language Retrieval using Backoff Translation</Title>
  <Section position="2" start_page="3" end_page="3" type="intro">
    <SectionTitle>
1. INTRODUCTION
</SectionTitle>
    <Paragraph position="0"> The effectiveness of a broad class of cross-language information retrieval (CLIR) techniques that are based on term-by-term translation depends on the coverage and accuracy of the available translation lexicon(s). Two types of translation lexicons are commonly used, one based on translation knowledge extracted from bilingual dictionaries [1] and the other based on translation knowledge extracted from bilingual corpora [8]. Dictionaries provide reliable evidence, but often lack translation preference information. Corpora, by contrast, are often a better source for translations of slang or newly coined terms, but the statistical analysis through which the translations are extracted sometimes produces erroneous results. In this paper we explore the question of how best to combine evidence from these two sources.</Paragraph>
  </Section>
</Paper>