<?xml version="1.0" standalone="yes"?>
<Paper uid="N03-2006">
<Title>Adaptation Using Out-of-Domain Corpus within EBMT</Title>
<Section position="2" start_page="0" end_page="0" type="intro">
<SectionTitle> 1 Introduction </SectionTitle>
<Paragraph position="0"> Example-Based Machine Translation (EBMT) is adaptable to new domains: by simply preparing a bilingual corpus of a new domain, one obtains a translation system for that domain. However, if only a small corpus is available, translation quality is low. We explored methods to boost translation quality based on a small in-domain bilingual corpus. Among these methods, we use an out-of-domain bilingual corpus and, in addition, the language model (LM) of an in-domain monolingual corpus. For LM accuracy, a larger training set is better; the training set is a target-language corpus, which can be prepared far more easily than a bilingual corpus.</Paragraph>
<Paragraph position="1"> In prior work, statistical machine translation (Brown, 1993) used not only an LM but also a translation model. However, building a translation model requires a bilingual corpus. On the other hand, some studies on multiple-translation selection use the LM of the target language to calculate translation scores (Kaki, 1999; Callison-Burch, 2001). For adaptation, we use the LM of an in-domain target-language corpus.</Paragraph>
<Paragraph position="2"> In the following sections, we describe the methods that use an out-of-domain bilingual corpus and an in-domain monolingual corpus, and we report on our experiments.</Paragraph>
</Section>
</Paper>
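The multiple-translation-selection idea cited above (scoring candidate translations with a target-language LM and keeping the highest-scoring one) can be sketched as follows. This is a minimal illustration, not the cited systems' implementation: the toy corpus, the bigram model with add-one smoothing, and the function names are all assumptions introduced for the example.

```python
import math
from collections import Counter

def train_bigram_lm(sentences):
    """Train a bigram LM with add-one smoothing; return a log-prob scorer."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        toks = ["<s>"] + sent + ["</s>"]
        unigrams.update(toks)
        bigrams.update(zip(toks, toks[1:]))
    vocab = len(unigrams)

    def logprob(sent):
        toks = ["<s>"] + sent + ["</s>"]
        return sum(
            math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
            for a, b in zip(toks, toks[1:])
        )
    return logprob

# Toy in-domain monolingual (target-language) corpus -- illustrative only.
corpus = [
    "the reservation is confirmed".split(),
    "your reservation is confirmed".split(),
    "the room is ready".split(),
]
score = train_bigram_lm(corpus)

# Hypothetical candidate translations from an EBMT system; the LM prefers
# the candidate whose word order matches the in-domain corpus.
candidates = [
    "the reservation is confirmed".split(),
    "confirmed is the reservation".split(),
]
best = max(candidates, key=score)
print(" ".join(best))
```

A larger in-domain monolingual corpus sharpens these scores, which is the motivation the paragraph gives for preferring a target-language training set over a bilingual one.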