<?xml version="1.0" standalone="yes"?> <Paper uid="N03-2006"> <Title>Adaptation Using Out-of-Domain Corpus within EBMT</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Adaptation Methods </SectionTitle> <Paragraph position="0"> EBMT (Nagao, 1984) retrieves the translation examples that are most similar to an input expression and adjusts the examples to obtain the translation. The EBMT system in our approach retrieves not only in-domain examples, but also out-of-domain examples.</Paragraph> <Paragraph position="1"> When using out-of-domain examples, suitability to the target domain is considered. We tried the following three types of adaptation methods.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> (1) Merging equally </SectionTitle> <Paragraph position="0"> An in-domain corpus and an out-of-domain corpus are simply merged and used without distinction.</Paragraph> <Paragraph position="1"> (2) Merging with preference for in-domain corpus An in-domain corpus and an out-of-domain corpus are merged. However, when multiple examples with the same similarity are retrieved, the in-domain examples are used.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> (3) Using LM </SectionTitle> <Paragraph position="0"> Beforehand, we make an LM of an in-domain target language corpus and, according to the LM, assign a probability to the target sentence of each out-of-domain example.</Paragraph> <Paragraph position="1"> In the example retrieval phase of the EBMT system, two types of examples are handled differently.</Paragraph> <Paragraph position="2"> (3-1) From in-domain examples, the most similar examples are retrieved.</Paragraph> <Paragraph position="3"> (3-2) From out-of-domain examples, not only the most similar examples but also other examples that are nearly as similar are retrieved. In the retrieved examples, examples with the highest probabilities of their target sentences by the LM are selected.</Paragraph> <Paragraph position="4"> (3-3) From the results of both (3-1) and (3-2), the most similar examples are selected. Examples of (3-1) are used when the similarities are equal to each other.</Paragraph> </Section> <Section position="6" start_page="0" end_page="3" type="metho"> <SectionTitle> 3 Translation Experiments </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="3" type="sub_section"> <SectionTitle> 3.1 Conditions </SectionTitle> <Paragraph position="0"> In order to evaluate the adaptability of an EBMT with out-of-domain examples, we applied the methods described in Section 2 to the EBMT and evaluated the translation quality in Japanese-to-English translation.</Paragraph> <Paragraph position="1"> We used an EBMT, DP-match Driven transDucer (D , Sumita, 2001) as a test bed.</Paragraph> <Paragraph position="2"> We used two Japanese-and-English bilingual corpora. In this experiment on adaptation, as an out-of-domain corpus, we used Basic Travel Expression Corpus (BTEC, described as BE-corpus in Takezawa, 2002); as an in-domain corpus, we used a telephone conversation corpus (TEL). The statistics of the corpora are shown in Table 1. TEL is split into two parts: a test set of 1,653 sentence pairs and a training set of 9,918. 
<Section position="6" start_page="0" end_page="3" type="metho"> <SectionTitle> 3 Translation Experiments </SectionTitle> <Paragraph position="0"/>
<Section position="1" start_page="0" end_page="3" type="sub_section"> <SectionTitle> 3.1 Conditions </SectionTitle>
<Paragraph position="0"> In order to evaluate the adaptability of EBMT with out-of-domain examples, we applied the methods described in Section 2 to the EBMT system and evaluated the translation quality in Japanese-to-English translation.</Paragraph>
<Paragraph position="1"> We used an EBMT system, DP-match Driven transDucer (D³; Sumita, 2001), as a test bed.</Paragraph>
<Paragraph position="2"> We used two Japanese-English bilingual corpora. In this adaptation experiment, as the out-of-domain corpus we used the Basic Travel Expression Corpus (BTEC, described as the BE-corpus in Takezawa, 2002), and as the in-domain corpus we used a telephone conversation corpus (TEL). The statistics of the corpora are shown in Table 1. TEL is split into two parts: a test set of 1,653 sentence pairs and a training set of 9,918 pairs. Perplexities reveal the large difference between the in-domain and out-of-domain corpora.</Paragraph>
<Paragraph position="3"> Translation quality was evaluated by the BLEU score (Papineni, 2001) and the NIST score (Doddington, 2002). Both metrics compare the system output with a set of reference translations of the same source text by finding sequences of words in the references that match those in the output. We used the English sentence corresponding to each input Japanese sentence in the test set as the reference translation. Therefore, a better score means that the translation results can be regarded as more adequate translations for the domain.</Paragraph>
<Paragraph position="4"> In order to simulate the incremental expansion of an in-domain bilingual corpus and to observe the relationship between corpus size and translation quality, translations were performed with subsets of the training corpus of 0, 1,000, ..., 5,000, and 9,918 sentence pairs, built by incrementally adding randomly selected examples from the training set.</Paragraph>
<Paragraph position="5"> The LM of the domain's target language was a word trigram model of the English sentences of the TEL training set. We tried two amounts of training data in building the LM: 1) the entire training set, and 2) only the part of the set used as translation examples, following the sizes above.</Paragraph> </Section>
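The paper specifies a word trigram model but not its smoothing scheme, so the sketch below uses simple linear interpolation as an illustrative stand-in; the interpolation weights, the BOS/EOS boundary markers, and the assumption of pre-tokenized sentences are all hypothetical choices rather than the authors' setup. Its logprob() method could serve as the lm_logprob() scorer in the earlier sketch.

    import math
    from collections import Counter

    class TrigramLM:
        """Word trigram LM with linear interpolation (an assumed smoothing)."""

        def __init__(self, sentences, l3=0.7, l2=0.2, l1=0.1):
            self.l3, self.l2, self.l1 = l3, l2, l1
            self.tri = Counter()  # c(u, v, w)
            self.bi = Counter()   # c(v, w); also the trigram history c(u, v)
            self.uni = Counter()  # c(w)
            self.total = 0
            for words in sentences:  # each sentence is a list of word strings
                toks = ["BOS", "BOS"] + words + ["EOS"]
                for i, w in enumerate(toks):
                    self.uni[w] += 1
                    self.total += 1
                    if i >= 1:
                        self.bi[(toks[i - 1], w)] += 1
                    if i >= 2:
                        self.tri[(toks[i - 2], toks[i - 1], w)] += 1

        def prob(self, u, v, w):
            p3 = self.tri[(u, v, w)] / self.bi[(u, v)] if self.bi[(u, v)] else 0.0
            p2 = self.bi[(v, w)] / self.uni[v] if self.uni[v] else 0.0
            p1 = self.uni[w] / self.total
            return self.l3 * p3 + self.l2 * p2 + self.l1 * p1

        def logprob(self, words):
            """Log probability of a tokenized target sentence under the LM."""
            toks = ["BOS", "BOS"] + words + ["EOS"]
            return sum(math.log(max(self.prob(toks[i - 2], toks[i - 1], toks[i]),
                                    1e-12))  # floor avoids log(0) on unseen words
                       for i in range(2, len(toks)))

For instance, scoring the English side of each out-of-domain pair with TrigramLM(tel_english).logprob(target), where tel_english is a hypothetical name for the tokenized English half of the TEL training set, would implement the probability assignment described in method (3).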
<Section position="2" start_page="3" end_page="3" type="sub_section"> <SectionTitle> 3.2 Results </SectionTitle>
<Paragraph position="0"> Table 2 shows the BLEU scores from the translation experiment, which reveal clear tendencies. In general, using more in-domain examples steadily improves the scores. The score when using 4,000 in-domain examples exceeded that when using 152,172 out-of-domain examples. Equal merging outperformed using only out-of-domain examples.</Paragraph>
<Paragraph position="1"> Merging with in-domain preference outperformed equal merging, and using the LM outperformed merging with in-domain preference. Comparing the two LM conditions, the LM built from the entire training set obtained slightly better scores than the other, which suggests that a larger corpus yields a better LM. All of the adaptation methods are more effective when the available in-domain corpus is smaller. When no in-domain examples were used, the effect of the LM built from the entire training set was relatively large.</Paragraph>
<Paragraph position="2"> Table 3 shows the NIST scores for the same experiment. We observe the same tendencies as for the BLEU scores, except that the advantage of the LM built from the entire training set over the one built from a partial set was not observed.</Paragraph> </Section> </Section>
<Section position="7" start_page="3" end_page="3" type="metho"> <SectionTitle> 4 Conclusion and Future Work </SectionTitle>
<Paragraph position="0"> A corpus-based approach can quickly build a machine translation system for a new domain if a bilingual corpus of that domain is available. However, if only a small corpus is available, translation quality is low. To boost performance, this paper explored several methods that use out-of-domain data. The experimental results demonstrated the effect of using an out-of-domain corpus under two evaluation measures, the BLEU score and the NIST score.</Paragraph>
<Paragraph position="1"> We also showed the possibility of increasing translation quality by using an LM of the domain's target language. However, the gains from the LM in the evaluation scores were not significant. We must continue experiments with other corpora and under various conditions. In addition, although we have implicitly assumed a high-quality in-domain corpus, we would next like to investigate using a low-quality corpus.</Paragraph> </Section>
<Section position="8" start_page="3" end_page="3" type="metho"> <SectionTitle> Acknowledgements </SectionTitle>
<Paragraph position="0"> The research reported here was supported in part by a contract with the Telecommunications Advancement Organization of Japan entitled "A study of speech dialogue translation technology based on a large corpus".</Paragraph> </Section> </Paper>