<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-1067">
<Title>Sydney, July 2006. (c) 2006 Association for Computational Linguistics. Distortion Models For Statistical Machine Translation</Title>
<Section position="3" start_page="0" end_page="0" type="intro">
<SectionTitle> 1 Introduction </SectionTitle>
<Paragraph position="0"> A language model is a statistical model that gives a probability distribution over possible sequences of words. It computes the probability of producing a given word w_k given all the words that precede it in the sentence. An n-gram language model is an (n-1)-th order Markov model in which the probability of generating a given word depends only on the n-1 words immediately preceding it, and is given by the following equation:</Paragraph>
<Paragraph position="1"> p(w_k | w_1, ..., w_{k-1}) ≈ p(w_k | w_{k-n+1}, ..., w_{k-1}) </Paragraph>
<Paragraph position="2"> where k >= n.</Paragraph>
<Paragraph position="3"> N-gram language models have been used successfully in Automatic Speech Recognition (ASR), as first proposed by Bahl et al. (1983). They play an important role in selecting among several candidate word realizations of a given acoustic signal. N-gram language models have also been used in Statistical Machine Translation (SMT), as proposed by Brown et al. (1990, 1993). The run-time search procedure used to find the most likely translation (or transcription, in the case of speech recognition) is typically referred to as decoding.</Paragraph>
<Paragraph position="4"> There is a fundamental difference between decoding for machine translation and decoding for speech recognition. When decoding a speech signal, words are generated in the same order in which their corresponding acoustic signal is consumed. However, this is not necessarily the case in MT, because different languages have different word order requirements. For example, in Spanish and Arabic adjectives are mainly noun post-modifiers, whereas in English adjectives are noun pre-modifiers.
Therefore, when translating between Spanish and English, words must usually be reordered. Existing statistical machine translation decoders have mostly relied on language models to select the proper word order among the many possible choices when translating between two languages. In this paper, we argue that a language model is not sufficient to adequately address this issue, especially when translating between languages that have very different word orders, as suggested by our experimental results in Section 5.</Paragraph>
<Paragraph position="5"> We propose a new distortion model that can be used as an additional component in SMT decoders. This new model leads to significant improvements in MT quality as measured by BLEU (Papineni et al., 2002).</Paragraph>
<Paragraph position="6"> The experimental results we report in this paper are for Arabic-English machine translation of news stories.</Paragraph>
<Paragraph position="7"> We also present a novel method for measuring word order similarity (or difference) between any given pair of languages, based on word alignments, as described in Section 3.</Paragraph>
<Paragraph position="8"> The rest of this paper is organized as follows. Section 2 presents a review of related work. In Section 3, we propose a method for measuring the distortion between any given pair of languages. In Section 4, we present our proposed distortion model. In Section 5, we present empirical results that show the utility of our distortion model for statistical machine translation systems. Finally, we conclude with a discussion in Section 6.</Paragraph>
</Section>
</Paper>