<?xml version="1.0" standalone="yes"?> <Paper uid="N04-1022"> <Title>Minimum Bayes-Risk Decoding for Statistical Machine Translation</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction Statistical Machine Translation systems have achieved </SectionTitle> <Paragraph position="0"> considerable progress in recent years as seen from their performance on international competitions in standard evaluation tasks (NIST, 2003). This rapid progress has been greatly facilitated by the development of automatic translation evaluation metrics such as BLEU score (Papineni et al., 2001), NIST score (Doddington, 2002) and Position Independent Word Error Rate (PER) (Och, 2002). However, given the many factors that influence translation quality, it is unlikely that we will find a single translation metric that will be able to judge all these factors. For example, the BLEU, NIST and the PER metrics, a3 This work was supported by the National Science Foundation under Grant No. 0121285 and an ONR MURI Grant N00014-01-1-0685. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation or the Office of Naval Research.</Paragraph> <Paragraph position="1"> though effective, do not take into account explicit syntactic information when measuring translation quality.</Paragraph> <Paragraph position="2"> Given that different Machine Translation (MT) evaluation metrics are useful for capturing different aspects of translation quality, it becomes desirable to create MT systems tuned with respect to each individual criterion. In contrast, the maximum likelihood techniques that underlie the decision processes of most current MT systems do not take into account these application specific goals. We apply the Minimum Bayes-Risk (MBR) techniques developed for automatic speech recognition (Goel and Byrne, 2000) and bitext word alignment for statistical MT (Kumar and Byrne, 2002), to the problem of building automatic MT systems tuned for specific metrics. This is a framework that can be used with statistical models of speech and language to develop decision processes optimized for specific loss functions.</Paragraph> <Paragraph position="3"> We will show that MBR decoding can be applied to machine translation in two scenarios. Given an automatic MT metric, we design a loss function based on the metric and use MBR decoding to tune MT performance under the metric. We also show how MBR decoding can be used to incorporate syntactic structure into a statistical MT system by building specialized loss functions. These loss functions can use information from word strings, word-to-word alignments and parse-trees of the source sentence and its translation. In particular we describe the design of a Bilingual Tree Loss Function that can explicitly use syntactic structure for measuring translation quality. MBR decoding under this loss function allows us to integrate syntactic knowledge into a statistical MT system without building detailed models of linguistic features, and retraining the system from scratch.</Paragraph> <Paragraph position="4"> We first present a hierarchy of loss functions for translation based on different levels of lexical and syntactic information from source and target language sentences.</Paragraph> <Paragraph position="5"> This hierarchy includes the loss functions useful in both situations where we intend to apply MBR decoding. 
<Paragraph position="3"> We will show that MBR decoding can be applied to machine translation in two scenarios. First, given an automatic MT evaluation metric, we design a loss function based on that metric and use MBR decoding to tune MT performance under it. Second, we show how MBR decoding can be used to incorporate syntactic structure into a statistical MT system by building specialized loss functions. These loss functions can use information from word strings, word-to-word alignments and parse trees of the source sentence and its translation. In particular, we describe the design of a Bilingual Tree Loss Function that explicitly uses syntactic structure to measure translation quality. MBR decoding under this loss function allows us to integrate syntactic knowledge into a statistical MT system without building detailed models of linguistic features or retraining the system from scratch.</Paragraph> <Paragraph position="4"> We first present a hierarchy of loss functions for translation based on different levels of lexical and syntactic information from the source and target language sentences.</Paragraph> <Paragraph position="5"> This hierarchy includes loss functions useful in both scenarios in which we intend to apply MBR decoding. We then present the MBR framework for statistical machine translation under the various translation loss functions.</Paragraph> <Paragraph position="6"> We finally report the performance of MBR decoders optimized for each loss function.</Paragraph> </Section> </Paper>