<?xml version="1.0" standalone="yes"?>
<Paper uid="N04-1022">
<Title>Minimum Bayes-Risk Decoding for Statistical Machine Translation</Title>
<Section position="6" start_page="0" end_page="0" type="concl">
<SectionTitle> 5 Discussion </SectionTitle>
<Paragraph position="0"> We have described the formulation of Minimum Bayes-Risk decoders for machine translation. This is a general framework that allows us to build special-purpose decoders from general-purpose models. The procedure directly minimizes the expected risk of translation errors under a given loss function. In this paper we have focused on two situations where this framework could be applied.</Paragraph>
<Paragraph position="1"> Given an MT evaluation metric of interest such as BLEU, PER, or WER, we can use this metric as a loss function within the MBR framework to design decoders optimized for that evaluation criterion. In particular, MBR decoding under the BLEU loss function can yield further improvements on top of MAP decoding.</Paragraph>
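As a rough, non-authoritative sketch of the decision rule summarized above: the decoder selects the candidate translation with minimum expected loss under the model posterior, here approximated over an N-best list. The function names, scaling parameter, and toy loss below are illustrative assumptions, not taken from the paper; in practice the loss would be 1 - BLEU, PER, WER, or the Bitree loss.

```python
# Minimal sketch of MBR rescoring of an N-best list under a generic
# sentence-level loss (illustrative only; not the paper's implementation).
import math
from typing import Callable, List, Tuple

def mbr_decode(nbest: List[Tuple[str, float]],
               loss: Callable[[str, str], float],
               scale: float = 1.0) -> str:
    """Return the hypothesis with minimum expected loss under the model posterior."""
    # Turn (scaled) log-scores into a normalized posterior over the N-best list,
    # which serves as both the evidence space and the hypothesis space.
    log_scores = [scale * s for _, s in nbest]
    m = max(log_scores)
    weights = [math.exp(s - m) for s in log_scores]
    z = sum(weights)
    posterior = [w / z for w in weights]

    best_hyp, best_risk = None, float("inf")
    for hyp, _ in nbest:  # candidate translation e'
        # Expected loss of e' under P(e | f), approximated over the N-best list.
        risk = sum(p * loss(evidence, hyp)
                   for (evidence, _), p in zip(nbest, posterior))
        if risk < best_risk:
            best_hyp, best_risk = hyp, risk
    return best_hyp

# Toy usage with a hypothetical word-mismatch loss; a real system would plug in
# 1 - BLEU or another sentence-level evaluation metric here.
def toy_loss(ref: str, hyp: str) -> float:
    r, h = ref.split(), hyp.split()
    return sum(a != b for a, b in zip(r, h)) + abs(len(r) - len(h))

nbest = [("the cat sat on the mat", -1.2),
         ("the cat sat at the mat", -1.5),
         ("a cat is on the mat", -1.9)]
print(mbr_decode(nbest, toy_loss))
```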
<Paragraph position="2"> Suppose we are interested in improving the syntactic structure of automatic translations and would like to use an existing statistical MT system that is trained without any linguistic features. We have shown how MBR decoding can be applied to the MT system in such a situation.</Paragraph>
<Paragraph position="3"> This can be done by designing translation loss functions from varied linguistic analyses. We have shown the construction of a Bitree loss function that compares the parse-trees of any two translations using alignments with respect to a parse-tree for the source sentence. The loss function therefore avoids the problem of unconstrained tree-to-tree alignment. Using an example, we have shown that this loss function can measure qualities of translation that string-based (and n-gram-based) metrics cannot capture. The MBR decoder under this loss function gives improvements under an evaluation metric based on the loss function.</Paragraph>
<Paragraph position="4"> We have presented results under the Bitree loss function as an example of incorporating linguistic information into a loss function; we have not yet measured its correlation with human assessments of translation quality. This loss function allows us to integrate syntactic structure into the statistical MT framework without building detailed models of syntactic features and retraining models from scratch. However, we emphasize that the MBR techniques do not preclude the construction of complex models of syntactic structure. Translation models that have been trained with linguistic features could still benefit from the application of MBR decoding procedures.</Paragraph>
<Paragraph position="5"> That machine translation evaluation continues to be an active area of research is evident from recent workshops (AMTA, 2003). We expect new automatic MT evaluation metrics to emerge frequently in the future.</Paragraph>
<Paragraph position="6"> Given any translation metric, the MBR decoding framework allows us to optimize existing MT systems for the new criterion. This is intended to compensate for any mismatch between the decoding strategy of MT systems and their evaluation criteria. While we have focused on developing MBR procedures for loss functions that measure various aspects of translation quality, this framework can also be used with loss functions that measure application-specific error criteria.</Paragraph>
<Paragraph position="7"> We now describe related training and search procedures for NLP that explicitly take task-specific performance metrics into consideration. Och (2003) developed a training procedure that incorporates various MT evaluation criteria into the training of log-linear MT models. Foster et al. (2002) developed a text-prediction system for translators that maximizes the expected benefit to the translator under a statistical user model. In parsing, Goodman (1996) developed parsing algorithms that are appropriate for specific parsing metrics. There has also been recent work that combines 1-best hypotheses from multiple translation systems (Bangalore et al., 2002); this approach uses string-edit distance to align the hypotheses and rescores the resulting lattice with a language model.</Paragraph>
<Paragraph position="8"> In future work we plan to extend the search space of MBR decoders to translation lattices produced by the baseline system. Translation lattices (Ueffing et al., 2002; Kumar and Byrne, 2003) are a compact representation of a large set of the most likely translations generated by an MT system. While an N-best list contains only a limited re-ordering of hypotheses, a translation lattice will contain hypotheses with a vastly greater number of re-orderings.</Paragraph>
<Paragraph position="9"> We are developing efficient lattice search procedures for MBR decoders. By extending the search space of the decoder to a much larger space than the N-best list, we expect further performance improvements.</Paragraph>
<Paragraph position="10"> MBR is a promising modeling framework for statistical machine translation. It is a simple model rescoring framework that improves well-trained statistical models by tuning them for particular criteria. These criteria could come from evaluation metrics or from other desiderata (such as syntactic well-formedness) that we wish to see in automatic translations.</Paragraph>
</Section>
</Paper>