<?xml version="1.0" standalone="yes"?> <Paper uid="E06-1006"> <Title>Phrase-Based Backoff Models for Machine Translation of Highly Inflected Languages</Title> <Section position="9" start_page="46" end_page="46" type="concl"> <SectionTitle> 8 Conclusions </SectionTitle> <Paragraph position="0"> We have presented a backoff model for phrase-based SMT that uses morphological abstractions to translate unseen word forms in the foreign-language input. When no match for an unknown test-set word can be found in the trained phrase table, the model relies instead on translation probabilities derived from stemmed or split versions of the word in its phrasal context. An evaluation of the model on German-English and Finnish-English translations of parliamentary proceedings showed statistically significant improvements in PER for almost all training conditions, and significant improvements in BLEU when the training set is small (100K words), with larger improvements for Finnish than for German. These results indicate that our method is mainly relevant for highly inflected languages and sparse training data conditions. The method is also designed to improve human acceptance of machine translation output, which is particularly adversely affected by untranslated words.</Paragraph> </Section> </Paper>