<?xml version="1.0" standalone="yes"?>
<Paper uid="N04-1033">
  <Title>Improvements in Phrase-Based Statistical Machine Translation</Title>
  <Section position="11" start_page="0" end_page="0" type="concl">
    <SectionTitle>
10 Conclusions
</SectionTitle>
    <Paragraph position="0"> We described a phrase-based translation approach. The basic idea of this approach is to remember all bilingual phrases that have been seen in the word-aligned training corpus. As refinements of the baseline model, we described two simple heuristics: the word penalty feature and the phrase penalty feature. Additionally, we presented a single-word based lexicon with two smoothing methods. The model scaling factors were optimized with respect to the mWER on the development corpus.</Paragraph>
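As an illustration of how the two penalty heuristics could enter a log-linear model with scaling factors, the following is a minimal sketch. It assumes that the word penalty is the number of target words and the phrase penalty is the number of phrases used, which this section does not spell out. The function names, feature names, and weight values are hypothetical.

```python
def penalty_features(target_phrases):
    """Assumed definitions: word penalty counts target words,
    phrase penalty counts the phrases in the segmentation."""
    return {
        "word_penalty": sum(len(p.split()) for p in target_phrases),
        "phrase_penalty": len(target_phrases),
    }


def loglinear_score(features, weights):
    """Weighted sum of feature values; the weights play the role of the
    model scaling factors tuned on a development corpus."""
    return sum(weights[name] * value for name, value in features.items())


# Hypothetical usage with made-up scaling factors.
feats = penalty_features(["the house", "is small"])
score = loglinear_score(feats, {"word_penalty": -0.5, "phrase_penalty": 1.0})
```

In a full system these two features would be summed together with the phrase translation, lexicon, and language model scores, which are omitted here.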
    <Paragraph position="1"> We described a highly efficient monotone search algorithm. The worst-case complexity of this algorithm is linear in the sentence length. This leads to an impressive translation speed of more than 1000 words per second for the Verbmobil task and for the Xerox task. Even for the Canadian Hansards task the translation of sentences of length 30 takes only about 1.5 seconds.</Paragraph>
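The linear complexity claim can be illustrated with a small dynamic program over source positions. This is a simplified sketch and not the search implementation described in the paper: it assumes a toy phrase table mapping source word tuples to scored target phrases, a fixed maximum phrase length, and no language model. The names monotone_search, phrase_table, and max_phrase_len are hypothetical.

```python
import math


def monotone_search(source, phrase_table, max_phrase_len=4):
    """Best-scoring monotone phrase segmentation of `source`.

    phrase_table maps a tuple of source words to a list of
    (target_phrase, log_score) pairs. Returns (translation, score),
    or None if the sentence cannot be covered.
    """
    n = len(source)
    best = [-math.inf] * (n + 1)   # best[j]: best score covering source[:j]
    back = [None] * (n + 1)        # back[j]: (start position, target phrase)
    best[0] = 0.0
    for j in range(1, n + 1):
        # Only the last max_phrase_len start positions are considered,
        # so the work per position is bounded and the total is linear in n.
        for k in range(1, min(max_phrase_len, j) + 1):
            i = j - k
            if best[i] == -math.inf:
                continue
            for target, score in phrase_table.get(tuple(source[i:j]), ()):
                cand = best[i] + score
                if cand > best[j]:
                    best[j] = cand
                    back[j] = (i, target)
    if best[n] == -math.inf:
        return None
    # Follow the back pointers from the end to recover the phrase sequence.
    out = []
    j = n
    while j > 0:
        i, target = back[j]
        out.append(target)
        j = i
    return " ".join(reversed(out)), best[n]


# Hypothetical toy phrase table with log scores.
table = {
    ("das",): [("the", -0.5), ("that", -1.0)],
    ("haus",): [("house", -0.3)],
    ("das", "haus"): [("the house", -0.6)],
}
result = monotone_search(["das", "haus"], table)
```

Because target phrases are emitted in source order, the only search state in this sketch is the source position. Adding an n-gram language model would extend the state by the language model history, but the number of source positions visited would still grow linearly with the sentence length.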
    <Paragraph position="2"> The described search is monotone at the phrase level.</Paragraph>
    <Paragraph position="3"> Within the phrases, there are no constraints on the reorderings. Therefore, this method is best suited for language pairs that have a similar order at the level of the phrases learned by the system. Thus, the translation process should require only local reorderings. As the experiments have shown, Spanish-English and French-English are examples of such language pairs. For these pairs, the monotone search was found to be sufficient. The phrase-based approach clearly outperformed the single-word based systems. It showed even better performance than the alignment template system.</Paragraph>
    <Paragraph position="4"> The experiments on the German-English Verbmobil task outlined the limitations of the monotone search.</Paragraph>
    <Paragraph position="5"> As the low degree of monotonicity indicated, reordering plays an important role in this task. The rather free word order in German as well as the verb group seem to be difficult to translate. Nevertheless, when ignoring the word order and looking at the mPER only, the monotone search is competitive with the best performing system.</Paragraph>
    <Paragraph position="6"> For further improvements, we will investigate the usefulness of additional models, e.g. modeling the segmentation probability. Also, slightly relaxing the monotonicity constraint in a way that still allows an efficient search is of high interest. In the spirit of the IBM reordering constraints of the single-word based models, we could allow a phrase to be skipped and translated later.</Paragraph>
  </Section>
</Paper>