File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/04/n04-4026_abstr.xml
Size: 992 bytes
Last Modified: 2025-10-06 13:43:31
<?xml version="1.0" standalone="yes"?> <Paper uid="N04-4026"> <Title>A Unigram Orientation Model for Statistical Machine Translation</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> In this paper, we present a unigram segmentation model for statistical machine translation where the segmentation units are blocks: pairs of phrases without internal structure. The segmentation model uses a novel orientation component to handle swapping of neighbor blocks. During training, we collect block unigram counts with orientation: we count how often a block occurs to the left or to the right of some predecessor block. The orientation model is shown to improve translation performance over two models: 1) no block re-ordering is used, and 2) the block swapping is controlled only by a language model. We show experimental results on a standard Arabic-English translation task.</Paragraph> </Section> class="xml-element"></Paper>