<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1017">
  <Title>Splitting Input Sentence for Machine Translation Using Language Model with Sentence Similarity</Title>
  <Section position="6" start_page="3" end_page="3" type="concl">
    <SectionTitle>
5 Concluding Remarks
</SectionTitle>
    <Paragraph position="0"> Splitting an input sentence appears to be a promising way to improve the translation quality of corpus-based MT systems for speech translation. Many previous methods used N-gram clues to split sentences. To supplement N-gram-based splitting methods, we introduced another clue: sentence similarity based on edit distance. Our method generates sentence-splitting candidates based on N-grams and selects the best one according to the sentence-similarity measure. The experimental results show that the method is valuable for two kinds of EBMT systems, one that uses a phrase and one that uses a sentence as the translation unit.</Paragraph>
    <Paragraph position="1"> Although we used English-to-Japanese translation in the experiments, the method does not depend on any particular language and can be applied to multilingual translation. Because the semantic distance used in the similarity definition did not show any significant effect, we need to find another factor to enhance the similarity measure. Furthermore, as future work, we would like to combine the splitting method with sentence simplification methods such as (Siddharthan, 2002) in order to improve translation quality further.</Paragraph>
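    <Paragraph position="2"> The idea of selecting a splitting candidate by sentence similarity can be illustrated with a minimal sketch. It is not the implementation used in the experiments, and every name in it is hypothetical. It rests on several simplifying assumptions: candidates are produced by enumerating every single split point instead of by an N-gram language model, similarity is taken as one minus the word-level edit distance divided by the longer length, and a candidate is scored by the length-weighted similarity of each part to its closest example sentence. The example corpus is assumed to be non-empty.

```python
def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, start=1):
        curr = [i]
        for j, wb in enumerate(b, start=1):
            cost = 0 if wa == wb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Assumed similarity: 1 - distance / longer length, in [0, 1]."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def split_candidates(words):
    """Stand-in for N-gram-based generation: no split, then each single split."""
    yield [words]
    for k in range(1, len(words)):
        yield [words[:k], words[k:]]


def score(parts, corpus):
    """Length-weighted similarity of each part to its closest example sentence."""
    total = sum(len(p) for p in parts)
    if total == 0:
        return 0.0
    weighted = sum(len(p) * max(similarity(p, ex) for ex in corpus)
                   for p in parts)
    return weighted / total


def best_split(words, corpus):
    """Return the candidate whose parts best match the example corpus."""
    return max(split_candidates(words), key=lambda parts: score(parts, corpus))


if __name__ == "__main__":
    examples = [["thank", "you"], ["where", "is", "the", "station"]]
    print(best_split("thank you where is the station".split(), examples))
    # prints [['thank', 'you'], ['where', 'is', 'the', 'station']]
```

    In this toy run the split after "you" wins because both resulting parts match an example sentence exactly, whereas the unsplit input matches none of them well. A fuller version would restrict the candidates to those favored by an N-gram model and would allow more than one split point.</Paragraph>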
  </Section>
</Paper>