<?xml version="1.0" standalone="yes"?> <Paper uid="P03-1010"> <Title>Reliable Measures for Aligning Japanese-English News Articles and Sentences</Title> <Section position="10" start_page="0" end_page="0" type="concl"> <SectionTitle> 8 Conclusion </SectionTitle> <Paragraph position="0"> We have proposed two measures for extracting valid article and sentence alignments. The measure for article alignment uses the similarities of sentences aligned by DP matching, and the measure for sentence alignment uses the similarities of articles aligned by CLIR. The two measures enhance each other and allow valid article and sentence alignments to be reliably extracted from an extremely noisy Japanese-English parallel corpus.</Paragraph> <Paragraph position="1"> We are distributing the alignment data discussed in this paper so that it can be used for research and educational purposes. It has attracted the attention of people both inside and outside the NLP community.</Paragraph> <Paragraph position="2"> We have applied our measures to a Japanese-English bilingual corpus, but the measures themselves are language independent. It is therefore reasonable to expect that they can be applied to any language pair and still perform well, particularly since their effectiveness has been demonstrated on a language pair as disparate as Japanese and English.</Paragraph> </Section> </Paper>