<?xml version="1.0" standalone="yes"?> <Paper uid="E95-1010"> <Title>Text Alignment in the Real World: Improving Alignments of Noisy Translations Using Common Lexical Features, String Matching Strategies and N-Gram Comparisons</Title> <Section position="9" start_page="73" end_page="73" type="concl"> <SectionTitle> 8 Conclusions </SectionTitle> <Paragraph position="0"> In the real world, poor-quality translations are common because of individual translators' preferences, the lack of formal formatting guidelines for translations, and outright mistakes. Our method combines four feature scores into a simple measure of the probability that two textual segments align. The algorithm is fairly general: all of the feature scores used apply, to a greater or lesser extent, to a wide range of Spanish and English translations, and to a degree to other European languages as well. It is also likely that the methods we used can improve alignments between many non-European languages by exploiting the English phrases and Arabic numerals that are increasingly common in professional and public communications throughout the world.</Paragraph> <Paragraph position="1"> Our alignment algorithm presents a new formulation of Bayesian methods combined with a direct approach to data fusion for multiple sources of information. This approach should work well with a wide range of data sources, including direct comparisons of co-occurrence probabilities for specific classes of lexical elements.</Paragraph> </Section></Paper>