<?xml version="1.0" standalone="yes"?> <Paper uid="P98-2220"> <Title>Automatic English-Chinese name transliteration for development of multilingual resources</Title> <Section position="7" start_page="1355" end_page="1355" type="concl"> <SectionTitle> 6 Conclusions and Future Extensions </SectionTitle> <Paragraph position="0"> The algorithm we have outlined is being implemented as a tool for creating Chinese lexical resources within a project that generates multilingual text from an English-language source database. We focused on the requirements of the domain of English place names. The algorithm is currently being extended to include personal name transliteration as well, which requires a different set of characters. A personal name transliteration standard has been developed and is in use in China (Chanzhong Wu, p.c.). By mapping the Pinyin transliterations arrived at by our algorithm to this different set of characters, we can extend the domain to include personal names.</Paragraph> <Paragraph position="1"> In its present form, the algorithm will not always generate transliterations matching those a human transliterator might produce, because human transliterations are influenced by historical factors and individual differences. However, the aim of the algorithm is to produce a transliteration understandable by readers of a Chinese text. While the algorithm mimics the intuitive superimposition of phonemic and phonotactic systems, its ultimate goals are generality and reliability. Indeed, the result from the example above corresponds to a standard transliteration. Thus the algorithm produces recognisable results. How readily a speaker recognises a transliteration depends in part on the length of the original name. Longer names with many syllables are less recognisable than shorter names. 
The phonemic conversion rules introduced here are merely the most common ones, and further work will strengthen the generality of the tool.</Paragraph> <Paragraph position="2"> Further research will include a more formal analysis of the correspondences between English and Chinese phonemes. Furthermore, the algorithm is far from robust due to its current limited focus, and errors made in earlier stages are propagated and possibly magnified as the algorithm continues. Since place names and people's names originate from many cultures, this algorithm will not produce desirable results unless the written form exhibits some assimilation to English spelling. We are currently investigating the application of lazy learning techniques (as described by van den Bosch 1997) to learning word-phoneme correspondences for English names from a corpus of names. Such a module could eventually replace our simplistic rule-based procedure, and could feed into the phoneme-Pinyin mapping module, ultimately resulting in greater accuracy.</Paragraph> <Paragraph position="3"> Such an algorithm has many potential applications. Currently, finding a less common country, city, or county name is an arduous procedure. Because transliteration uses no semantic content, it is an obvious task for automation. This algorithm could also be applied to character entry in a Chinese word processor or to indexing Chinese electronic atlases.</Paragraph> <Paragraph position="4"> When attached to a robust grapheme-to-phoneme module, the transliteration into Chinese characters is ultimately a mapping to Chinese-specific IPA phonetics, raising the possibility of speech synthesis of English names in Chinese, given that Pinyin is a phonemically normalized orthography.</Paragraph> </Section> </Paper>