<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-1142">
  <Title>Learning Transliteration Lexicons from the Web</Title>
  <Section position="8" start_page="1135" end_page="1135" type="concl">
    <SectionTitle>
6 Conclusions
</SectionTitle>
    <Paragraph position="0"> We have proposed a framework for harvesting E-C transliteration lexicons from the Web using bilingual snippets. Within this framework, we formulate the PSM learning and E-C pair evaluation methods. We have studied three strategies for PSM learning aimed at reducing human supervision.</Paragraph>
    <Paragraph position="1"> The experiments show that unsupervised learning is an effective way to adapt the PSM rapidly, while active learning is the most effective in achieving high performance. We find that the Web is a rich live corpus for learning real-life E-C transliteration lexicons, especially for casual transliterations. In this paper, we use two Web databases, SET1 and SET2, for simplicity. The proposed framework can be easily extended to an incremental learning framework for live databases. This paper has focused solely on the use of phonetic clues for lexicon and PSM learning. We have good reason to expect that combining semantic and phonetic clues will improve performance further.</Paragraph>
  </Section>
</Paper>