<?xml version="1.0" standalone="yes"?>
<Paper uid="W95-0114">
  <Title>Compiling Bilingual Lexicon Entries From a Non-Parallel English-Chinese Corpus</Title>
  <Section position="10" start_page="119421" end_page="119421" type="concl">
    <SectionTitle>
11 Conclusion
</SectionTitle>
    <Paragraph position="0"> We have shown that statistical correlations exist between words and their translations even in a non-parallel corpus. Context heterogeneity is one such correlation feature. We have presented initial results of matching words with their translations in an English-Chinese non-parallel corpus using context heterogeneity measures. Context heterogeneity can be used both as a clustering measure and as a discrimination measure.</Paragraph>
    <Paragraph position="1"> Given two corresponding clusters of words from the corpus, context heterogeneity could be used to further divide and refine the clusters into a few candidate translation words for a given word. The results can be used to bootstrap or refine a bilingual lexicon compilation algorithm.</Paragraph>
  </Section>
</Paper>