<?xml version="1.0" standalone="yes"?>
<Paper uid="W05-0802">
  <Title>Cross language Text Categorization by acquiring Multilingual Domain Models from Comparable Corpora</Title>
  <Section position="10" start_page="15" end_page="15" type="concl">
    <SectionTitle>
7 Conclusion
</SectionTitle>
    <Paragraph position="0"> In this paper we proposed a solution to cross-language Text Categorization (TC) based on acquiring Multilingual Domain Models from comparable corpora in a totally unsupervised way, without using any external knowledge source (e.g. bilingual dictionaries). These Multilingual Domain Models are exploited to define a generalized similarity function (i.e. a kernel function) among documents in different languages, which is used inside a Support Vector Machine classification framework. The similarity function exploits the presence of common words to induce a second-order similarity among the other words in the lexicons. The results show that this technique is sufficient to capture relevant aspects of topic similarity in cross-language TC tasks, obtaining substantial improvements over a simple baseline. As future work, we will investigate the performance of this approach on TC tasks involving more than two languages, and a possible generalization of the assumption of equality of the common words.</Paragraph>
  </Section>
</Paper>
