File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/03/w03-1610_concl.xml

Size: 3,059 bytes

Last Modified: 2025-10-06 13:53:45

<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-1610">
  <Title>Optimizing Synonym Extraction Using Monolingual and Bilingual Resources</Title>
  <Section position="6" start_page="213" end_page="213" type="concl">
    <SectionTitle>
5 Discussions
</SectionTitle>
    <Paragraph position="0"> This paper uses three different methods and resources for synonym extraction. By using the corpus-based method, we can get some synonyms or near-synonyms that cannot be found in hand-built thesauri, for example &quot;handspring handstand&quot;, &quot;audiology otology&quot;, &quot;roisterer carouser&quot; and &quot;parmesan gouda&quot;. Such synonyms are difficult for hand-built thesauri to cover because they occur too infrequently to be noticed by human compilers. In addition, the corpus-based method can find synonyms in specific domains, whereas general thesauri do not provide such fine-grained knowledge.</Paragraph>
    <Paragraph position="1"> Comparing the results with human-built thesauri is not the best way to evaluate synonym extraction, because the coverage of a human-built thesaurus is also limited. However, manually evaluating the results is time-consuming, and it cannot yield a precise evaluation of the extracted synonyms either. Although human-built thesauri cannot be used to evaluate the results precisely, they can still be used to gauge the effectiveness of extraction methods.</Paragraph>
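    The thesaurus-based evaluation described above can be sketched as a precision and recall computation. This is a minimal illustration, not the evaluation script used in the paper: the function name precision_recall and the toy word sets are hypothetical, and the thesaurus entry is assumed to be a plain set of words.

```python
def precision_recall(extracted, gold):
    """Precision and recall of extracted synonyms against a thesaurus entry."""
    extracted, gold = set(extracted), set(gold)
    hits = len(extracted.intersection(gold))
    # Guard against empty sets so the ratios are always defined.
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data, not taken from the paper.
extracted = {"carouser", "reveller", "partygoer", "drinker"}
gold = {"carouser", "reveller"}  # what the hand-built thesaurus lists
precision, recall = precision_recall(extracted, gold)
print(precision, recall)
```

    In this toy case any extracted word missing from the thesaurus counts against precision, even if a human would accept it as a near-synonym, which is the coverage limitation the paragraph points out.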
    <Paragraph position="2"> Conclusion: This paper proposes a new method to extract synonyms from three resources: a monolingual dictionary, a bilingual corpus, and a large monolingual corpus. The method uses a weighted ensemble to combine the results of the individual extractors, each of which uses one of the three resources. Experimental results show that the three resources are complementary for synonym extraction, and that the ensemble method is effective at improving both precision and recall when the results are compared with the manually built thesauri WordNet and Roget.</Paragraph>
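    One plausible reading of the weighted ensemble is a linear combination of the scores produced by the individual extractors. The sketch below assumes that form. The extractor names, the weights, and the scores are hypothetical values chosen for illustration, and this section does not state how the paper sets its weights.

```python
from collections import defaultdict


def weighted_ensemble(extractor_scores, weights):
    """Combine per-extractor candidate scores by a weighted sum.

    extractor_scores maps an extractor name to {candidate: score};
    weights maps the same extractor names to non-negative weights.
    A candidate missing from an extractor contributes zero for it.
    """
    combined = defaultdict(float)
    for name, scores in extractor_scores.items():
        for candidate, score in scores.items():
            combined[candidate] += weights[name] * score
    # Highest combined score first.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores for candidate synonyms of "roisterer".
extractor_scores = {
    "dictionary": {"carouser": 0.6},
    "bilingual": {"carouser": 0.8, "reveller": 0.5},
    "monolingual": {"reveller": 0.4, "drinker": 0.2},
}
weights = {"dictionary": 0.3, "bilingual": 0.5, "monolingual": 0.2}

ranked = weighted_ensemble(extractor_scores, weights)
for candidate, score in ranked:
    print(candidate, round(score, 2))
```

    A candidate proposed by several extractors accumulates score from each of them, which is one way the complementary resources can reinforce one another.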
    <Paragraph position="3"> We also propose a new method to extract synonyms from a bilingual corpus. This method uses the translations of a word to represent its meaning, with the translation probabilities trained on the bilingual corpus. The advantage of this method is that it can improve the coverage of the extracted synonyms. Experiments indicate that this method outperforms the methods that use a monolingual corpus or a monolingual dictionary.</Paragraph>
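    The idea of representing a word by its translations can be illustrated by treating each word as a sparse vector of translation probabilities and comparing two words by the cosine of their vectors. This is a sketch under assumptions: cosine similarity is used here as one common choice, the tokens t1, t2 and t3 stand in for foreign-language translations, and the probabilities are invented for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(p * v[t] for t, p in u.items() if t in v)
    norm_u = math.sqrt(sum(p * p for p in u.values()))
    norm_v = math.sqrt(sum(p * p for p in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical translation probabilities p(translation | word).
translation_probs = {
    "handspring": {"t1": 0.7, "t2": 0.3},
    "handstand": {"t1": 0.6, "t2": 0.4},
    "parmesan": {"t3": 1.0},
}

similar = cosine(translation_probs["handspring"], translation_probs["handstand"])
unrelated = cosine(translation_probs["handspring"], translation_probs["parmesan"])
print(similar, unrelated)
```

    Words that share most of their translations score close to one, while words with no translation in common score zero.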
    <Paragraph position="4"> The contribution of this work lies in three aspects: (1) developing a method to combine the results of individual extractors that use the three resources for synonym extraction; (2) investigating the performance of the three extraction methods using different resources, exposing the merits and demerits of each method; and (3) proposing a new method to extract synonyms from a bilingual corpus, which greatly improves the coverage of the extracted synonyms.</Paragraph>
  </Section>
</Paper>