<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1164">
  <Title>Automated Alignment and Extraction of Bilingual Ontology for Cross-Language Domain-Specific Applications</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Evaluation
</SectionTitle>
    <Paragraph position="0"> For evaluation, a medical domain ontology is constructed. A medical web mining system is also implemented to evaluate the practicability of the bilingual ontology.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Conceptual Evaluation of Ontology
</SectionTitle>
      <Paragraph position="0"> Benchmark ontologies are created as test suites of reusable data that ontology engineers or constructors can employ for benchmarking purposes. The benchmark ontology was constructed by domain experts, including two doctors and one pharmacologist, based on UMLS. The domain experts integrated the Chinese concepts without changing the contents of</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
UMLS
</SectionTitle>
    <Paragraph position="0"> Ontology construction is evaluated with two-layer measures: the lexical layer and the conceptual layer (Eichmann et al. 1998). Evaluation in the conceptual layer seems to be more important than that in the lexical layer when the ontology is constructed by aligning or merging several well-defined source ontologies. Conceptual relations are evaluated in two categories: taxonomic and non-taxonomic.</Paragraph>
    <Paragraph position="1"> 3.1.1 Evaluation of the taxonomic relation
Step 1 Linearization: This step decomposes the tree structure into vertex lists as described in Section
Step 2 Normalization: Since the frequencies of concepts in the vertex list set are not equal, normalization factors are introduced to address this problem. For the target ontology, the factor vector for normalization is</Paragraph>
    <Paragraph position="3"> and for the benchmark ontology is</Paragraph>
    <Paragraph position="5"> nf is the normalization factor for the i-th concept of the ontology O. It is defined as the reciprocal of that concept's frequency in the vertex list set.</Paragraph>
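    <Paragraph>
The linearization and normalization steps can be sketched in Python. This is a minimal illustration, not the authors' implementation: it assumes that a vertex list is a root-to-leaf path of concept names and that a concept's frequency is the number of vertex lists containing it. The names `linearize` and `normalization_factors` and the toy tree are hypothetical.

```python
from collections import Counter


def linearize(tree, root):
    """Decompose a tree into root-to-leaf vertex lists (assumed definition)."""
    children = tree.get(root, [])
    if not children:
        return [[root]]
    paths = []
    for child in children:
        for tail in linearize(tree, child):
            paths.append([root] + tail)
    return paths


def normalization_factors(vertex_lists):
    """Return nf for each concept: the reciprocal of its frequency
    in the vertex list set (counted once per list, an assumption)."""
    freq = Counter()
    for path in vertex_lists:
        freq.update(set(path))
    return {concept: 1.0 / count for concept, count in freq.items()}


# Hypothetical toy taxonomy: parent -> list of children.
toy_tree = {
    "disease": ["infection", "tumor"],
    "infection": ["viral infection", "bacterial infection"],
}
vertex_lists = linearize(toy_tree, "disease")
nf = normalization_factors(vertex_lists)
```

In this toy case the root concept appears in every vertex list and therefore receives the smallest factor, while concepts that occur in only one list keep a factor of 1.
    </Paragraph>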
    <Paragraph position="7"> Step 3 Estimation of the vertex list similarity: The pairwise similarity of two vertex lists, one from the target ontology and one from the benchmark ontology, is obtained using the Needleman/Wunsch technique in the following steps.
Initialization: Create a matrix with m+1 columns and n+1 rows, where m and n are the numbers of concepts in the vertex lists of the target ontology and the benchmark ontology, respectively. The first row and first column of the matrix are initially set to 0. That is,
$Sim(m, n) = 0 \ \text{if}\ m = 0 \ \text{or}\ n = 0$ (6)
Matrix filling: Assign values to the remaining elements of the matrix according to the following equation:
Several synonyms belonging to the same concept may be represented in one vertex, so the lexicon similarity can be described as
Step 4 Pairwise similarity matrix estimation: The pairwise similarity matrix is obtained by repeating Step 3 p × q times, where p and q are the numbers of vertex lists of the target ontology and the benchmark ontology, respectively. Each element of the pairwise similarity matrix in Equation (10) is obtained using Equation
Some relations defined in the ontology, such as synonymy, are non-taxonomic sets. For these, the lexicon similarity is applied to measure the conceptual similarity. The lexicon similarity of sets can be defined by the following equation:
$Sim(\ldots, \ldots) = \frac{\text{Words defined in the } \ldots \text{ and } \ldots}{\text{Words defined in the } \ldots \text{ or } \ldots}$
Using the benchmark ontology and the evaluation metrics described in the previous sections, the evaluation results are shown in Table 1.</Paragraph>
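    <Paragraph>
Step 3, Step 4, and the set-based lexicon similarity can be illustrated with a short Python sketch. Because the matrix-filling equation is not reproduced in this text, the recurrence below is the textbook Needleman-Wunsch one with a zero gap penalty, chosen to agree with the zero-initialized first row and column of Equation (6); it is an assumption, not the authors' exact formula. The lexicon similarity is assumed to be the number of shared words divided by the number of all words in two synonym sets. The function names and the toy vertex lists are hypothetical.

```python
def lexicon_similarity(words_a, words_b):
    """Shared words divided by all words of two synonym sets (assumed form)."""
    a, b = set(words_a), set(words_b)
    union = a.union(b)
    if not union:
        return 0.0
    return len(a.intersection(b)) / len(union)


def vertex_list_similarity(target, benchmark):
    """Alignment score of two vertex lists.

    Each vertex is a set of synonyms. The first row and column are 0,
    and gaps cost nothing (both assumptions of this sketch).
    """
    m, n = len(target), len(benchmark)
    # Matrix with n+1 rows and m+1 columns, all cells initialized to 0.
    sim = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = lexicon_similarity(target[j - 1], benchmark[i - 1])
            sim[i][j] = max(
                sim[i - 1][j - 1] + match,  # align the two vertices
                sim[i - 1][j],              # skip a benchmark vertex
                sim[i][j - 1],              # skip a target vertex
            )
    return sim[n][m]


def pairwise_similarity_matrix(target_lists, benchmark_lists):
    """Run the alignment p x q times, once per pair of vertex lists."""
    return [[vertex_list_similarity(t, b) for b in benchmark_lists]
            for t in target_lists]


# Hypothetical vertex lists; every vertex holds the synonyms of one concept.
target_lists = [
    [{"disease", "illness"}, {"infection"}, {"flu", "influenza"}],
]
benchmark_lists = [
    [{"disease"}, {"infection"}, {"influenza"}],
    [{"disease"}, {"tumor"}],
]
matrix = pairwise_similarity_matrix(target_lists, benchmark_lists)
```

The scores produced here are raw alignment scores. The normalization factors from Step 2 would still have to be applied, and how the paper combines them is not shown in this sketch.
    </Paragraph>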
    <Paragraph position="8"> Table 1. The similarity measure between the target ontology and the benchmark ontology
Taxonomic relation similarity: 0.57
Non-taxonomic relation similarity: 0.68
According to the experimental results, some phenomena are observed. First, the number of words mapped to the same concept in the upper layer of the ontology is larger than that in the lower layer, because terminologies usually appear in the lower layer.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Evaluation of domain application
</SectionTitle>
      <Paragraph position="0"> To assess the performance of the ontology, a medical web-mining system that searches for the desired page was implemented. The web pages were collected from several websites: 2322 pages in the medical domain and 8133 pages in a contrastive domain. Queries for training and evaluating the system were also collected. Forty users, who did not take part in the system development, were asked to provide a set of queries given the collected web pages. In postprocessing, duplicate queries and queries outside the medical domain were removed. Finally, 3207 natural-language test queries were obtained.</Paragraph>
      <Paragraph position="1"> The baseline system is based on the Vector-Space Model (VSM) with synonym expansion. The ontology-based system is obtained by integrating the conceptual relations and axioms defined in the medical ontology into the baseline. The results are shown in Table 2. They show that the ontology-based system outperforms the baseline system with synonym expansion, especially in recall rate.</Paragraph>
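      <Paragraph>
The baseline can be pictured with a small Python sketch of vector-space retrieval with synonym expansion. It assumes raw term-frequency vectors, cosine similarity, and whitespace tokenization, none of which are specified in this section; the synonym table, the documents, and the function names are hypothetical.

```python
import math
from collections import Counter


def expand(tokens, synonyms):
    """Append the known synonyms of each query token."""
    expanded = list(tokens)
    for token in tokens:
        expanded.extend(synonyms.get(token, []))
    return expanded


def cosine(vec_a, vec_b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(weight * vec_b.get(term, 0) for term, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, documents, synonyms):
    """Return document ids sorted by decreasing similarity to the query."""
    query_vec = Counter(expand(query.lower().split(), synonyms))
    scores = {
        doc_id: cosine(query_vec, Counter(text.lower().split()))
        for doc_id, text in documents.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical synonym table and page collection.
synonyms = {"flu": ["influenza"]}
documents = {
    "page1": "influenza symptoms and treatment",
    "page2": "stock market news",
}
ranking = rank("flu treatment", documents, synonyms)
```

An ontology-based variant would additionally expand or re-weight the query through the conceptual relations and axioms of the ontology. That step is not sketched here because its details are not given in this section.
      </Paragraph>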
    </Section>
  </Section>
</Paper>