<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-3304">
  <Title>Integrating Ontological Knowledge and Textual Evidence in Estimating Gene and Gene Product Similarity</Title>
  <Section position="6" start_page="29" end_page="30" type="concl">
    <SectionTitle>
5 Evaluation
</SectionTitle>
    <Paragraph position="0"> Table 4 summarizes the results for both strategies, comparing the Spearman rank correlations between BBS and the fusion and regression models with those between BBS and XOA alone. Note that the latter correlations are lower than those reported in Table 2 because of the small size of our sample (1% of the original data set, as pointed out above). P-values associated with the changes in the correlation values are reported in parentheses.</Paragraph>
    <Paragraph position="1"> Table 4: Spearman rank correlations between BLAST bit score (BBS) and XOA, BBS and the fusion model, and BBS and the regression model. P-values for the differences between the augmented models and XOA alone are given in parentheses.</Paragraph>
    <Paragraph position="2"> An important finding from Table 4 is that integrating text-based evidence into the semantic similarity measures systematically improves the relationship between BLAST and XOA. Not surprisingly, the fusion models yield smaller improvements. However, these improvements, on the order of 3% for the Resnik and Lin variants, are very encouraging, even though they are not statistically significant. The regression models, on the other hand, provide larger and statistically significant improvements, reinforcing our hypothesis that textual evidence complements the GO-based similarity measures. We expect that a more sophisticated NLP treatment of textual evidence will yield significant improvements even for the more interpretable fusion models.</Paragraph>
    <Section position="1" start_page="29" end_page="30" type="sub_section">
      <SectionTitle>
Conclusions and Further Work
</SectionTitle>
      <Paragraph position="0"> Our early results show that literature evidence provides a significant contribution, even when using very simple Information Extraction and integration methods such as those described in this paper. Employing more sophisticated Information Extraction tools and integration techniques is therefore likely to bring higher gains.</Paragraph>
      <Paragraph position="1"> Further work using GoPubMed involves factoring in the accuracy percentage that relates extracted terms to their induced GO categories, and capturing complex phrases (e.g. signal transduction, fat protein). We also intend to compare the advantages provided by the GoPubMed term extraction process with those of Information Extraction tools created for the biomedical domain, such as Medstract (Pustejovsky et al. 2002), and to develop a methodology for integrating a variety of Information Extraction processes into XOA.</Paragraph>
    </Section>
  </Section>
</Paper>