<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-3310">
  <Title>Exploring Text and Image Features to Classify Images in Bioscience Literature</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> A picture is worth a thousand words.</Paragraph>
    <Paragraph position="1"> Biomedical researchers tend to incorporate a significant number of images (i.e., figures or tables) in their publications to report experimental results, to present research models, and to display examples of biomedical objects. Unfortunately, this wealth of information remains virtually inaccessible without automatic systems to organize these images. We explored supervised machine-learning systems using Support Vector Machines to automatically classify images into six representative categories based on text features, image features, and the fusion of both. Our experiments show a significant improvement in the average F-score of the fusion classifier (73.66%) over classifiers based only on image features (50.74%) or only on text features (68.54%).</Paragraph>
  </Section>
</Paper>