<?xml version="1.0" standalone="yes"?>
<Paper uid="N06-3009">
  <Title>A Hybrid Approach to Biomedical Named Entity Recognition and Semantic Role Labeling</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> In this paper, we describe our hybrid approach to two key NLP technologies: biomedical named entity recognition (Bio-NER) and biomedical semantic role labeling (Bio-SRL). In Bio-NER, our system integrates linguistic features into the conditional random field (CRF) framework. In addition, we employ web lexicons and template-based post-processing to further boost its performance. Through these broad linguistic features and the nature of CRF, our system outperforms state-of-the-art machine-learning-based systems, especially in the recognition of protein names (F=78.5%). In Bio-SRL, we first construct a proposition bank on top of the popular biomedical GENIA treebank, following the PropBank annotation scheme.</Paragraph>
    <Paragraph position="1"> We annotate only the predicate-argument structures (PASs) of thirty frequently used biomedical verbs (predicates) and their corresponding arguments. Second, we use our proposition bank to train a biomedical SRL system, which uses a maximum entropy (ME) machine-learning model. Third, we automatically generate argument-type templates, which can be used to improve classification of biomedical argument roles. Our experimental results show that an SRL system trained on newswire English, which achieves an F-score of 86.29% in the newswire domain, maintains an F-score of 64.64% when ported to the biomedical domain.</Paragraph>
    <Paragraph position="2"> By using our annotated biomedical corpus, we can increase that F-score by 22.9%.</Paragraph>
    <Paragraph position="3"> Adding automatically generated template features further increases the overall F-score by 0.47% and the adjunct (AM) F-score by 1.57%.</Paragraph>
  </Section>
</Paper>