File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/02/w02-0303_abstr.xml

Size: 1,003 bytes

Last Modified: 2025-10-06 13:42:31

<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-0303">
  <Title>Contrast And Variability In Gene Names</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> We studied contrast and variability in a corpus of gene names to identify potential heuristics for use in performing entity identification in the molecular biology domain. Based on our findings, we developed heuristics for mapping weakly matching gene names to their official gene names.</Paragraph>
    <Paragraph position="1"> We then tested these heuristics against a large body of Medline abstracts and found that they can increase recall, with varying levels of precision. Our findings also underscored the importance of good information retrieval and of the ability to disambiguate among genes, proteins, RNA, and a variety of other referents for performing entity identification with high precision.</Paragraph>
  </Section>
</Paper>