File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/04/n04-2002_abstr.xml
Size: 949 bytes
Last Modified: 2025-10-06 13:43:32
<?xml version="1.0" standalone="yes"?> <Paper uid="N04-2002"> <Title>Identifying Chemical Names in Biomedical Text: An Investigation of the Substring Co-occurrence Based Approaches</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> We investigate various strategies for finding chemicals in biomedical text using substring co-occurrence information. The goal is to build a system from readily available data with minimal human involvement. Our models are trained from a dictionary of chemical names and general biomedical text.</Paragraph> <Paragraph position="1"> The strategies we investigate include Naive Bayes classifiers and several types of N-gram models. We introduce a new way of interpolating N-grams that does not require tuning any parameters. We also find the task to be similar to Language Identification.</Paragraph> </Section> </Paper>