File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/03/w03-0432_abstr.xml
Size: 827 bytes
Last Modified: 2025-10-06 13:43:02
<?xml version="1.0" standalone="yes"?> <Paper uid="W03-0432"> <Title>Named Entity Recognition Using a Character-based Probabilistic Approach</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> We present a named entity recognition and classification system that uses only probabilistic character-level features. Classifications by multiple orthographic tries are combined in a hidden Markov model framework to incorporate both internal and contextual evidence. As part of the system, we perform a preprocessing stage in which capitalisation is restored to sentence-initial and all-caps words with high accuracy. We report f-values of 86.65 and 79.78 for English, and 50.62 and 54.43 for the German datasets.</Paragraph> </Section> </Paper>