File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/98/w98-1118_abstr.xml
Size: 1,187 bytes
Last Modified: 2025-10-06 13:49:34
<?xml version="1.0" standalone="yes"?> <Paper uid="W98-1118"> <Title>Exploiting Diverse Knowledge Sources via Maximum Entropy in Named Entity Recognition</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> This paper describes a novel statistical named-entity (i.e. &quot;proper name&quot;) recognition system built around a maximum entropy framework. By working within the framework of maximum entropy theory and utilizing a flexible object-based architecture, the system is able to make use of an extraordinarily diverse range of knowledge sources in making its tagging decisions. These knowledge sources include capitalization features, lexical features, features indicating the current section of text (i.e. headline or main body), and dictionaries of single or multi-word terms. The purely statistical system contains no hand-generated patterns and achieves a result comparable with the best statistical systems. However, when combined with other hand-coded systems, the system achieves scores that exceed the highest comparable scores published thus far.</Paragraph> </Section> </Paper>