<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-0433">
  <Title>Kowloon</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> We describe multiple stacking and voting methods that effectively combine strong classifiers, such as boosting, support vector machines (SVM), and transformation-based learning (TBL), for the named-entity recognition (NER) task. NER has emerged as an important step for many natural language applications, including machine translation, information retrieval, and information extraction. Much of the research in this field was pioneered in the Message Understanding Conference (MUC) (Sundheim, 1995), which performed detailed entity extraction and identification on English documents. As a result, most current NER systems with impressive performance have been specially constructed and tuned for English MUC-style documents. It is unclear how well they would perform when applied to another language.</Paragraph>
    <Paragraph position="1"> Our system was designed for the CoNLL-2003 shared task, the goal of which is to identify and classify four types of named entities: PERSON, LOCATION, ORGANIZATION and MISCELLANEOUS. The task specifications were that two languages would be involved. *The author would like to thank the Hong Kong Research Grants Council (RGC) for supporting this research in part through two research grants (RGC 6083/99E and RGC 6256/00E).</Paragraph>
    <Paragraph position="2"> We were given about a month to develop our system on the first language, which was English, but only two weeks to adapt it to the surprise language, which was German.</Paragraph>
    <Paragraph position="3"> Given the goal of the shared task, we designed our system to achieve high performance without relying too heavily on knowledge that is specific to a particular language or domain. In the spirit of language independence, we avoided using features and information that would not be easily obtainable for almost any major language.</Paragraph>
  </Section>
</Paper>