<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-3234">
  <Title>Trained Named Entity Recognition Using Distributional Clusters</Title>
  <Section position="6" start_page="0" end_page="0" type="concl">
    <SectionTitle>5 Conclusion</SectionTitle>
    <Paragraph position="0">There are several ways in which this work might be extended and improved, both in its particular form and in general: - BWI models initial and terminal boundaries, but ignores characteristics of the extracted phrase other than its length. We are exploring mechanisms for modeling relevant phrasal structure.</Paragraph>
    <Paragraph position="1">- While global statistical approaches, such as sequential averaged perceptrons or CRFs (McCallum and Li, 2003), appear better suited to the NER problem than local symbolic learners, the two approaches search different hypothesis spaces. We surmise that combining them can yield improvements over either in isolation, and we are exploring mechanisms for their integration.</Paragraph>
    <Paragraph position="2">- The distributional clusters we find are independent of the problem to which we want to apply them and may sometimes be inappropriate or have the wrong granularity. We are exploring ways to produce groupings that are sensitive to the task at hand.</Paragraph>
    <Paragraph position="3">Our results clearly establish that an unsupervised distributional analysis of a text corpus can produce features that lead to enhanced precision and, especially, recall in information extraction. We have successfully used these features in lieu of domain-specific, labor-intensive resources, such as syntactic analysis and special-purpose gazetteers. Distributional analysis, combined with light supervision, is an effective, stable alternative to bootstrapping methods.</Paragraph>
  </Section>
</Paper>
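
The sketch below is purely illustrative and is not code from the paper: it assumes a tiny hypothetical unlabeled corpus, uses k-means over simple left/right co-occurrence profiles as a stand-in for whatever distributional clustering the authors actually employed, and emits per-token cluster-ID features of the kind a supervised NER learner could consume in place of gazetteer lookups. All names, parameters, and data are invented for the example.

# Illustrative sketch only; it does NOT reproduce the paper's clustering or
# NER model. It shows the general idea of deriving word-cluster features from
# unlabeled text and attaching them to tokens for a downstream NER learner.
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical unlabeled corpus (stand-in for a large raw text collection).
unlabeled_corpus = [
    "acme corp acquired beta systems yesterday",
    "john smith joined acme corp in london",
    "beta systems opened an office in london",
    "mary jones met john smith in paris",
]

# 1. Distributional profiles: count immediate left/right neighbours per word.
vocab = sorted({w for sent in unlabeled_corpus for w in sent.split()})
index = {w: i for i, w in enumerate(vocab)}
profiles = np.zeros((len(vocab), 2 * len(vocab)))
for sent in unlabeled_corpus:
    words = sent.split()
    for i, w in enumerate(words):
        if i > 0:                                   # left-context count
            profiles[index[w], index[words[i - 1]]] += 1
        if i < len(words) - 1:                      # right-context count
            profiles[index[w], len(vocab) + index[words[i + 1]]] += 1

# 2. Cluster words by their context profiles (k-means here is only a
#    placeholder for the distributional clustering used in the paper).
n_clusters = 4
labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(profiles)
cluster_of = dict(zip(vocab, labels))

# 3. Per-token features for a supervised NER learner: the cluster ID plays
#    the role a gazetteer-membership flag would otherwise play.
def token_features(words, i):
    w = words[i]
    return {
        "word": w,
        "cluster": cluster_of.get(w, -1),           # -1 for unseen words
        "prev_cluster": cluster_of.get(words[i - 1], -1) if i > 0 else -1,
    }

sentence = "john smith visited beta systems".split()
for i in range(len(sentence)):
    print(token_features(sentence, i))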