<?xml version="1.0" standalone="yes"?> <Paper uid="A97-1051"> <Title>Mixed-Initiative Development of Language Processing Systems</Title> <Section position="9" start_page="354" end_page="354" type="concl"> <SectionTitle> 9. Conclusions </SectionTitle> <Paragraph position="0"> On the basis of observing our own and others' experiences in building and porting natural language systems for new domains, we have come to appreciate the pivotal role played by continuous evaluation throughout the system development cycle. But evaluation rests on an oracle, and for text processing, that oracle is the training and test corpora for a particular task. This has led us to develop a tailoring environment which focuses all of the available knowledge on accelerating the corpus development process. The very same learning procedure that is used to bootstrap the manual tagging process leads eventually to the derivation of tagging heuristics that can be applied in the operational setting to unseen documents. Rules derived manually, automatically, and through a combination of efforts have been applied successfully in a variety of languages, including English, Spanish, Portuguese, Japanese and Chinese.</Paragraph> <Paragraph position="1"> The tailoring environment, known as the Alembic Workbench, has been built and used within our organization, and we are making it available to other organizations involved in the development of language processing systems and/or annotated corpora. Initial experiments indicate a significant improvement in the rate at which annotated corpora can be generated using the Alembic Workbench methodology. Earlier work has shown that with the training data obtained in the course of only a couple of hours of text annotation, an information extraction system can be induced purely automatically that achieves a very competitive level of performance.</Paragraph> </Section> </Paper>