File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/03/w03-0802_abstr.xml

Size: 1,353 bytes

Last Modified: 2025-10-06 13:43:05

<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-0802">
  <Title>WHAT: An XSLT-based Infrastructure for the Integration of Natural Language Processing Components</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> The idea of the Whiteboard project is to integrate deep and shallow natural language processing components in order to benefit from their synergy.</Paragraph>
    <Paragraph position="1"> The project produced the first fully integrated hybrid system consisting of a fast HPSG parser that utilizes tokenization, PoS, morphology, lexical, named entity, phrase chunk and (for German) topological sentence field analyses from shallow components. This integration increases robustness, directs the search space and hence reduces the processing time of the deep parser. In this paper, we focus on one of the central integration facilities, the XSLT-based Whiteboard Annotation Transformer (WHAT), report on the benefits of XSLT-based NLP component integration, and present examples of XSL transformation of shallow and deep annotations used in the integrated architecture. The infrastructure is open, portable and well suited for, but not restricted to, the development of hybrid NLP architectures as well as NLP applications.</Paragraph>
  </Section>
</Paper>