<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-2716">
  <Title>Layering and Merging Linguistic Annotations</Title>
  <Section position="7" start_page="89" end_page="89" type="concl">
    <SectionTitle>
6 Conclusion
</SectionTitle>
    <Paragraph position="0"> The ANC has implemented an efficient pipeline for processing text into a corpus of machine-usable documents. For some document types this process is almost completely automated and can be regarded as a Corpus-Builder-in-a-Box: raw data goes in one end, and a fully annotated corpus with standoff annotations comes out the other.</Paragraph>
    <Paragraph position="1"> The use of standoff annotations allows for an accurate representation of the ANC data as provided by the contributors and allows us to easily provide several modular annotation sets that the end user can include or exclude as desired. By providing a SAX-like parser for ANC documents, we are able to leverage a number of available XML tools without the restrictions imposed by an XML representation of the documents. For users who are not interested in XML or standoff annotations, the plain text version is preserved.</Paragraph>
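The idea of modular standoff annotation sets can be illustrated with a minimal sketch. It assumes each annotation is a character-offset span over an untouched plain text, grouped into named layers that a user selects at load time. The names Annotation, merge_layers, and covered_text, as well as the layer names and sample data, are hypothetical and are not the ANC's actual format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One standoff annotation: a labelled span of character offsets."""
    start: int   # offset into the plain text (inclusive)
    end: int     # offset into the plain text (exclusive)
    label: str   # e.g. a part-of-speech tag or a structural label
    layer: str   # name of the annotation set this span belongs to


def merge_layers(layers, include):
    """Collect the annotations of the selected layers, ordered by position.

    Spans that start at the same offset are ordered longest first, so an
    enclosing span (a sentence) precedes the spans nested inside it.
    """
    selected = [a for name in include for a in layers.get(name, [])]
    return sorted(selected, key=lambda a: (a.start, -a.end, a.layer))


def covered_text(text, ann):
    """Return the stretch of plain text that an annotation points at."""
    return text[ann.start:ann.end]


# Hypothetical sample data: the plain text is never modified.
text = "The ANC builds corpora."
layers = {
    "token": [Annotation(0, 3, "tok", "token"), Annotation(4, 7, "tok", "token")],
    "pos": [Annotation(0, 3, "DT", "pos"), Annotation(4, 7, "NNP", "pos")],
    "sentence": [Annotation(0, 23, "s", "sentence")],
}

# Include the sentence and part-of-speech layers, exclude the token layer.
merged = merge_layers(layers, ["sentence", "pos"])
for ann in merged:
    print(ann.layer, ann.label, repr(covered_text(text, ann)))
```

Because the layers only point into the text, leaving a layer out of the include list changes nothing in the primary data, which is the property the paragraph above relies on.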
  </Section>
</Paper>