File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/02/w02-1015_abstr.xml

Size: 731 bytes

Last Modified: 2025-10-06 13:42:38

<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-1015">
  <Title>Handling noisy training and testing data</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> In the field of empirical natural language processing, researchers constantly deal with large amounts of marked-up data; whether the markup is done by the researcher or someone else, human nature dictates that it will have errors in it. This paper will more fully characterise the problem and discuss whether and when (and how) to correct the errors. The discussion is illustrated with specific examples involving function tagging in the Penn treebank.</Paragraph>
  </Section>
</Paper>