File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/02/w02-1015_abstr.xml
Size: 731 bytes
Last Modified: 2025-10-06 13:42:38
<?xml version="1.0" standalone="yes"?> <Paper uid="W02-1015"> <Title>Handling noisy training and testing data</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> In the field of empirical natural language processing, researchers constantly deal with large amounts of marked-up data; whether the markup is done by the researcher or someone else, human nature dictates that it will have errors in it. This paper will more fully characterise the problem and discuss whether and when (and how) to correct the errors. The discussion is illustrated with specific examples involving function tagging in the Penn treebank.</Paragraph> </Section> </Paper>