<?xml version="1.0" standalone="yes"?>
<Paper uid="P05-1040">
<Title>Detecting Errors in Discontinuous Structural Annotation</Title>
<Section position="7" start_page="327" end_page="328" type="concl">
<SectionTitle>6 Summary and Outlook</SectionTitle>
<Paragraph position="0">We have described the first method for finding errors in corpora with graph annotations. We showed how the variation n-gram method can be extended to discontinuous structural annotation, and how this can be done efficiently and with as high a precision as reported for continuous syntactic annotation. Our experiments with the TIGER corpus show that generalizing the context to part-of-speech tags increases recall while keeping precision above 50%.</Paragraph>
<Paragraph position="1">The method can thus have a substantial practical benefit when preparing a corpus with discontinuous annotation.</Paragraph>
<Paragraph position="2">Extending the error detection method to handle discontinuous constituents, as we have done, has significant potential for future work given the increasing number of free word order languages for which corpora and treebanks are being developed.</Paragraph>
<Paragraph position="3">Acknowledgements: We are grateful to George Smith and Robert Langner of the University of Potsdam TIGER team for evaluating the variation we detected in the samples. We would also like to thank the three ACL reviewers for their detailed and helpful comments, and the participants of the OSU CLippers meetings for their encouraging feedback.</Paragraph>
</Section>
</Paper>