<?xml version="1.0" standalone="yes"?> <Paper uid="N04-1037"> <Title>The (Non)Utility of Predicate-Argument Frequencies for Pronoun Interpretation</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 2 Corpora Used </SectionTitle> <Paragraph position="0"> The training and test data sets came from the newspaper and newswire segments of the Automatic Content Extraction (ACE) program corpus. The training data contained 2773 annotated third-person pronouns, and the test data (the February 2002 evaluation set) contained 762 annotated third-person pronouns. The performance statistics on the test data reported here are from the only time an evaluation with this data was performed; progress during development was estimated solely via jackknifing on the training data.</Paragraph> <Paragraph position="1"> Footnote 2: The difference amounted to 9 additional correct predictions in a corpus of 360 examples. They express a belief that the improvement is real, but acknowledge that they would need twice as many examples in their corpus to reach statistical significance.</Paragraph> <Paragraph position="2"> The annotated pronouns included only those that were ACE &quot;markables&quot;, i.e., ones that referred to entities of the following types: Persons, Organizations, GeoPoliticalEntities (politically defined geographical regions, their governments, or their people), Locations, and Facilities. Thus, there were pronouns in both the development and (presumably) test sets for which there were no annotations. As such, certain problems that real-world systems face, such as non-referential (e.g., 'pleonastic') pronouns and pronouns that refer to eventualities, did not have to be dealt with. (However, these pronouns were possible antecedents to other pronouns, and thus were sometimes mistakenly selected as the correct antecedent.) 
Thus, our results are not necessarily comparable to those of a system that deals with these difficulties (although previous work varies a fair bit on how their data sets were filtered in this regard). Our main purpose here is to establish a state-of-the-art baseline with which to assess the contribution of predicate-argument frequency information.</Paragraph> </Section> </Paper>