<?xml version="1.0" standalone="yes"?>
<Paper uid="N04-1002">
<Title>Cross-Document Coreference on a Large Scale Corpus</Title>
<Section position="3" start_page="0" end_page="0" type="intro">
<SectionTitle>2. Evaluation</SectionTitle>
<Paragraph position="0"> Given a collection of named entity mentions from documents, the coreference task is to put them into equivalence classes, where every mention in the same class refers to the same entity (person, location, organization, and so on). The classes are referred to as &quot;coreference chains&quot; because the mentions are chained together.</Paragraph>
<Paragraph position="1"> To evaluate the coreference chains emitted by a system, we need truth data: the chains of mentions that actually refer to the same entity. Evaluation then proceeds by comparing the true chains to the system's hypothesized chains.</Paragraph>
<Paragraph position="2"> We use the B-CUBED scoring algorithm (Bagga and Baldwin 1998) because it is the measure used in previously published research on this task. The algorithm works as follows.</Paragraph>
<Paragraph position="3"> For each entity mention e in the evaluation set, we first locate the truth chain TC that contains that mention (it can be in only one truth chain) and the system's hypothesized chain HC that contains it (again, there can be only one hypothesis chain). We then compute a precision and recall score for those two chains.</Paragraph>
<Paragraph position="4"> Precision is the proportion of mentions in HC that are also in TC, and recall is the proportion of mentions in TC that are also in HC. If the chains match perfectly, precision and recall will both be one. If the hypothesis chain contains only the single mention e, then its precision will be one and its recall will be 1/|TC|, the inverse of the size of the truth chain. Note that it is not possible to have a precision or recall of zero, since entity mention e is always common to the two chains. Our implementation of the B-CUBED algorithm is used specifically to evaluate an existing set of coreference chains and does not apply any smoothing to handle system output that contains no entities.</Paragraph>
<Paragraph position="5"> Overall precision and recall values are determined by averaging the individual values over all mentions in the evaluation set. These are the primary evaluation measures for cross-document coreference analysis.</Paragraph>
</Section>
</Paper>
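As a concrete illustration of the per-mention scoring described above, the following Python sketch computes B-CUBED precision and recall from a set of truth chains and a set of hypothesis chains. The function name b_cubed, the set-of-mention-ids representation, and the example chains are assumptions made for illustration; this is not the authors' implementation.

# Illustrative sketch (not the authors' code) of per-mention B-CUBED scoring.
# Assumes each chain is a set of mention ids and that every mention appears
# in exactly one truth chain and exactly one hypothesis chain.

def b_cubed(truth_chains, hypothesis_chains):
    """Return (precision, recall) averaged over all mentions."""
    # Map each mention to the single chain that contains it.
    truth_of = {m: chain for chain in truth_chains for m in chain}
    hyp_of = {m: chain for chain in hypothesis_chains for m in chain}

    precisions, recalls = [], []
    for mention, tc in truth_of.items():
        hc = hyp_of[mention]                  # hypothesis chain containing this mention
        overlap = len(tc & hc)                # mentions in both chains; always >= 1
        precisions.append(overlap / len(hc))  # proportion of HC that is also in TC
        recalls.append(overlap / len(tc))     # proportion of TC that is also in HC

    # Overall scores: average the per-mention values over the evaluation set.
    n = len(precisions)
    return sum(precisions) / n, sum(recalls) / n


# Example: truth groups {1,2,3} and {4}; the system hypothesized {1,2} and {3,4}.
truth = [{1, 2, 3}, {4}]
hypothesis = [{1, 2}, {3, 4}]
precision, recall = b_cubed(truth, hypothesis)
print(f"precision={precision:.3f} recall={recall:.3f}")  # precision=0.750 recall=0.667

In the example, mentions 1 and 2 score perfect precision but only 2/3 recall (their hypothesis chain misses mention 3), which is how the averaged B-CUBED scores penalize both over-splitting and over-merging of chains.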