File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/04/w04-3254_intro.xml
Size: 6,226 bytes
Last Modified: 2025-10-06 14:02:50
<?xml version="1.0" standalone="yes"?> <Paper uid="W04-3254"> <Title>Evaluating information content by factoid analysis: human annotation and stability</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 2 Data and factoid annotation </SectionTitle> <Paragraph position="0"> We use two texts: a 600-word BBC report on the killing of the Dutch politician Pim Fortuyn (as used in van Halteren and Teufel (2003)), which contains a mix of factual information and personal reactions, and a 573-word article on the Iraqi invasion of Kuwait (used in DUC-2002, LA080290-0233).</Paragraph> <Paragraph position="1"> For these two texts, we collected human-written generic summaries of roughly 100 words. Our guidelines asked the human subjects to formulate the summary in their own words, in order to elicit different linguistic expressions for the same information. Knowledge about the variability of expression is important for both evaluation and system building.</Paragraph> <Paragraph position="2"> The Fortuyn text was summarised by 40 Dutch students (another 20 summaries of the same source were removed due to insufficient English or excessive length) and 10 NLP researchers (native or near-native English speakers), resulting in a total of 50 summaries. For the Kuwait text, we used the 6 DUC-provided summaries, summaries by 17 ELSNET-02 student participants (7 summaries removed), and summaries by 4 additional researchers, resulting in a total of 20 summaries. We use atomic semantic units called factoids to represent the meaning of a sentence. For instance, we represent the sentence &quot;The police have arrested a white Dutch man&quot; by the union of the following factoids: Factoids are defined empirically based on the data in the set of summaries we work with. Factoid definition starts with the comparison of the information contained in two summaries, and gets refined (factoids get added or split) incrementally as other summaries are considered. 
If two pieces of information occur together in all summaries, and within the same sentence, they are treated as one factoid, because differentiation into more than one factoid would not help us in distinguishing the summaries. In our data, there must have been at least one summary that contained either only FP25 or only FP26; otherwise those factoids would have been combined into a single factoid &quot;FP27 The suspect is a Dutch man&quot;. Factoids are labelled with descriptions in natural language; initially, these are close in wording to the factoid's occurrence in the first summaries, though the annotator tries to identify paraphrases of the factoid information when they occur in other summaries and to treat them equally.</Paragraph> <Paragraph position="3"> Our definition of atomicity implies that the &quot;amount&quot; of information associated with one factoid can vary from a single word to an entire sentence. An example of a large chunk of information that occurred atomically in our texts was the fact that Fortuyn wanted to become PM (FV71), a factoid which covers an entire sentence. On the other hand, a single word may break down into more than one factoid.</Paragraph> <Paragraph position="4"> If (together with various statements in other summaries) one summary contains &quot;was killed&quot; and another &quot;was shot dead&quot;, we identify the The first summary contains only the first two factoids, whereas the second contains all three. That way, the semantic similarity between related sentences can be expressed.</Paragraph> <Paragraph position="5"> When we identified factoids in our summary collections, most factoids turned out to be independent of each other. But when dealing with naturally occurring documents many difficult cases appear, e.g. 
ambiguous expressions, slight differences in numbers and meaning, and inference.</Paragraph> <Paragraph position="6"> Another difficult phenomenon is attribution.</Paragraph> <Paragraph position="7"> In both source texts, quotations of the reactions of several politicians and officials are given, and the subjects often generalised these reactions and produced statements such as &quot;Dutch as well as international politicians have expressed their grief and disbelief.&quot; Due to coordination of speakers (in the subject) and coordination of reactions (in the direct object), it is hard to accurately represent the attribution of opinions. We therefore introduce combinatorial factoids, such as &quot;OG40 Politicians expressed grief&quot; and &quot;OS62 International persons/organizations expressed disbelief&quot;, which can be combined with similar factoids to express the above sentence.</Paragraph> <Paragraph position="8"> We wrote guidelines (10 pages long) which describe how to derive factoids from texts. The guidelines cover questions such as: how to create generalising factoids when numerical values vary (summaries might talk about &quot;200&quot;, &quot;about 200&quot; or &quot;almost 200 Kuwaitis were killed&quot;), how to create factoids dealing with attribution of opinion, and how to deal with coordination of NPs in subject position, cataphors and other syntactic constructions. We believe that written guidelines should contain all the rules by which this process is done; this is the only way that other annotators, who do not have access to all the discussions the original annotators had, can replicate the annotation with high agreement. 
We therefore consider the guidelines one of the most valuable outcomes of this exercise, and we will make them and our annotated material generally available.</Paragraph> <Paragraph position="9"> The advantage of our empirical, summary-set-dependent definition of factoid atomicity is that the annotation is more objective than if factoids had to be invented by intuition of semantic constructions from scratch. One possible disadvantage of our definition of atomicity is that the set of factoids used may have to be adjusted if new summaries are judged, as a required factoid might be missing, or an existing one might require splitting. Using a large number of gold-standard summaries for the definition of factoids decreases the likelihood of this happening.</Paragraph> </Section></Paper>