<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-1311">
  <Title>Event Coreference for Information Extraction</Title>
  <Section position="2" start_page="0" end_page="75" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Much recent work on anaphora has concentrated on coreference between objects referred to by noun phrases or pronouns (see, e.g., Botley and McEnery (1997)). But coreference involving events, expressed via verbs or nominalised verb forms, is also common, and can play an important role in practical applications of natural language processing (NLP) systems.</Paragraph>
    <Paragraph position="1"> One application area of increasing interest is information extraction (IE) (see, e.g., Cowie and Lehnert (1996)). Information extraction systems attempt to fill predefined template structures with information extracted from short natural language texts, such as newswire articles. The prototypical IE tasks are those specified in the Message Understanding Conference (MUC) evaluations (DARPA, 1995; Grishman and Sundheim, 1996). In these exercises the main template filling task centres around a 'scenario' which is defined in terms of a key event type and various roles pertaining to it. Examples of scenarios used in previous MUCs include joint venture announcements, microprocessor product announcements, terrorist attacks, labour negotiations, and management succession events. In order not to spuriously overgenerate event instances and to properly acquire all available role information, it is crucial that multiple references to the same event be correctly identified and merged. While these concerns are of central importance to IE systems, they are clearly of significance for any NLP system, and more broadly for any computational model of natural language.</Paragraph>
    <Paragraph position="2"> A few concrete examples will make the issues clearer. A management succession event (as used in MUC-6) may involve the two separate events of a corporate position being vacated by one person and then filled by another. For an event to be considered reportable for the IE task, the post, the company and at least one person (either incoming or outgoing) must all be identifiable in the text.</Paragraph>
    <Paragraph position="3"> The first thing to note here is that while management succession events are sometimes reported as single, simple events, as in (1) Mr. Jones succeeds M. James Bird, 50, as president of Wholistic Therapy.</Paragraph>
    <Paragraph position="4"> more frequently multiple aspects or sub-events of a single succession event are identified in separate clauses by separate verb phrases or nominalised forms: (2) Daniel Wood was named president and chief executive officer of EFC Records Group, a unit of London's Spear EFC PLC. He succeeds Charles Paulson, who was recently made chairman and chief executive officer of EFC Records Group. (3) The sell-off followed the resignation late Monday of Freddie Heller, the president of Renard Broadcasting Co. Yesterday, Renard named Susan B. Kempham, chairman of Renard Inc.'s television production arm, to succeed him.</Paragraph>
    <Paragraph position="5"> Both of these pairs of sentences refer to a single management succession event (though the second sentence in 2 also identifies a further one). Such event/sub-event relations are similar to the familiar part-whole or related-object anaphora exemplified in sentences such as The airplane crashed after the wings fell off or When John entered the kitchen the stove was on (Allen, 1987).</Paragraph>
    <Paragraph position="6"> The second thing to note is the variety of surface forms used to refer to events. Events are referred to by verb phrases in main clauses (1 above), and in relative clauses (second sentence in 2) or subordinate clauses. They may be referred to through nominalised forms (resignation in 3 above) or through infinitival forms in control sentences (second sentence in 3). When there are multiple references to the same event, antecedent and anaphor appear to be able to adopt all combinations of these forms.</Paragraph>
    <Paragraph position="7"> This paper discusses an approach to handling event coreference as implemented in the LaSIE information extraction system (Gaizauskas et al., 1995; Gaizauskas and Humphreys, 1997b). Within this system, event coreference is handled as a natural extension to object coreference, outlined here and described in detail in Gaizauskas and Humphreys (1997a). Both mechanisms are handled within a general approach to discourse and world modelling.</Paragraph>
    <Paragraph position="8"> In the next section we give a brief overview of the LaSIE system. Section 3 describes in more detail the approach to world and discourse modelling within LaSIE and Section 4 details our coreference procedure. In Section 5 we discuss a particular example in detail and show how our approach enables us to correctly corefer multiple event references. Section 6 presents the results of evaluating the approach, and Section 7 concludes the paper with some general discussion.</Paragraph>
  </Section>
</Paper>