<?xml version="1.0" standalone="yes"?> <Paper uid="W06-1618"> <Title>Sydney, July 2006. ©2006 Association for Computational Linguistics Identification of Event Mentions and their Semantic Class</Title> <Section position="4" start_page="0" end_page="146" type="intro"> <SectionTitle> 2 Related Efforts </SectionTitle> <Paragraph position="0"> Such aspectual distinctions have been alive and well in the linguistic literature since at least the late 1960s (Vendler, 1967). However, the use of the term event in natural language processing work has often diverged considerably from this linguistic notion. In the Topic Detection and Tracking (TDT) task, events were sets of documents that described &quot;some unique thing that happens at some point in time&quot; (Allan et al., 1998). In the Message Understanding Conference (MUC), events were groups of phrases that formed a template relating participants, times and places to each other (Marsh and Perzanowski, 1997). In the work of Filatova and Hatzivassiloglou (2003), events consisted of a verb and two named entities occurring together frequently across several documents on a topic.</Paragraph> <Paragraph position="1"> Several recent efforts have stayed close to the linguistic definition of events. One such example is the work of Siegel and McKeown (2000), which showed that machine learning models could be trained to identify some of the traditional linguistic aspectual distinctions. They manually annotated the verbs in a small set of texts as either state or event, and then used a variety of linguistically motivated features to train machine learning models that made the event/state distinction with 93.9% accuracy.</Paragraph> <Paragraph position="2"> Another closely related effort was the Evita system, developed by Saurí et al. (2005). This work considered a corpus of events called TimeBank, whose annotation scheme was motivated largely by the linguistic definitions of events. Saurí et al.
showed that a linguistically motivated and mainly rule-based algorithm could perform well at identifying these events.</Paragraph> <Paragraph position="3"> Our work draws on both Siegel and McKeown and Saurí et al. We consider the same TimeBank corpus as Saurí et al., but apply a statistical machine learning approach akin to that of Siegel and McKeown. We demonstrate that combining machine learning techniques with linguistically motivated features can produce models from the TimeBank data that are capable of making a variety of subtle aspectual distinctions.</Paragraph> </Section></Paper>