<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1018">
  <Title>Playing the Telephone Game: Determining the Hierarchical Structure of Perspective and Speech Expressions</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Newswire text has long been a primary target for natural language processing (NLP) techniques such as information extraction, summarization, and question answering (e.g. MUC (1998); NIS (2003); DUC (2003)). However, newswire does not offer direct access to facts, events, and opinions; rather, journalists report what they have experienced, and report on the experiences of others. That is, facts, events, and opinions are filtered by the point of view of the writer and other sources. Unfortunately, this filtering of information through multiple sources (and multiple points of view) complicates the natural language interpretation process because the reader (human or machine) must take into account the biases introduced by this indirection. It is important for understanding both newswire and narrative text (Wiebe, 1994), therefore, to appropriately recognize expressions of point of view, and to associate them with their direct and indirect sources.</Paragraph>
    <Paragraph position="1"> This paper introduces two kinds of expression that can filter information. First, we define a perspective expression to be the minimal span of text that denotes the presence of an explicit opinion, evaluation, emotion, speculation, belief, sentiment, etc. [Footnote 1: Note that implicit expressions of perspective, i.e. Wiebe et al.'s (2003) &quot;expressive subjective elements&quot;, are not the subject of study here.] Private state is the general term typically used to refer to these mental and emotional states that cannot be directly observed or verified (Quirk et al., 1985). Further, we define the source of a perspective expression to be the experiencer of that private state, that is, the person or entity whose opinion or emotion is being conveyed in the text. Second, speech expressions simply convey the words of another individual - and by the choice of words, the reporter filters the original source's intent. Consider for example, the following sentences (in which perspective expressions are denoted in bold, speech expressions are underlined, and sources are denoted in italics):
      1. Charlie was angry at Alice's claim that Bob was unhappy.
      2. Philip Clapp, president of the National Environment Trust, sums up well the general thrust of the reaction of environmental movements: &quot;There is no reason at all to believe that the polluters are suddenly going to become reasonable.&quot;</Paragraph>
    <Paragraph position="2"> Perspective expressions in Sentence 1 describe the emotions or opinion of three sources: Charlie's anger, Bob's unhappiness, and Alice's belief. Perspective expressions in Sentence 2, on the other hand, introduce the explicit opinion of one source, i.e. the reaction of the environmental movements.</Paragraph>
    <Paragraph position="3"> Speech expressions also perform filtering in these examples. The reaction of the environmental movements is filtered by Clapp's summarization, which, in turn, is filtered by the writer's choice of quotation. In addition, the fact that Bob was unhappy is filtered through Alice's claim, which, in turn, is filtered by the writer's choice of words for the sentence. Similarly, it is only according to the writer that Charlie is angry.</Paragraph>
    <Paragraph position="4"> The specific goal of the research described here is to accurately identify the hierarchical structure of perspective and speech expressions (pse's) in text. [Footnote 2: For the rest of this paper, then, we ignore the distinction between perspective and speech expressions, so in future ex- ...]</Paragraph>
    <Paragraph position="5"> Given sentences 1 and 2 and their pse's, for example, we will present methods that produce the structures shown in Figure 1, which represent the multi-stage information filtering that should be taken into account in the interpretation of the text.</Paragraph>
    <Paragraph position="6"> [Figure 1: ... speech expressions in sentences 1 and 2] We propose a supervised machine learning approach to the problem that relies on a small set of syntactically-based features. More specifically, the method first trains a binary classifier to make pairwise parent-child decisions among the pse's in the same sentence, and then combines the decisions to determine their global hierarchical structure. We compare the approach to two heuristic-based baselines: one that simply assumes that every pse is filtered only through the writer, and a second that is based on syntactic dominance relations in the associated parse tree. In an evaluation using the opinion-annotated NRRC corpus (Wiebe et al., 2002), the learning-based approach achieves an accuracy of 78.30%, significantly higher than both the simple baseline approach (65.57%) and the parse-based baseline (71.64%). We believe that this study provides a first step towards understanding the multi-stage filtering process that can bias and garble the information present in newswire text.</Paragraph>
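    <Paragraph> To make the two-step idea concrete, the sketch below combines pairwise parent-child scores into one parent per pse, alongside the writer-only baseline. It is an illustration under stated assumptions, not the method of this paper: the introduction does not say how the pairwise decisions are combined, so the greedy highest-score rule, the function names, the (writer) label and the toy scores are all hypothetical. The score function stands in for a trained binary classifier's confidence.

```python
from typing import Callable, Dict, List

ROOT = "(writer)"  # hypothetical label for the writer's implicit speech event


def writer_only_baseline(pses: List[str]) -> Dict[str, str]:
    """Simple baseline: every pse is filtered only through the writer."""
    return {pse: ROOT for pse in pses}


def combine_pairwise(
    pses: List[str], score: Callable[[str, str], float]
) -> Dict[str, str]:
    """Attach each pse to its highest-scoring candidate parent.

    `score(parent, child)` is an assumed stand-in for the classifier's
    confidence that `parent` is the parent of `child`. This greedy rule
    is one possible combination strategy and does not check for cycles.
    """
    parents = {}
    for child in pses:
        candidates = [ROOT] + [p for p in pses if p != child]
        parents[child] = max(candidates, key=lambda p: score(p, child))
    return parents


# Toy scores, invented for illustration only (Sentence 1).
toy_scores = {
    ("(writer)", "angry"): 0.9,
    ("(writer)", "claim"): 0.8,
    ("claim", "unhappy"): 0.7,
    ("(writer)", "unhappy"): 0.4,
}


def toy_score(parent: str, child: str) -> float:
    # Pairs missing from the toy table get a score of 0.0.
    return toy_scores.get((parent, child), 0.0)


pses = ["angry", "claim", "unhappy"]
print(writer_only_baseline(pses))
print(combine_pairwise(pses, toy_score))
```

    With these toy scores the greedy rule attaches unhappy to claim, whereas the writer-only baseline attaches all three pse's directly to the writer.</Paragraph>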
    <Paragraph position="7"> The rest of the paper is organized as follows. We present related work in Section 2 and describe the machine learning approach in Section 3. The experimental methodology and results are presented in Sections 4 and 5, respectively. Section 6 summarizes our conclusions and plans for future work.</Paragraph>
  </Section>
</Paper>