File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/04/c04-1018_metho.xml
Size: 17,971 bytes
Last Modified: 2025-10-06 14:08:40
<?xml version="1.0" standalone="yes"?> <Paper uid="C04-1018"> <Title>Playing the Telephone Game: Determining the Hierarchical Structure of Perspective and Speech Expressions</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 The Larger Problem and Related Work </SectionTitle> <Paragraph position="0"> This paper addresses the problem of identifying the hierarchical structure of perspective and speech expressions. We view this as a necessary and important component of a larger perspective-analysis amples, both types of pse appear in boldface. Note that the acronym 'pse' has been used previously with a different mean- null notes pse's with the writer as source. &quot;No parse&quot; denotes pse's in sentences where the parse failed, and so the part of speech could not be determined.</Paragraph> <Paragraph position="1"> number of pse's number of sentences system. Such a system would be able to identify all pse's in a document, as well as identify their structure. The system would also identify the direct source of each pse. Finally, the system would identify the text corresponding to the content of a private state or the speech expressed by a pse.3 Such a system might analyze sentence 2 as follows: (source: writer pse: (implicit speech event) content: Philip ... reasonable.&quot;) (source: clapp pse: sums up content: &quot;There ... reasonable.&quot;) (source: environmental movements pse: reaction content: (no text)) As far as we are aware, no single system exists that simultaneously solves all these problems. There is, however, quite a bit of work that addresses various pieces of this larger task, which we will now survey.</Paragraph> <Paragraph position="2"> Gerard (2000) proposes a computational model of the reader of a news article. Her model provides for multiple levels of hierarchical beliefs, such as the nesting of a primary source's belief within that of a reporter. However, Gerard does not provide algorithms for extracting this structure directly from newswire texts.</Paragraph> <Paragraph position="3"> Bethard et al. (2004) seek to extract propositional 3In (Wiebe, 2002), this is referred to as the inside. opinions and their holders. They define an opinion as &quot;a sentence, or part of a sentence that would answer the question 'How does X feel about Y?' &quot; A propositional opinion is an opinion &quot;localized in the propositional argument&quot; of certain verbs, such as &quot;believe&quot; or &quot;realize&quot;. Their task then corresponds to identifying a pse, its associated direct source, and the content of the private state. However, they consider as pse's only verbs, and further restrict attention to verbs with a propositional argument, which is a subset of the perspective and speech expressions that we consider here. Table 1, for example, shows the diversity of word classes that correspond to pse's in our corpus. Perhaps more importantly for the purposes of this paper, their work does not address information filtering issues, i.e. problems that arise when an opinion has been filtered through multiple sources. Namely, Bethard et al. (2004) do not consider sentences that contain multiple pse's, and do not, therefore, need to identify any indirect sources of opinions. As shown in Table 2, however, we find that sentences with multiple non-writer pse's (i.e. sentences that contain 3 or more total pse's) comprise a significant portion (29.98%) of our corpus. An advantage over our work, however, is that Bethard et al. 
(2004) do not require separate solutions to pse identification and the identification of their direct sources.</Paragraph> <Paragraph position="4"> Automatic identification of sources has also been addressed indirectly by Gildea and Jurafsky's (2002) work on semantic role identification, in that finding sources often corresponds to finding the filler of the agent role for verbs. Their methods might therefore be used to identify sources and associate them with pse's that are verbs or portions of verb phrases. Whether their work will also apply to pse's that are realized as other parts of speech is an open question. Wiebe (1994) studies methods to track the change of &quot;point of view&quot; in narrative text (fiction); that is, the &quot;writer&quot; of one sentence may not correspond to the writer of the next sentence. Although this is not as frequent in newswire text as in fiction, it will still need to be addressed in a solution to the larger problem.</Paragraph> <Paragraph position="5"> Bergler (1993) examines the lexical semantics of speech event verbs in the context of generative lexicon theory. While not specifically addressing our problem, the &quot;semantic dimensions&quot; of reporting verbs that she extracts might be very useful as features in our approach.</Paragraph> <Paragraph position="6"> Finally, Wiebe et al. (2003) present preliminary results for the automatic identification of perspective and speech expressions using corpus-based techniques. While the results are promising (66% F-measure), the problem is still clearly unsolved. As explained below, we will instead rely on manually tagged pse's for the studies presented here.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 The Approach </SectionTitle> <Paragraph position="0"> Our task is to find the hierarchical structure among the pse's in individual sentences. One's first impression might be that this structure should be obvious from the syntax: one pse should filter another roughly when it dominates the other in a dependency parse. This heuristic, for example, would succeed for &quot;claim&quot; and &quot;unhappy&quot; in sentence 1, whose pse structure is given in Figure 1 and parse structure (as produced by the Collins parser) in Figure 2. (For this heuristic and the features that follow, we will speak of the pse's as if they had a position in the parse tree. However, since pse's are often multiple words, and do not necessarily form a constituent, this is not entirely accurate. The parse node corresponding to a pse is taken to be the highest node in the dependency parse corresponding to a word in the pse, and the writer's implicit pse is taken to correspond to the root of the parse.) Even in sentence 1, though, we can see that the problem is more complex: &quot;angry&quot; dominates &quot;claim&quot; in the parse tree, but does not filter it. Unfortunately, an analysis of the parse-based heuristic on our training data (the data set will be described in Section 4) uncovered numerous, rather than just a few, sources of error. Therefore, rather than trying to handcraft a more complex collection of heuristics, we chose to adopt a supervised machine learning approach that relies on features identified in this analysis. In particular, we will first train a binary classifier to make pairwise decisions as to whether a given pse is the immediate parent of another. We then use a simple approach to combine these decisions to find the hierarchical information-filtering structure of all pse's in a sentence.</Paragraph>
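<Paragraph> As a concrete illustration of this baseline, the following minimal sketch (ours, not code from the paper) expresses the parse-based heuristic; it assumes the dependency parse is given as a map from each token to its head (with the root mapped to itself) and that each pse has already been associated with a parse node as described above.
# Illustrative sketch of the naive parse-based heuristic (not the authors' code).
def dominates(head_of, ancestor, node):
    # Climb head links from `node`; `ancestor` dominates `node` if the walk reaches it.
    while head_of[node] != node:
        node = head_of[node]
        if node == ancestor:
            return True
    return node == ancestor

def heuristic_filters(head_of, pse_node_a, pse_node_b):
    # Naive rule: pse A is predicted to filter pse B iff A's node dominates B's node.
    return dominates(head_of, pse_node_a, pse_node_b)
As noted above, this rule fails on cases such as &quot;angry&quot; and &quot;claim&quot; in sentence 1, which motivates the feature-based classifier described next. </Paragraph>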
<Paragraph position="1"> We assume that we have a training corpus of sentences, annotated with pse's and their hierarchical pse structure (Section 4 describes the corpus). Training instances for the binary classifier are pairs of pse's from the same sentence, <psetarget,pseparent>. We assign a class value of 1 to a training instance if pseparent is the immediate parent of psetarget in the manually annotated hierarchical structure for the sentence, and 0 otherwise. For sentence 1, nine training instances are generated: <claim,writer>, <angry,writer>, <unhappy,claim> (class 1), <claim,angry>, <claim,unhappy>, <angry,claim>, <angry,unhappy>, <unhappy,writer>, <unhappy,angry> (class 0). The features used to describe each training instance are explained below.</Paragraph> <Paragraph position="2"> During testing, we construct the hierarchical pse structure of an entire sentence as follows. For each pse in the sentence, we ask the binary classifier to judge every other pse as a potential parent, and choose the pse with the highest confidence. Finally, we join these immediate-parent links to form a tree. (The resulting graph of immediate-parent links might not be a tree, i.e. it might be cyclic or disconnected. Since this occurs very rarely (5 out of 9808 sentences on the test data), we do not attempt to correct any non-tree graphs.) One might also try comparing pairs of potential parents for a given pse, or other more direct means of ranking potential parents. We chose what seemed to be the simplest method for this first attempt at the problem.</Paragraph>
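<Paragraph> To make the instance generation and the tree-assembly step concrete, here is a minimal sketch (our own reconstruction; the function and variable names, and the confidence function, are hypothetical, and the actual classifier in this work is a decision tree over the features of Section 3.1):
# Illustrative sketch (not the authors' code) of the pairwise set-up.
def make_training_instances(non_writer_pses, writer_pse, immediate_parent):
    # One instance per ordered pair; the writer pse is a candidate parent
    # but never a target. Label is 1 only for the annotated immediate parent.
    candidates = [writer_pse] + list(non_writer_pses)
    instances = []
    for target in non_writer_pses:
        for parent in candidates:
            if parent != target:
                label = 1 if immediate_parent.get(target) == parent else 0
                instances.append(((target, parent), label))
    return instances

def predict_structure(non_writer_pses, writer_pse, confidence):
    # Attach each pse to the candidate parent the classifier is most confident
    # about; joining these links usually (though not always) yields a tree.
    candidates = [writer_pse] + list(non_writer_pses)
    links = {}
    for target in non_writer_pses:
        others = [p for p in candidates if p != target]
        links[target] = max(others, key=lambda p: confidence(target, p))
    return links
For sentence 1 this generates exactly the nine instances listed above, three of them with class 1. </Paragraph>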
<Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.1 Features </SectionTitle> <Paragraph position="0"> Here we motivate and describe the 23 features used in our model. Unless otherwise stated, all features are binary (1 if the described condition is true, 0 otherwise).</Paragraph> <Paragraph position="1"> Parse-based features (6). Based on the performance of the parse-based heuristic, we include a pseparent-dominates-psetarget feature in our feature set. To compensate for parse errors, however, we also include a variant of this feature that is 1 if the parent of pseparent dominates psetarget.</Paragraph> <Paragraph position="2"> Many filtering expressions filter pse's that occur in their complements, but not in adjuncts. Therefore, we add variants of the previous two syntax-based features that denote whether the parent node dominates psetarget, but only if the first dependency relation is an object relation.</Paragraph> <Paragraph position="3"> For similar reasons, we include a feature calculating the domination relation based on a partial parse. Consider the following sentence: 3. He was criticized more than recognized for his policy.</Paragraph> <Paragraph position="4"> One of &quot;criticized&quot; or &quot;recognized&quot; will be the root of this dependency parse, thus dominating the other, and suggesting (incorrectly) that it filters the other pse. Because a partial parse does not attach all constituents, such spurious dominations are eliminated. The partial parse feature is 1 for fewer instances than pseparent-dominates-psetarget, but it is more indicative of a positive instance when it is 1.</Paragraph> <Paragraph position="5"> So that the model can adjust when the parse is not present, we include a feature that is 1 for all instances generated from sentences on which the parser failed.</Paragraph> <Paragraph position="6"> Positional features (5). Forcing the model to decide whether pseparent is the parent of psetarget without knowledge of the other pse's in the sentence is somewhat artificial. We therefore include several features that encode the relative position of pseparent and psetarget in the sentence. Specifically, we add a feature that is 1 if pseparent is the root of the parse (and similarly for psetarget). We also include a feature giving the ordinal position of pseparent among the pse's in the sentence, relative to psetarget (-1 means pseparent is the pse that immediately precedes psetarget, 1 means immediately following, and so forth). To allow the model to vary when there are more potential parents to choose from, we include a feature giving the total number of pse's in the sentence.</Paragraph> <Paragraph position="7"> Special parents and lexical features (6). Some particular pse's are special, so we specify indicator features for four types of parents: the writer pse, and the lexical items &quot;said&quot; (the most common non-writer pse) and &quot;according to&quot;. &quot;According to&quot; is special because it is generally not very high in the parse, but semantically tends to filter everything else in the sentence.</Paragraph> <Paragraph position="8"> In addition, we include as features the parts of speech of pseparent and psetarget (reduced to noun, verb, adjective, adverb, or other), since we intuitively expected distinct parts of speech to behave differently in their filtering.</Paragraph> <Paragraph position="9"> Genre-specific features (6). Finally, journalistic writing contains a few special forms that are not always parsed accurately. Examples are: 4. &quot;Alice disagrees with me,&quot; Bob argued.</Paragraph> <Paragraph position="10"> 5. Charlie, she noted, dislikes Chinese food.</Paragraph> <Paragraph position="11"> The parser may not recognize that &quot;noted&quot; and &quot;argued&quot; should dominate all other pse's in sentences 4 and 5, so we attempt to recognize when a sentence falls into one of these two patterns. For <disagrees,argued> generated from sentence 4, features pseparent-pattern-1 and psetarget-pattern-1 would be 1, while for <dislikes,noted> generated from sentence 5, feature pseparent-pattern-2 would be 1. We also add features that denote whether the pse in question falls between matching quote marks. Finally, a simple feature indicates whether pseparent is the last word in the sentence.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.2 Resources </SectionTitle> <Paragraph position="0"> We rely on a variety of resources to generate our features. The corpus (see Section 4) is distributed with annotations for sentence breaks, tokenization, and part-of-speech information automatically generated by the GATE toolkit (Cunningham et al., 2002). (GATE's sentences sometimes extend across paragraph boundaries, which seems never to be warranted. Since inaccurately joining sentences adds noise to our problem, we split GATE's sentences at paragraph boundaries and introduce writer pse's for the newly created sentences.) For parsing we use the Collins (1999) parser. (We convert the parse to a dependency format that makes some of our features simpler, using a method similar to the one described in Xia and Palmer (2001). We also employ a method from Adam Lopez at the University of Maryland to find grammatical relationships between words, such as subject and object.) For partial parses, we employ CASS (Abney, 1997). Finally, we use a simple finite-state recognizer to identify (possibly nested) quoted phrases.</Paragraph>
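<Paragraph> The quoted-phrase recognition can be approximated very simply; the sketch below is our own illustration (not the recognizer actually used) and assumes Penn-Treebank-style quote tokens, tracking opening quotes on a stack so that nested quotations each receive their own span.
# Illustrative sketch of a nested-quote recognizer (not the one used in this work).
def quoted_spans(tokens, open_quote="``", close_quote="''"):
    # Returns (start, end) token index pairs for each quoted phrase,
    # including phrases nested inside other quotations.
    spans, stack = [], []
    for i, tok in enumerate(tokens):
        if tok == open_quote:
            stack.append(i)
        elif tok == close_quote and stack:
            spans.append((stack.pop(), i))
    return spans
A pse's between-matching-quotes feature can then be set by checking whether its token index falls strictly inside one of the returned spans. </Paragraph>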
<Paragraph position="1"> For classifier construction, we use the IND package (Buntine, 1993) to train decision trees (we use the mml tree style, a minimum message length criterion with Bayesian smoothing).</Paragraph> </Section> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 Data Description </SectionTitle> <Paragraph position="0"> The data for these experiments come from version 1.1 of the NRRC corpus (Wiebe et al., 2002). (The original corpus is available at http://nrrc.mitre.org/NRRC/Docs_Data/MPQA_04/approval_mpqa.htm; code and data used in our experiments are available at http://www.cs.cornell.edu/~ebreck/breck04playing/.) The corpus consists of 535 newswire documents (mostly from the FBIS), of which we used 66 (1375 sentences) for developing the heuristics and features, while keeping the remaining 469 (9808 sentences) blind (used for 10-fold cross-validation).</Paragraph> <Paragraph position="1"> Although the NRRC corpus provides annotations for all pse's, it does not provide annotations that directly denote their hierarchical structure within a sentence. This structure must be extracted from an attribute of each pse annotation, which lists the pse's direct and indirect sources. For example, the &quot;source chain&quot; for &quot;unhappy&quot; in sentence 1 would be (writer, Alice, Bob). The source chains allow us to automatically recover the hierarchical structure of the pse's: the parent of a pse with source chain (s0, s1, ..., sn-1, sn) is the pse with source chain (s0, s1, ..., sn-1). Unfortunately, ambiguities can arise. Consider the following sentence: 6. Bob said, &quot;you're welcome&quot; because he was glad to see that Mary was happy.</Paragraph> <Paragraph position="2"> Both &quot;said&quot; and &quot;was glad&quot; have the source chain (writer, Bob), while &quot;was happy&quot; has the source chain (writer, Bob, Mary). It is therefore not clear from the manual annotations whether &quot;was happy&quot; should have &quot;was glad&quot; or &quot;said&quot; as its parent. 5.82% of the pse's have ambiguous parentage (i.e. the recovery step finds a set of parents P(pse) with |P(pse)| > 1). For training, we assign a class value of 1 to all instances <pse,par>, par ∈ P(pse). For testing, if an algorithm attaches pse to any element of P(pse), we score the link as correct (see Section 5.1). Since ultimately our goal is to find the sources through which information is filtered (rather than the pse's themselves), we believe this is justified.</Paragraph>
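<Paragraph> The recovery of parents from source chains can be stated compactly. The sketch below is our own illustration under an assumed representation in which each pse in a sentence is mapped to its source chain as a tuple, e.g. ("writer", "Bob", "Mary"):
# Illustrative sketch of recovering candidate parents from source chains.
def candidate_parents(target_id, chain_of):
    # The parent of a pse with chain (s0, ..., sn-1, sn) is the pse whose
    # chain is (s0, ..., sn-1); more than one match means ambiguous parentage.
    parent_chain = chain_of[target_id][:-1]
    return {pid for pid, chain in chain_of.items()
            if pid != target_id and chain == parent_chain}
For sentence 6 this returns both &quot;said&quot; and &quot;was glad&quot; as candidate parents of &quot;was happy&quot;, exactly the ambiguous case handled above by accepting any element of P(pse). </Paragraph>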
<Paragraph position="3"> For training and testing, we used only those sentences that contain at least two non-writer pse's; for all other sentences, there is only one way to construct the hierarchical structure. Again, Table 2 presents a breakdown (for the test set) of the number of pse's per sentence; thus we use only approximately one-third of the sentences in the corpus.</Paragraph> </Section> </Paper>