<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-1651">
  <Title>Joint Extraction of Entities and Relations for Opinion Recognition</Title>
  <Section position="4" start_page="431" end_page="432" type="intro">
    <SectionTitle>
2 High-Level Approach and Related Work
</SectionTitle>
    <Paragraph position="0"> Our system operates in three phases. Opinion and Source Entity Extraction: We begin by developing two separate token-level sequence-tagging classifiers for opinion expression extraction and source extraction, using linear-chain Conditional Random Fields (CRFs) (Lafferty et al., 2001). The sequence-tagging classifiers are trained using only local syntactic and lexical information to extract each type of entity without knowledge of any nearby or neighboring entities or relations. We collect n-best sequences from each sequence tagger in order to boost the recall of the final system.</Paragraph>
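As a rough illustration of how n-best tagger output can raise recall, the sketch below pools the entity spans found in several candidate tag sequences for one sentence. It is not the authors' implementation: the B/I/O tag encoding, the function names spans_from_bio and pool_nbest, and the choice to keep the highest sequence probability per span are all assumptions made for the example.

```python
def spans_from_bio(tags):
    """Return the set of (start, end) token spans in a B/I/O tag sequence.

    The end index is exclusive. The B/I/O encoding is an assumption of this sketch.
    """
    spans, start = set(), None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel "O" closes a trailing span
        if start is not None and tag != "I":
            spans.add((start, i))  # the current span ends before token i
            start = None
        if tag == "B":
            start = i
        elif tag == "I" and start is None:
            start = i  # tolerate an I tag that has no preceding B
    return spans


def pool_nbest(nbest):
    """Pool spans over n-best sequences given as (tags, probability) pairs.

    Each span keeps the highest probability of any sequence that contains it.
    """
    pooled = {}
    for tags, prob in nbest:
        for span in spans_from_bio(tags):
            pooled[span] = max(prob, pooled.get(span, 0.0))
    return pooled
```

Calling pool_nbest on the three best sequences for a sentence returns every span proposed by any of them, so an entity missed by the single best sequence can still reach the later phases.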
    <Paragraph position="1"> Link Relation Classification: We also develop a relation classifier that is trained and tested on all pairs of opinion and source entities extracted from the aforementioned n-best opinion expression and source sequences. The relation classifier is modeled using Markov order-0 CRFs (Lafferty et al., 2001), which are equivalent to maximum entropy models. It is trained using only local syntactic information potentially useful for connecting a pair of entities, but has no knowledge of nearby or neighboring extracted entities and link relations. (Footnote 2: Wiebe et al. (2005) report human annotation agreement for opinion expressions as 82.0 by F1 measure.)</Paragraph>
    <Paragraph position="2"> Integer Linear Programming: Finally, we formulate an integer linear programming problem for each sentence using the results from the previous two phases. In particular, we specify a number of soft and hard constraints among relations and entities that take into account the confidence values provided by the supporting entity and relation classifiers, and that encode a number of heuristics to ensure coherent output. Given these constraints, global inference via ILP finds the optimal, coherent set of opinion-source pairs by exploiting mutual dependencies among the entities and relations. While good performance in entity or relation extraction can contribute to better performance of the final system, this is not always the case. Punyakanok et al. (2004) note that, in general, it is better to have high recall from the classifiers included in the ILP formulation. For this reason, it is not our goal to directly optimize the performance of our opinion and source entity extraction models or our relation classifier.</Paragraph>
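To make the idea of per-sentence global inference concrete, the following sketch solves a tiny 0-1 program over candidate entities and links. It is only an illustration under stated assumptions: the three hard constraints are hypothetical stand-ins rather than the constraints of Section 5, soft constraints are omitted, the scores stand in for classifier confidences, and exhaustive enumeration replaces a real ILP solver, which is feasible only for a handful of variables.

```python
from itertools import product


def solve_sentence(opinions, sources, links, overlaps):
    """Choose the highest-scoring coherent set of entities and links.

    opinions, sources: dict from entity id to score (ids must be distinct).
    links: dict from (opinion_id, source_id) to score.
    overlaps: set of frozensets of entity ids that cannot be kept together.
    All constraints below are assumptions made for this sketch.
    """
    names = list(opinions) + list(sources) + list(links)
    scores = {**opinions, **sources, **links}
    best_value, best_choice = float("-inf"), None
    # Enumerate every 0/1 assignment; a real system would call an ILP solver.
    for bits in product((0, 1), repeat=len(names)):
        x = dict(zip(names, bits))
        # A link can be on only if both of its entities are on.
        if any(x[(o, s)] > min(x[o], x[s]) for (o, s) in links):
            continue
        # Every kept opinion is linked to exactly one source.
        if any(sum(x[(o2, s)] for (o2, s) in links if o2 == o) != x[o]
               for o in opinions):
            continue
        # Overlapping candidate entities are mutually exclusive.
        if any(sum(x[e] for e in group) > 1 for group in overlaps):
            continue
        value = sum(scores[n] * x[n] for n in names)
        if value > best_value:
            best_value, best_choice = value, x
    return best_value, {n for n, on in best_choice.items() if on}
```

Because the entity and link variables are scored jointly, a strong link score can rescue a weakly scored entity and a weak one can suppress it, which is the kind of mutual dependency that independent classifiers cannot express.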
    <Paragraph position="3"> The rest of the paper is organized as follows.</Paragraph>
    <Paragraph position="4"> Related work is outlined below. Section 3 describes the components of the first phase of our system, the opinion and source extraction classifiers. Section 4 describes the construction of the link relation classifier for phase two. Section 5 describes the ILP formulation used to perform global inference over the results from the previous two phases. Experimental results that compare our ILP approach to a number of baselines are presented in Section 6. Section 7 describes how semantic role labeling (SRL) can be incorporated into our global inference system to further improve performance. Final experimental results and discussion comprise Section 8.</Paragraph>
    <Paragraph position="5"> Related Work: The definition of our source-expresses-opinion task is similar to that of Bethard et al. (2004); however, our definitions of opinion and source entities are much more extensive, going beyond single sentences and propositional opinion expressions. In particular, we evaluate our approach with respect to (1) a wide variety of opinion expressions, (2) explicit and implicit sources, (3) multiple opinion-source link relations per sentence, and (4) link relations that span more than one sentence. (Footnote 3: Implicit sources are those that are not explicitly mentioned. See Section 8 for more details.)</Paragraph>
    <Paragraph position="6"> In addition, the link relation model explicitly exploits mutual dependencies among entities and relations, while Bethard et al. (2004) do not directly capture the potential influence among entities.</Paragraph>
    <Paragraph position="7"> Kim and Hovy (2005b) and Choi et al. (2005) focus only on the extraction of sources of opinions, without extracting opinion expressions.</Paragraph>
    <Paragraph position="8"> Specifically, Kim and Hovy (2005b) assume a priori existence of the opinion expressions and extract a single source for each.</Paragraph>
    <Paragraph position="9"> Choi et al. (2005), in contrast, neither explicitly extract opinion expressions nor link an opinion expression to a source, even though their model implicitly learns approximations of opinion expressions in order to identify opinion sources.</Paragraph>
    <Paragraph position="10"> Other previous research focuses only on the extraction of opinion expressions (e.g., Kim and Hovy (2005a), Munson et al. (2005) and Wilson et al. (2005)), omitting source identification altogether.</Paragraph>
    <Paragraph position="11"> There have also been previous efforts to simultaneously extract entities and relations by exploiting their mutual dependencies. Roth and Yih (2002) formulated global inference using a Bayesian network, where they captured the influence between a relation and a pair of entities via the conditional probability of a relation, given a pair of entities. This approach, however, could not exploit dependencies between relations. Roth and Yih (2004) later formulated global inference using integer linear programming, which is the approach that we apply here. In contrast to our work, Roth and Yih (2004) operated in the domain of factual information extraction rather than opinion extraction, and assumed that the exact boundaries of entities from the gold standard are known a priori, which may not be available in practice.</Paragraph>
  </Section>
</Paper>