<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-0304">
  <Title>User-directed Sentiment Analysis: Visualizing the Affective Content of Documents</Title>
  <Section position="4" start_page="0" end_page="23" type="intro">
    <SectionTitle>
2 Background
</SectionTitle>
    <Paragraph position="0"> At the AAAI Symposium on Attitude and Affect held at Stanford in 2004 (Qu et al., 2005), it was clear that the lexical approach to capturing affect was adequate for broad-brush results, but that there were no production-quality visualizations for presenting those results analytically. We therefore began exploring methods and tools for visualizing lexically based measures of affect, with the aim of facilitating the exploration of affect within a text collection.</Paragraph>
    <Section position="1" start_page="0" end_page="23" type="sub_section">
      <SectionTitle>
2.1 Affect Extraction
</SectionTitle>
      <Paragraph position="0"> Following the general methodology of information retrieval, there are two predominant methods for identifying sentiment in text: text classification models and lexical approaches.</Paragraph>
      <Paragraph position="1"> Classification models require that a set of documents be hand-labeled for affect and that a system be trained on the feature vectors associated with those labels. New text is then classified automatically by comparing its feature vectors with those of the training set (Pang &amp; Lee, 2004; Aue &amp; Gamon, 2005).</Paragraph>
      <Paragraph position="2"> This methodology generally requires a large amount of training data and is domain dependent. In the lexical approach, documents (Turney &amp; Littman, 2003), phrases (see Wilson et al., 2005), or sentences (Wiebe &amp; Riloff, 2005) are categorized, for example as positive or negative, based on the number of words in them that match a lexicon of sentiment-bearing terms. Major drawbacks of this approach include the contextual variability of sentiment (what is positive in one domain may not be in another) and incomplete coverage of the lexicon. The latter drawback is often circumvented by bootstrapping (Turney &amp; Littman, 2003; Wiebe &amp; Riloff, 2005), which allows one to build a larger lexicon, potentially one specific to a particular domain, from a small number of seed words.</Paragraph>
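      <Paragraph position="3"> As a minimal illustration of the lexical approach described above, the following Python sketch labels a text by counting the words that match a small sentiment lexicon. The lexicon entries, the function name lexical_polarity, the tokenization, and the tie-breaking rule are all assumptions made for this example; none of them is taken from the cited systems.

```python
import re

# Hypothetical toy lexicon for illustration only; real sentiment
# lexicons contain thousands of entries and may be domain specific.
POSITIVE = {"good", "great", "excellent", "happy", "love"}
NEGATIVE = {"bad", "poor", "terrible", "sad", "hate"}


def lexical_polarity(text):
    """Label text by counting words that match the sentiment lexicon."""
    # Lowercase the text and split it into alphabetic tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    # Ties and texts with no lexicon matches are left unlabeled.
    return "neutral"
```

Under these assumptions, a text whose sentiment-bearing words are all missing from the lexicon falls into the neutral case, which is one way the incomplete-coverage drawback shows up in practice.</Paragraph>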
    </Section>
    <Section position="2" start_page="23" end_page="23" type="sub_section">
      <SectionTitle>
2.2 Affect Visualization
</SectionTitle>
      <Paragraph position="0"> The uses of automatic sentiment classification are clear (public opinion, customer reviews, product analysis, etc.). However, there has been little research into visualizing affective content in ways that might aid data exploration and the analytic process.</Paragraph>
      <Paragraph position="1"> A number of visualizations are designed to reveal the emotional content of text, in particular text that is thought to be emotionally charged, such as conversational and chat room transcripts (see, for example, DiMicco et al., 2002; Tat &amp; Carpendale, 2002; Lieberman et al., 2004; Wang et al., 2004). Aside from the use of color and emoticons to explore individual documents (Liu et al., 2003) or email inboxes (Mandic &amp; Kerne, 2004), there are very few visualizations suitable for exploring the affect of large collections of text. One exception is the work of Liu et al. (2005), who provide a visualization tool for comparing product reviews using a bar graph metaphor. Their system automatically extracts product features (with associated affect) through parsing and part-of-speech (POS) tagging, although exceptional cases have to be handled individually.</Paragraph>
      <Paragraph position="2"> Their Opinion Observer is a powerful tool designed for a single purpose: comparing customer reviews.</Paragraph>
      <Paragraph position="3"> In this paper, we introduce a visual analytic tool designed to explore the emotional content of large collections of open domain documents. The tools described here work with document collections of all sizes, structures (html, xml, .doc, email, etc.), sources (private collections, web, etc.), and types. The visualization tool is mature and supports the analytical process by enabling users to explore the thematic content of the collection, query the collection in natural language, form groups, view documents by time, and so on. The ability to explore the emotional content of an entire collection of documents not only enables users to compare the range of affect in documents within the collection, but also allows them to relate affect to other dimensions in the collection, such as major topics and themes, time, and source.</Paragraph>
    </Section>
  </Section>
</Paper>