<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-2063">
  <Title>Automatic Identification of Pro and Con Reasons in Online Reviews</Title>
  <Section position="5" start_page="483" end_page="485" type="metho">
    <SectionTitle>
3 Finding Pros and Cons
</SectionTitle>
    <Paragraph position="0"> This section describes our approach for finding pro and con sentences given a review text.</Paragraph>
    <Paragraph position="1"> We first collect data from epinions.com and automatically label each sentences in the data set.</Paragraph>
    <Paragraph position="2"> We then model our system using one of the machine learning techniques that have been successfully applied to various problems in Natural Language Processing. This section also describes features we used for our model.</Paragraph>
    <Section position="1" start_page="483" end_page="484" type="sub_section">
      <SectionTitle>
3.1 Automatically Labeling Pro and Con
Sentences
</SectionTitle>
      <Paragraph position="0"> Among many web sites that have product reviews such as amazon.com and epinions.com, some of them (e.g. epinions.com) explicitly state pros and cons phrases in their respective categories by each review's author along with the review text. First, we collected a large set of &lt;review text, pros, cons&gt; triplets from epin- null ions.com. A review document in epinions.com consists of a topic (a product model, restaurant name, travel destination, etc.), pros and cons (mostly a few keywords but sometimes complete sentences), and the review text. Our automatic labeling system first collects phrases in pro and con fields and then searches the main review text in order to collect sentences corresponding to those phrases. Figure 1 illustrates the automatic  pros and cons sentences in a review.</Paragraph>
      <Paragraph position="1"> The system first extracts comma-delimited phrases from each pro and con field, generating two sets of phrases: {P1, P2, ..., Pn} for pros and {C1, C2, ..., Cm} for cons. In the example in Figure 1, &amp;quot;beautiful display&amp;quot; can be P i and &amp;quot;not something you want to drop&amp;quot; can be C j . Then the system compares these phrases to the sentences in the text in the &amp;quot;Full Review&amp;quot;. For each phrase in {P1, P2, ..., Pn} and {C1, C2, ..., Cm}, the system checks each sentence to find a sentence that covers most of the words in the phrase. Then the system annotates this sentence with the appropriate &amp;quot;pro&amp;quot; or &amp;quot;con&amp;quot; label. All remaining sentences with neither label are marked as &amp;quot;neither&amp;quot;. After labeling all the epinion data, we use it to train our pro and con sentence recognition system.</Paragraph>
    </Section>
    <Section position="2" start_page="484" end_page="485" type="sub_section">
      <SectionTitle>
3.2 Modeling with Maximum Entropy
Classification
</SectionTitle>
      <Paragraph position="0"> We use Maximum Entropy classification for the task of finding pro and con sentences in a given review. Maximum Entropy classification has been successfully applied in many tasks in natural language processing, such as Semantic Role labeling, Question Answering, and Information Extraction.</Paragraph>
      <Paragraph position="1"> Maximum Entropy models implement the intuition that the best model is the one that is consistent with the set of constraints imposed by the evidence but otherwise is as uniform as possible (Berger et al., 1996). We modeled the conditional probability of a class c given a feature vector x as follows:</Paragraph>
      <Paragraph position="3"> Z is a normalization factor which can be calculated by the following:</Paragraph>
      <Paragraph position="5"> is a feature function which has a binary value, 0 or 1.</Paragraph>
      <Paragraph position="7"> and higher value of the weight indicates that ),( xcf i is an important feature for a class c . For our system development, we used</Paragraph>
      <Paragraph position="9"> which implements the above intuition.</Paragraph>
      <Paragraph position="10"> In order to build an efficient model, we separated the task of finding pro and con sentences into two phases, each being a binary classification. The first is an identification phase and the second is a classification phase. For this 2-phase model, we defined the 3 classes of c listed in  con candidate sentences (CR and PR in Table 1) from sentences irrelevant to either of them (NR). The classification task then classifies candidates into pros (PR) and cons (CR). Section 5 reports system results of both phases.</Paragraph>
    </Section>
    <Section position="3" start_page="485" end_page="485" type="sub_section">
      <SectionTitle>
3.3 Features
</SectionTitle>
      <Paragraph position="0"> The classification uses three types of features: lexical features, positional features, and opinion-bearing word features.</Paragraph>
      <Paragraph position="1"> For lexical features, we use unigrams, bigrams, and trigrams collected from the training set. They investigate the intuition that there are certain words that are frequently used in pro and con sentences which are likely to represent reasons why an author writes a review. Examples of such words and phrases are: &amp;quot;because&amp;quot; and &amp;quot;that's why&amp;quot;.</Paragraph>
      <Paragraph position="2"> For positional features, we first find paragraph boundaries in review texts using html tags such as &lt;br&gt; and &lt;p&gt;. After finding paragraph boundaries, we add features indicating the first, the second, the last, and the second last sentence in a paragraph. These features test the intuition used in document summarization that important sentences that contain topics in a text have certain positional patterns in a paragraph (Lin and Hovy, 1997), which may apply because reasons like pros and cons in a review document are most important sentences that summarize the whole point of the review.</Paragraph>
      <Paragraph position="3"> For opinion-bearing word features, we used pre-selected opinion-bearing words produced by a combination of two methods. The first method derived a list of opinion-bearing words from a large news corpus by separating opinion articles such as letters or editorials from news articles which simply reported news or events. The second method calculated semantic orientations of words based on WordNet  synonyms. In our previous work (Kim and Hovy, 2005), we demonstrated that the list of words produced by a combination of those two methods performed very well in detecting opinion bearing sentences. Both algorithms are described in that paper.</Paragraph>
      <Paragraph position="4"> The motivation for including the list of opinion-bearing words as one of our features is that pro and con sentences are quite likely to contain opinion-bearing expressions (even though some of them are only facts), such as &amp;quot;The waiting time was horrible&amp;quot; and &amp;quot;Their portion size of food was extremely generous!&amp;quot; in restaurant reviews. We presumed pro and con sentences containing only facts, such as &amp;quot;The battery lasted 3 hours, not 5 hours like they advertised&amp;quot;, would be captured by lexical or positional features.</Paragraph>
      <Paragraph position="5"> In Section 5, we report experimental results with different combinations of these features.</Paragraph>
      <Paragraph position="7"> Table 2 summarizes the features we used for our model and the symbols we will use in the rest of this paper.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="485" end_page="486" type="metho">
    <SectionTitle>
4 Data
</SectionTitle>
    <Paragraph position="0"> We collected data from two different sources: epinions.com and complaints.com</Paragraph>
    <Section position="1" start_page="485" end_page="486" type="sub_section">
      <SectionTitle>
(see Section
3.1 for details about review data in epinion.com).
</SectionTitle>
      <Paragraph position="0"> Data from epinions.com is mostly used to train the system whereas data from complaints.com is to test how the trained model performs on new data.</Paragraph>
      <Paragraph position="1"> Complaints.com includes a large database of publicized consumer complaints about diverse products, services, and companies collected for over 6 years. Interestingly, reviews in complaint.com are somewhat different from many other web sites which are directly or indirectly linked to Internet shopping malls such as amazon.com and epinions.com. The purpose of reviews in complaints.com is to share consumers' mostly negative experiences and alert businesses to customers feedback. However, many reviews in Internet shopping mall related reviews are positive and sometimes encourage people to buy more products or to use more services.</Paragraph>
      <Paragraph position="2"> Despite its significance, however, there is no hand-annotated data that we can use to build a system to identify reasons of complaints.com. In order to solve this problem, we assume that reasons in complaints reviews are similar to cons in other reviews and therefore if we are, somehow, able to build a system that can identify cons from  reviews, we can apply it to identify reasons in complaints reviews. Based on this assumption, we learn a system using the data from epinions.com, to which we can apply our automatic data labeling technique, and employ the resulting system to identify reasons from reviews in complaint.com. The following sections describe each data set.</Paragraph>
    </Section>
    <Section position="2" start_page="486" end_page="486" type="sub_section">
      <SectionTitle>
4.1 Dataset 1: Automatically Labeled Data
</SectionTitle>
      <Paragraph position="0"> We collected two different domains of reviews from epinions.com: product reviews and restaurant reviews. As for the product reviews, we collected 3241 reviews (115029 sentences) about mp3 players made by various manufacturers such as Apple, iRiver, Creative Lab, and Samsung.</Paragraph>
      <Paragraph position="1"> We also collected 7524 reviews (194393 sentences) about various types of restaurants such as family restaurants, Mexican restaurants, fast food chains, steak houses, and Asian restaurants. The average numbers of sentences in a review document are 35.49 and 25.89 respectively.</Paragraph>
      <Paragraph position="2"> The purpose of selecting one of electronics products and restaurants as topics of reviews for our study is to test our approach in two extremely different situations. Reasons why consumers like or dislike a product in electronics' reviews are mostly about specific and tangible features. Also, there are somewhat a fixed set of features of a specific type of product, for example, ease of use, durability, battery life, photo quality, and shutter lag for digital cameras. Consequently, we can expect that reasons in electronics' reviews may share those product feature words and words that describe aspects of features such as short or long for battery life. This fact might make the reason identification task easy.</Paragraph>
      <Paragraph position="3"> On the other hand, restaurant reviewers talk about very diverse aspects and abstract features as reasons. For example, reasons such as &amp;quot;You feel like you are in a train station or a busy amusement park that is ill-staffed to meet demand!&amp;quot;, &amp;quot;preferential treatment given to large groups&amp;quot;, and &amp;quot;they don't offer salads of any kind&amp;quot; are hard to predict. Also, they seem rarely share common keyword features.</Paragraph>
      <Paragraph position="4"> We first automatically labeled each sentence in those reviews collected from each domain with the features described in Section 3.1. We divided the data for training and testing. We then trained our model using the training set and tested it to see if the system can successfully label sentences in the test set.</Paragraph>
    </Section>
    <Section position="3" start_page="486" end_page="486" type="sub_section">
      <SectionTitle>
4.2 Dataset 2: Complaints.com Data
</SectionTitle>
      <Paragraph position="0"> From the database  in complaints.com, we searched for the same topics of reviews as Data-set 1: 59 complaints reviews about mp3 players and 322 reviews about restaurants  . We tested our system on this dataset and compare the results against human judges' annotation results. Subsection 5.2 reports the evaluation results.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>