File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/06/w06-0306_metho.xml
Size: 20,244 bytes
Last Modified: 2025-10-06 14:10:34
<?xml version="1.0" standalone="yes"?> <Paper uid="W06-0306"> <Title>Searching for Sentences Expressing Opinions by using Declaratively Subjective Clues</Title> <Section position="4" start_page="39" end_page="42" type="metho"> <SectionTitle> 2 Declaratively Subjective Clues </SectionTitle> <Paragraph position="0"> Declaratively subjective clues are a basic criterion for judging whether a sentence expresses an opinion. We extracted the declaratively subjective clues from Japanese sentences that evaluators judged to be opinions.</Paragraph> <Section position="1" start_page="39" end_page="40" type="sub_section"> <SectionTitle> 2.1 Opinion-expressing Sentence Judgment </SectionTitle> <Paragraph position="0"> We regard a sentence to be &quot;opinion expressing&quot; if it explicitly declares the writer's idea or belief at a sentence level. We define as a &quot;declaratively subjective clue&quot;, the part of a sentence that contributes to explicitly conveying the writer's idea or belief in the opinion-expressing sentence. For example, &quot;I am glad&quot; in the sentence &quot;I am glad to see you&quot; can convey the writer's pleasure to a reader, so we regard the sentence as an &quot;opinion-expressing sentence&quot; and &quot;I am glad&quot; as a &quot;declaratively subjective clue.&quot; Another example of a declaratively subjective clue is the exclamation mark in the sentence &quot;We got a contract!&quot; It conveys the writer's emotion about the event to a reader.</Paragraph> <Paragraph position="1"> If a sentence only describes something abstract or concrete even though it has word-level or phrase-level subjective parts, we do not consider it to be opinion expressing. 
On the other hand, some word-level or phrase-level subjective parts can be declaratively subjective clues depending on where they occur in the sentence.</Paragraph> <Paragraph position="2"> Consider the following two sentences.</Paragraph> <Paragraph position="3"> (1) This house is beautiful.</Paragraph> <Paragraph position="4"> (2) We purchased a beautiful house.</Paragraph> <Paragraph position="5"> Both (1) and (2) contain the word-level subjective part &quot;beautiful&quot;. Our criterion would lead us to say that sentence (1) is an opinion, because &quot;beautiful&quot; is placed in the predicate part and (1) is considered to declare the writer's evaluation of the house to a reader. This is why &quot;beautiful&quot; in (1) is eligible as a declaratively subjective clue. On the other hand, sentence (2) is not judged to contain an opinion, because &quot;beautiful&quot; is placed in the noun phrase, i.e., the object of the verb &quot;purchase,&quot; and (2) is considered to report the event of the house purchase rather objectively to a reader. Sentence (2) partially contains subjective information about the beauty of the house; however, this information is unlikely to be what the writer wants to emphasize. Thus, &quot;beautiful&quot; in (2) does not work as a declaratively subjective clue.</Paragraph> <Paragraph position="6"> These two sentences illustrate that the presence of a subjective word (&quot;beautiful&quot;) does not by itself guarantee that the sentence expresses an opinion. They also suggest that whether a sentence contains an opinion can be judged by whether word-level or phrase-level subjective parts, such as evaluative adjectives, are placed in the predicate part. Other word-level or phrase-level subjective parts, such as subjective sentential adverbs, can be declaratively subjective clues depending on where they occur in the sentence. 
In sentence (3), &quot;amazingly&quot; expresses the writer's feeling about the event. Sentence (3) is judged to contain an opinion because there is a subjective sentential adverb in its main clause.</Paragraph> <Paragraph position="7"> (3) Amazingly, few people came to my party.</Paragraph> <Paragraph position="8"> The existence of some idiomatic collocations in the main clause also affects our judgment as to what constitutes an opinion-expressing sentence. For example, sentence (4) can be judged as expressing an opinion because it includes &quot;my wish is&quot;.</Paragraph> <Paragraph position="9"> (4) My wish is to go abroad.</Paragraph> <Paragraph position="10"> Thus, depending on the type of declaratively subjective clue, it is necessary to consider where the expression is placed in the sentence to judge whether the sentence is an opinion.</Paragraph> </Section> <Section position="2" start_page="40" end_page="42" type="sub_section"> <SectionTitle> 2.2 Clue Expression Collection </SectionTitle> <Paragraph position="0"> We collected declaratively subjective clues in opinion-expressing sentences from Japanese web pages. Figure 1 illustrates the flow of collection of eligible expressions.</Paragraph> <Paragraph position="1"> [Table 1. Type: query's topic. Product: cell phone, car, beer, cosmetic. Entertainment: sports, movie, game, animation.] First, we retrieved Japanese web pages from forty queries covering a wide range of topics such as products, entertainment, facilities, and phenomena, as shown in Table 1. We used queries on various topics because we wanted to acquire declaratively subjective clues for open-domain opinion web searches. Most of the queries contain proper nouns. 
These queries correspond to possible situations in which a user wants to retrieve opinions from web pages about a particular topic, such as &quot;Cell phone X,&quot; &quot;Y museum,&quot; and &quot;Football coach Z's ability&quot;, where X, Y, and Z are proper nouns.</Paragraph> <Paragraph position="2"> Next, opinion-expressing sentences were extracted from the top twenty retrieved web pages for each query, 800 pages in total. There were 75,575 sentences in these pages.</Paragraph> <Paragraph position="3"> Three evaluators judged whether each sentence contained an opinion or not. The 13,363 sentences judged to do so by all three evaluators were very likely to be opinion expressing. The number of sentences that all three evaluators agreed on as non-opinion expressing was 42,346.</Paragraph> <Paragraph position="4"> Out of the 13,363 opinion-expressing sentences, 8,425 were then used to extract declaratively subjective clues and to serve as positive training examples for a Support Vector Machine (SVM), and 4,938 were used to assess the performance of opinion-expressing sentence search (Section 4). Out of the 42,346 non-opinion sentences, 26,340 were used as negative training examples, and 16,006 were used for assessment, keeping the ratio of positive to negative example sentences across training and assessment.</Paragraph> <Paragraph position="5"> One analyst extracted declaratively subjective clues from 8,425 of the 13,363 opinion-expressing sentences, and another analyst checked the result. (Footnote: Note that not all of the opinion-expressing sentences retrieved were closely related to the query, because some of the pages described miscellaneous topics.)</Paragraph> <Paragraph position="6"> The number of declaratively subjective clues obtained was 2,936. These clues were classified into fourteen types as shown in Table 2, where the underlined expressions in example sentences are extracted as declaratively subjective clues. 
The example sentences in Table 2 are Japanese opinion-expressing sentences and their English translations. Although some English counterparts of Japanese clue expressions might not be cogent because of the characteristic difference between Japanese and English, the clue types are likely to be language-independent. We can see that various types of expressions compose opinion-expressing sentences.</Paragraph> <Paragraph position="7"> As mentioned in Section 2.1, it is important to check where a declaratively subjective clue appears in the sentence in order to apply our criterion of whether the sentence is an opinion or not. The clues in the types other than (b), (c) and (l) usually appear in the predicate part of a main clause.</Paragraph> <Paragraph position="8"> The declaratively subjective clues in Japanese examples are placed in the rear parts of sentences except in types (b), (c) and (l). This reflects the heuristic rule that Japanese predicate parts are in principle placed in the rear part of a sentence.</Paragraph> </Section> </Section> <Section position="5" start_page="42" end_page="43" type="metho"> <SectionTitle> 3 Opinion-Sentence Extraction </SectionTitle> <Paragraph position="0"> In this section, we explain the method of classifying each sentence by using declaratively subjective clues.</Paragraph> <Paragraph position="1"> The simplest method for automatically judging whether a sentence is an opinion is a rule-based one that extracts sentences that include declaratively subjective clues. However, as mentioned in Section 2, the existence of declaratively subjective clues does not assure that the sentence expresses an opinion. It is a daunting task to write rules that describe how each declaratively subjective clue should appear in an opinion-expressing sentence. A more serious problem is that an insufficient collection of declaratively subjective clues will lead to poor extraction performance. 
For that reason, we adopted a learning method that performs binary classification of sentences, using declaratively subjective clues and their positions in sentences as feature parameters of an SVM.</Paragraph> <Paragraph position="2"> With this method, a consistent framework of classification can be maintained even if we add new declaratively subjective clues, and it becomes possible to extract opinion-expressing sentences that contain unknown declaratively subjective clues.</Paragraph> <Section position="1" start_page="42" end_page="42" type="sub_section"> <SectionTitle> 3.1 Augmentation by Semantic Categories </SectionTitle> <Paragraph position="0"> Before we can use declaratively subjective clues as feature parameters, we must address two issues: * Cost of building a corpus: It is costly to provide a tagged corpus with enough opinion-expressing-sentence labels for learning to achieve high extraction performance. * Coverage of words co-occurring with declaratively subjective clues: Many of the declaratively subjective clue expressions have co-occurring words in the opinion-expressing sentence. Consider the following two sentences.</Paragraph> <Paragraph position="1"> (5) The sky is high.</Paragraph> <Paragraph position="2"> (6) The quality of this product is high. Both (5) and (6) contain the word &quot;high&quot; in the predicate part. Sentence (5) is considered to be less of an opinion than (6) because an evaluator might judge (5) to be the objective truth, while all evaluators are likely to judge (6) to be an opinion. Whether the adjective &quot;high&quot; in the predicate part counts as a declaratively subjective clue thus depends on the co-occurring words.</Paragraph> <Paragraph position="3"> However, it is not realistic to provide all possible co-occurring words with each declaratively subjective clue expression. Semantic categories can be of help in dealing with the above two issues. 
Declaratively subjective clue expressions can be augmented by semantic categories of the words in the expressions. An augmentation involving both declaratively subjective clues and co-occurrences increases the number of feature parameters. In our implementation, we adopted the semantic categories proposed by Ikehara et al. (1997). Utilization of semantic categories has another effect: it improves the extraction performance. Consider the following two sentence patterns:</Paragraph> <Paragraph position="4"> (7) X is beautiful. (8) X is pretty.</Paragraph> <Paragraph position="5"> The words &quot;beautiful&quot; and &quot;pretty&quot; are adjectives in the common semantic category &quot;appearance&quot;, and the degree of declarative subjectivity of these sentences is almost the same regardless of what X is. Therefore, even if &quot;beautiful&quot; is learned as a declaratively subjective clue but &quot;pretty&quot; is not, the semantic category &quot;appearance&quot;, to which the learned word &quot;beautiful&quot; belongs, enables (8) to be judged opinion expressing as well as (7).</Paragraph> </Section> <Section position="2" start_page="42" end_page="43" type="sub_section"> <SectionTitle> 3.2 Feature Parameters to Learn </SectionTitle> <Paragraph position="0"> We implemented our opinion-sentence extraction method by using a Support Vector Machine (SVM) because an SVM can efficiently learn the model for classifying sentences into opinion-expressing and non-opinion-expressing, based on the combinations of multiple feature parameters.</Paragraph> <Paragraph position="1"> The following are the crucial feature parameters of our method.</Paragraph> <Paragraph position="2"> * 2,936 declaratively subjective clues; * 2,715 semantic categories that words in a sentence can fall into. If the sentence has a declaratively subjective clue of type (b), (c) or (l) in Table 2, the feature parameter about the clue is assigned a value of 1; if not, it is assigned 0. 
If the sentence has declaratively subjective clues belonging to types other than (b), (c) or (l) in the predicate part, the feature parameter about the clue is assigned 1; if not, it is assigned 0.</Paragraph> <Paragraph position="3"> The feature parameters for the semantic category are used to compensate for the insufficient amount of declaratively subjective clues provided and to consider co-occurring words with clue expressions in the opinion-expressing sentences, as mentioned in Section 3.1.</Paragraph> <Paragraph position="4"> The following are additional feature parameters: frequent words and parts of speech.</Paragraph> <Paragraph position="6"> Each feature parameter is assigned a value of 1 if the sentence has any of the frequent words or parts of speech. We added these feature parameters based on the hypotheses that some frequent words in Japanese have the function of changing the degree of declarative subjectivity, and that the existence of such parts of speech as adjectives and adverbs possibly influences the declarative subjectivity. The effectiveness of these additional feature parameters was confirmed in our preliminary experiment.</Paragraph> </Section> </Section> <Section position="6" start_page="43" end_page="44" type="metho"> <SectionTitle> 4 Experiments </SectionTitle> <Paragraph position="0"> We conducted three experiments to assess the validity of the proposed method: comparison with baseline methods, effectiveness of position information in SVM feature parameters, and effectiveness of SVM feature parameters such as declaratively subjective clues and semantic categories. All experiments were performed using the Japanese sentences described in Section 2.2. As a training set, we used the 8,425 opinion-expressing sentences from which the declaratively subjective clues were collected, and as a test set, we used the 4,938 remaining opinion-expressing sentences. We also used 26,340 non-opinion sentences as a training set and 16,006 non-opinion sentences as a test set. 
The test set was divided into ten equal subsets. The experiments were evaluated with the following measures following the variable scheme in Table 3: We evaluated ten subsets with the above measures and took the average of these results.</Paragraph> <Section position="1" start_page="43" end_page="44" type="sub_section"> <SectionTitle> 4.1 Comparison with Baseline Methods </SectionTitle> <Paragraph position="0"> We first performed an experiment comparing two baseline methods with our proposed method.</Paragraph> <Paragraph position="1"> We prepared a baseline method that regards a sentence as an opinion if it contains a number of declaratively subjective clues that exceeds a certain threshold. The best threshold was set through trial and error at five occurrences. We also prepared another baseline method that learns a model and classifies a sentence using only features about a bag of words.</Paragraph> <Paragraph position="2"> The experimental results are shown in Table 4.</Paragraph> <Paragraph position="3"> It can be seen that our method performs better than the two baseline methods. Though the difference between our method's results and those of the bag-of-words method seems rather small, the superiority of the proposed method cannot be rejected at the significance level of 5% in a t-test.</Paragraph> </Section> </Section> <Section position="7" start_page="44" end_page="45" type="metho"> <SectionTitle> 4.2 Feature Parameters with Position Information </SectionTitle> <Paragraph position="0"> We inspected the effect of position information of 2,936 declaratively subjective clues based on the heuristic rule that a Japanese predicate part almost always appears in the last ten words in a sentence. Instead of more precisely identifying predicate position from parsing information, we employed this heuristic rule as a feature parameter in the SVM learner for practical reasons. Table 5 lists the experimental results. 
&quot;All words&quot; indicates that all feature parameters are permitted at any position in the sentence. &quot;Last 10 words&quot; indicates that all feature parameters are permitted only if they occur within the last ten words in the sentence.</Paragraph> <Paragraph position="1"> We can see that feature parameters with position information perform better than those without position information in all evaluations. This result supports our claim that the position of the feature parameters is important for judging whether a sentence is an opinion or not.</Paragraph> <Paragraph position="2"> However, the difference between the two results was not significant at the significance level of 5%. In the &quot;Last 10 words&quot; experiment, the 422 declaratively subjective clues of types (b), (c) and (l) in Table 2, which can appear at any position in a sentence, were restricted to the same condition as the other 2,514 declaratively subjective clues. The fact that applying the same position restriction to all declaratively subjective clues slightly improved performance suggests that assigning an individual position condition to each declaratively subjective clue could improve performance further.</Paragraph> <Section position="1" start_page="44" end_page="45" type="sub_section"> <SectionTitle> 4.3 Effect of Feature Parameters </SectionTitle> <Paragraph position="0"> The third experiment was designed to ascertain the effects of declaratively subjective clues and semantic categories. The declaratively subjective clues and semantic categories were employed as feature parameters for the SVM learner. 
The effect of each particular feature parameter can be seen by using it without the other feature parameter, because the feature parameters are independent of each other.</Paragraph> <Paragraph position="1"> The experimental results are shown in Table 6.</Paragraph> <Paragraph position="2"> The first row shows trials using only frequent words and parts of speech as feature parameters.</Paragraph> <Paragraph position="3"> &quot;Y&quot; in the first and second columns indicates exclusive use of declaratively subjective clues and semantic categories as the feature parameters, respectively. For instance, we can determine the effect of declaratively subjective clues by comparing the first row with the second row.</Paragraph> <Paragraph position="4"> The results show the effects of declaratively subjective clues and semantic categories. The results of the first row show that the method using only frequent words and parts of speech as the feature parameters cannot precisely classify subjective sentences. Additionally, the last row of the results clearly shows that using both declaratively subjective clues and semantic categories as the feature parameters is the most effective. The difference between the last row of the results and the other rows cannot be rejected even at the significance level of 5%.</Paragraph> </Section> </Section> </Paper>
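The clue feature assignment described in Section 3.2, combined with the last-ten-words heuristic of Section 4.2, can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the `Clue` class, the `any_position` flag (standing in for clue types (b), (c) and (l)), the function names, and the English token sequences are all hypothetical, and semantic-category, frequent-word and part-of-speech features are omitted.

```python
from dataclasses import dataclass

# Hypothetical clue record for illustration; the paper's 2,936 Japanese
# clues and their Table 2 types are not reproduced here.
@dataclass(frozen=True)
class Clue:
    tokens: tuple        # clue expression as a sequence of tokens
    any_position: bool   # True stands in for types (b), (c) and (l)

PREDICATE_WINDOW = 10    # "last ten words" heuristic from Section 4.2


def contains(tokens, pattern):
    """Return True if pattern occurs as a contiguous run inside tokens."""
    n = len(pattern)
    return any(tuple(tokens[i:i + n]) == pattern
               for i in range(len(tokens) - n + 1))


def clue_features(tokens, clues):
    """Build one binary feature (1 or 0) per clue for a tokenized sentence."""
    tail = tokens[-PREDICATE_WINDOW:]  # approximate predicate part
    features = []
    for clue in clues:
        # Position-free clues are searched in the whole sentence,
        # all other clues only in the approximate predicate part.
        scope = tokens if clue.any_position else tail
        features.append(1 if contains(scope, clue.tokens) else 0)
    return features


# Assumed example clues: a sentential adverb (position-free) and an
# evaluative adjective (predicate part only).
clues = [Clue(("amazingly",), True), Clue(("beautiful",), False)]

short = "this house is beautiful".split()
long_ = ("the beautiful house that we purchased last year "
         "in the suburbs is now on sale").split()
adverb = ("amazingly few people came to my party although "
          "the weather was fine that day").split()

print(clue_features(short, clues))
print(clue_features(long_, clues))
print(clue_features(adverb, clues))
```

In this sketch, "beautiful" in the long sentence falls outside the final ten tokens and therefore yields 0, whereas the position-free adverb is detected at the start of its sentence. The resulting vectors would then be passed to an SVM learner of the reader's choice.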