File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/relat/06/e06-1026_relat.xml
Size: 4,739 bytes
Last Modified: 2025-10-06 14:15:50
<?xml version="1.0" standalone="yes"?>
<Paper uid="E06-1026">
<Title>Latent Variable Models for Semantic Orientations of Phrases</Title>
<Section position="3" start_page="201" end_page="201" type="relat">
<SectionTitle>2 Related Work</SectionTitle>
<Paragraph position="0"> We briefly review related work from two viewpoints: the classification of word pairs and the identification of semantic orientation.</Paragraph>
<Section position="1" start_page="201" end_page="201" type="sub_section">
<SectionTitle>2.1 Classification of Word Pairs</SectionTitle>
<Paragraph position="0"> Torisawa (2001) used a probabilistic model to identify the appropriate case for a pair consisting of a noun and a verb when the case of the noun-verb pair is unknown. The model is the same as Probabilistic Latent Semantic Indexing (PLSI) (Hofmann, 2001), a generative probabilistic model of two random variables. Torisawa's method is similar to ours in that a latent variable model is applied to word pairs; the objective, however, is different from ours. In addition, we use not the original PLSI but an extended version of it, which is more suitable for the task of classifying the semantic orientations of phrases.</Paragraph>
<Paragraph position="1"> Fujita et al. (2004) addressed the detection of incorrect case assignment in automatically paraphrased sentences. They reduced the task to classifying pairs of a verb and a noun with a case as correct or incorrect. They first obtained a latent semantic space with PLSI and then applied the nearest-neighbors method, using the latent variables as features. Fujita et al.'s method differs from ours, and also from Torisawa's, in that the probabilistic model is used only for feature extraction.</Paragraph>
</Section>
<Section position="2" start_page="201" end_page="201" type="sub_section">
<SectionTitle>2.2 Identification of Semantic Orientations</SectionTitle>
<Paragraph position="0"> The semantic orientation classification of words has been pursued by several researchers (Hatzivassiloglou and McKeown, 1997; Turney and Littman, 2003; Kamps et al., 2004; Takamura et al., 2005). However, no computational model for semantically oriented phrases has been proposed to date, although research with a similar purpose has been conducted.</Paragraph>
<Paragraph position="1"> Some researchers have used sequences of words as features in classifying documents according to semantic orientation. Pang et al. (2002) used bigrams. Matsumoto et al. (2005) used sequential patterns and tree patterns. Although such patterns proved effective for document classification, the semantic orientations of the patterns themselves are not considered.</Paragraph>
<Paragraph position="2"> Suzuki et al. (2006) used the Expectation-Maximization algorithm and the naive Bayes classifier to incorporate unlabeled data into the classification of three-term evaluative expressions. They focused on the use of context information such as neighboring words and emoticons. Turney (2002) applied an internet-based technique, originally developed for word sentiment classification, to the semantic orientation classification of phrases. In this method, the number of hits returned by a search engine for a query consisting of a phrase and a seed word (e.g., &quot;phrase NEAR good&quot;) is used to determine the orientation.
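As a point of reference, here is a minimal sketch of such a hit-count approach. The hits function is a hypothetical stand-in for a search-engine query, and the seed words and the PMI-style log-ratio are illustrative assumptions rather than Turney's exact formulation.

```python
import math

def hits(query: str) -> int:
    # Hypothetical stand-in: a real implementation would query a web
    # search API and return the number of matching documents.
    raise NotImplementedError

def semantic_orientation(phrase: str,
                         pos_seed: str = "good",
                         neg_seed: str = "bad") -> float:
    # Compare how often the phrase co-occurs with a positive versus a
    # negative seed word; a positive score suggests positive orientation.
    # Add-one smoothing guards against zero hit counts for the queries.
    pos = hits(f"{phrase} NEAR {pos_seed}") + 1
    neg = hits(f"{phrase} NEAR {neg_seed}") + 1
    # Normalizing by the seeds' overall hit counts keeps a very frequent
    # seed word from dominating the score (a PMI-style correction).
    return math.log2((pos * hits(neg_seed)) / (neg * hits(pos_seed)))
```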
Baron and Hirst (2004) extracted collocations with Xtract (Smadja, 1993) and classified the collocations using the orientations of the words in the neighboring sentences. Their method is similar to Turney's in the sense that co-occurrence with seed words is used. The three methods above are based on context information; in contrast, our method exploits the internal structure of the semantic orientations of phrases.</Paragraph>
<Paragraph position="3"> Inui (2004) introduced a plus/minus attribute for each word and proposed several rules that determine the semantic orientation of a phrase on the basis of the plus/minus attribute values and the positive/negative attribute values of its component words. For example, the rule [negative + minus = positive] determines &quot;low (minus) risk (negative)&quot; to be positive. Wilson et al. (2005) worked on phrase-level semantic orientations. They introduced the polarity shifter, which is almost equivalent to the plus/minus attribute above, and manually created a list of polarity shifters. The method we propose in this paper is an automatic version of Inui's and Wilson et al.'s idea, in the sense that it automatically creates word clusters and their polarity shifters.</Paragraph>
</Section>
</Section>
</Paper>
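To make the rule-based composition concrete, the following is a minimal sketch in the spirit of Inui (2004) and Wilson et al. (2005). The toy lexicons and the full rule table are illustrative assumptions; only the [negative + minus = positive] rule and the "low risk" example come from the text above.

```python
# Rule-based polarity composition in the style of Inui (2004): a word's
# positive/negative value is combined with a modifier's plus/minus value
# to yield the phrase orientation. The lexicons and table below are toy
# fragments for illustration, not the actual rule set.

POLARITY = {"risk": "negative", "quality": "positive"}  # word -> pos/neg
SHIFTER = {"low": "minus", "high": "plus"}              # modifier -> plus/minus

# (word polarity, modifier attribute) -> phrase orientation
RULES = {
    ("negative", "minus"): "positive",  # e.g., "low risk"
    ("negative", "plus"): "negative",   # e.g., "high risk"
    ("positive", "minus"): "negative",  # e.g., "low quality"
    ("positive", "plus"): "positive",   # e.g., "high quality"
}

def phrase_orientation(modifier: str, head: str) -> str:
    # Look up the component attributes and apply the composition rule.
    return RULES[(POLARITY[head], SHIFTER[modifier])]

assert phrase_orientation("low", "risk") == "positive"  # [negative + minus = positive]
```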