<?xml version="1.0" standalone="yes"?> <Paper uid="P06-1133"> <Title>Are These Documents Written from Different Perspectives? A Test of Different Perspectives Based On Statistical Distribution Divergence</Title> <Section position="4" start_page="1057" end_page="1057" type="relat"> <SectionTitle> 2 Related Work </SectionTitle> <Paragraph position="0"> There has been interest in understanding how beliefs and ideologies can be represented in computers since the mid-1960s (Abelson and Carroll, 1965; Schank and Abelson, 1977).</Paragraph> <Paragraph position="1"> The Ideology Machine (Abelson, 1973) can simulate a right-wing ideologue, and POLITICS (Carbonell, 1978) can interpret a text from a conservative or a liberal ideological standpoint. In this paper we take a statistics-based approach, in contrast to previous work, which relies heavily on manually constructed knowledge bases.</Paragraph> <Paragraph position="2"> Note that our interest is in determining whether two document collections are written from different perspectives, not in modeling individual perspectives. We aim to capture the characteristics, specifically the statistical regularities, of any pair of document collections with opposing perspectives. Given a pair of document collections A and B, our goal is not to construct classifiers that can predict whether a document was written from the perspective of A or B (Lin et al., 2006), but to determine whether the document collection pair (A,B) conveys opposing perspectives.</Paragraph> <Paragraph position="3"> There has been growing interest in subjectivity and sentiment analysis. There are studies on learning subjective language (Wiebe et al., 2004), identifying opinionated documents (Yu and Hatzivassiloglou, 2003) and sentences (Riloff et al., 2003; Riloff and Wiebe, 2003), and discriminating between positive and negative language (Turney and Littman, 2003; Pang et al., 2002; Dave et al., 2003; Nasukawa and Yi, 2003; Morinaga et al., 2002). 
There is also research on automatically classifying movie or product reviews as positive or negative (Nasukawa and Yi, 2003; Mullen and Collier, 2004; Beineke et al., 2004; Pang and Lee, 2004; Hu and Liu, 2004).</Paragraph> <Paragraph position="4"> Although we expect much of the language used to express a perspective to be, by its very nature, subjective and opinionated, the task of labeling a document or a sentence as subjective is orthogonal to the test of different perspectives. A subjectivity classifier may successfully identify all subjective sentences in the document collection pair A and B, but knowing the number of subjective sentences in A and B does not necessarily tell us whether they convey opposing perspectives. We apply the subjectivity patterns automatically extracted from foreign news documents (Riloff and Wiebe, 2003), and find that the percentages of subjective sentences in the bitterlemons corpus (see Section 4) are similar (65.6% in the Palestinian documents and 66.2% in the Israeli documents). The high but nearly equal proportions of subjective sentences in the two perspectives suggest that perspective is largely expressed in subjective language, but that the subjectivity ratio alone is not enough to tell whether two document collections are written from the same perspective (Palestinian vs. Palestinian) or from different perspectives (Palestinian vs. Israeli).</Paragraph> </Section> </Paper>