<?xml version="1.0" standalone="yes"?> <Paper uid="W06-0308"> <Title>Sydney, July 2006. ©2006 Association for Computational Linguistics Towards a validated model for affective classification of texts</Title> <Section position="8" start_page="61" end_page="61" type="concl"> <SectionTitle> 5 Conclusion and Future Work </SectionTitle> <Paragraph position="0"> In this paper, we have used a machine learning approach to show that there is a relation between the semantic content of texts and the affective state they (wish to) convey, so that a typology of affective states based on semantic association is a good description of the distribution of affect in a two-dimensional space. Using automated methods to score semantic association, we have demonstrated a method to compute semantic orientation on both dimensions, giving some insights into how to go beyond the customary 'sentiment' analysis. In the classification experiments, accuracies were always above a random baseline, although not always statistically significant. To improve the typology and the accuracies of classifiers based on it, a better calibration of the activity axis is the most pressing task. Our next steps are experiments aiming at refining the translation of scores to normalized measures, so that individual affects can be distinguished within a single quadrant. Other interesting avenues are studies investigating how well the typology can be ported to other textual data domains, the inclusion of a 'neutral' tag, and the treatment of texts with multiple affects.</Paragraph> <Paragraph position="1"> Finally, the domain of weblog posts is attractive because of the easy access to annotated data, but we have found through our experiments that the content is very noisy, annotation is not always consistent among 'bloggers', and therefore classification is difficult. 
We should not underestimate the positive effects that cleaner data, consistent tagging and access to bigger corpora would have on the accuracy of the classifier.</Paragraph> </Section> </Paper>