<?xml version="1.0" standalone="yes"?>
<Paper uid="P04-1088">
<Title>FLSA: Extending Latent Semantic Analysis with features for dialogue act classification</Title>
<Section position="2" start_page="0" end_page="0" type="intro">
<SectionTitle>1 Introduction</SectionTitle>
<Paragraph position="0">In this paper, we propose Feature Latent Semantic Analysis (FLSA) as an extension of Latent Semantic Analysis (LSA). LSA can be thought of as representing the meaning of a word as a kind of average of the meanings of all the passages in which it appears, and the meaning of a passage as a kind of average of the meanings of all the words it contains (Landauer and Dumais, 1997). It builds a semantic space in which words and passages are represented as vectors. LSA is based on Singular Value Decomposition (SVD), a mathematical technique that arranges the semantic space so as to reflect the major associative patterns in the data. LSA has been successfully applied to many tasks, such as assessing the quality of student essays (Foltz et al., 1999) or interpreting the student's input in an Intelligent Tutoring system (Wiemer-Hastings, 2001).</Paragraph>
<Paragraph position="1">A common criticism of LSA is that it uses only words and ignores everything else, e.g. syntactic information: to LSA, man bites dog is identical to dog bites man. We suggest that an LSA semantic space can be built from the co-occurrence of arbitrary textual features, not just words. We call LSA augmented with features FLSA, for Feature LSA.</Paragraph>
<Paragraph position="2">The only closely related prior work on LSA consists of Structured Latent Semantic Analysis (Wiemer-Hastings, 2001) and the predication algorithm of Kintsch (2001). We will show that for our task, dialogue act classification, syntactic features do not help, but most dialogue-related features do. Surprisingly, one dialogue-related feature that does not help is the dialogue act history.</Paragraph>
<Paragraph position="3">We applied LSA and FLSA to dialogue act classification. Dialogue systems need to perform dialogue act classification in order to understand the role the user's utterance plays in the dialogue (e.g., a question for information or a request to perform an action). In recent years, a variety of empirical techniques have been used to train dialogue act classifiers (Samuel et al., 1998; Stolcke et al., 2000). A second contribution of our work is to show that FLSA is successful at dialogue act classification, achieving results comparable to or better than other published methods. With respect to a baseline of choosing the most frequent dialogue act (DA), LSA reduces error rates by between 33% and 52%, and FLSA reduces error rates by between 60% and 78%.</Paragraph>
<Paragraph position="4">LSA is an attractive method for this task because it is straightforward to train and use. More importantly, although it is a statistical theory, it has been shown to mimic many aspects of human competence and performance (Landauer and Dumais, 1997).</Paragraph>
<Paragraph position="5">Thus, it appears to capture important components of meaning. Our results suggest that LSA and FLSA do so for DA classification as well. On MapTask, our FLSA classifier agrees with human coders to a satisfactory degree, and makes most of the same mistakes.</Paragraph>
</Section>
</Paper>
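For readers unfamiliar with the technique, the following equations sketch the truncated SVD that underlies LSA as described in the first paragraph; the notation (X, U_k, Sigma_k, V_k, k, t, d) is ours and does not come from the paper.

\[
X \;\approx\; X_k \;=\; U_k\, \Sigma_k\, V_k^{\top}, \qquad k \ll \min(t, d),
\]

where X is the t-by-d word-by-passage co-occurrence matrix, U_k (t-by-k) and V_k (d-by-k) have orthonormal columns, and \Sigma_k is the diagonal matrix of the k largest singular values. Word i is then represented by row i of U_k \Sigma_k, passage j by row j of V_k \Sigma_k, and similarity is typically measured by the cosine between such vectors. Under the FLSA proposal, rows for non-lexical features are simply appended to X before the decomposition is computed.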
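The sketch below shows, in Python, one way an FLSA space could be built and used for dialogue act classification. It is our own illustrative reconstruction under stated assumptions (plain count matrix, hypothetical feature labels such as speaker tags, a hypothetical dimensionality k, and 1-nearest-neighbour classification by cosine similarity); it is not the authors' implementation.

    # Illustrative sketch of LSA / FLSA for dialogue act classification.
    # The feature choices, dimensionality k, and the 1-NN classifier are
    # assumptions for this example, not the paper's actual setup.
    import numpy as np

    def build_matrix(utterances, extra_features=None):
        """Build a (terms + features) x utterances count matrix.
        utterances: list of token lists; extra_features: optional list of
        feature-label lists (e.g. hypothetical speaker tags), appended as
        extra 'pseudo-word' rows -- the FLSA idea."""
        rows = sorted({t for u in utterances for t in u})
        if extra_features is not None:
            rows += sorted({"FEAT:" + f for fs in extra_features for f in fs})
        index = {r: i for i, r in enumerate(rows)}
        X = np.zeros((len(rows), len(utterances)))
        for j, u in enumerate(utterances):
            feats = (["FEAT:" + f for f in extra_features[j]]
                     if extra_features is not None else [])
            for item in list(u) + feats:
                X[index[item], j] += 1.0
        return X, index

    def lsa_space(X, k):
        """Truncated SVD: keep the k largest singular triplets."""
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        return U[:, :k], s[:k], Vt[:k, :]

    def fold_in(U_k, s_k, x):
        """Project a new utterance's raw count vector into the k-dim space,
        giving coordinates comparable to the columns of Vt_k."""
        return (U_k.T @ x) / s_k

    def classify_1nn(Vt_k, train_labels, new_vec):
        """Label a new utterance with the DA of its most cosine-similar
        training utterance in the reduced space."""
        train_vecs = Vt_k.T  # one row per training utterance
        sims = train_vecs @ new_vec / (
            np.linalg.norm(train_vecs, axis=1) * np.linalg.norm(new_vec) + 1e-12)
        return train_labels[int(np.argmax(sims))]

    # Usage on hypothetical toy data:
    # X, index = build_matrix(train_tokens, train_speaker_feats)
    # U_k, s_k, Vt_k = lsa_space(X, k=50)
    # x_new = ...  # raw count vector for a new utterance, built with the same index
    # da = classify_1nn(Vt_k, train_das, fold_in(U_k, s_k, x_new))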