<?xml version="1.0" standalone="yes"?>
<Paper uid="N03-2030">
  <Title>A Hybrid Approach to Content Analysis for Automatic Essay Grading</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> In this paper we describe CarmelTC, a novel automatic essay grading approach that uses a hybrid text classification technique to analyze essay answers to qualitative physics questions inside the Why2 tutorial dialogue system (VanLehn et al., 2002). In contrast to many previous approaches to automated essay grading (Burstein et al., 1998; Foltz et al., 1998; Larkey, 1998), our goal is not to assign a letter grade to student essays. Instead, our purpose is to tally which of a set of &quot;correct answer aspects&quot; are present in student essays. Previously, tutorial dialogue systems such as AUTO-TUTOR (Wiemer-Hastings et al., 1998) and Research Methods Tutor (Malatesta et al., 2002) have used LSA (Landauer et al., 1998) to perform the same type of content analysis for student essays that we do in Why2. While &quot;bag of words&quot; approaches such as LSA have performed successfully on the content analysis task in domains such as Computer Literacy (Wiemer-Hastings et al., 1998), they have been shown to perform poorly in causal domains such as research methods (Malatesta et al., 2002) because they base their predictions only on the words included in a text and not on the functional relationships between them. Thus, we propose CarmelTC as an alternative. CarmelTC is a rule learning text classification approach that bases its predictions both on features extracted from CARMEL's deep syntactic functional analyses of texts (Rosé, 2000) and on a &quot;bag of words&quot; classification of that text obtained from Rainbow Naive Bayes (McCallum and Nigam, 1998).</Paragraph>
    <Paragraph position="1"> This research was supported by the ONR, Cognitive Science Division under grant number N00014-0-1-0600 and by NSF grant number 9720359 to CIRCLE.</Paragraph>
    <Paragraph position="2"> We evaluate CarmelTC in the physics domain, which, like research methods, is a highly causal domain. In our evaluation we demonstrate that CarmelTC outperforms both Latent Semantic Analysis (LSA) (Landauer et al., 1998) and Rainbow Naive Bayes (McCallum and Nigam, 1998), as well as a purely symbolic approach similar to that of Fürnkranz et al. (1998). Thus, our evaluation demonstrates the advantage of combining predictions from symbolic and &quot;bag of words&quot; approaches for the content analysis aspects of automatic essay grading.</Paragraph>
  </Section>
</Paper>