<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-3003">
  <Title>Modeling Reference Interviews as a Basis for Improving Automatic QA Systems</Title>
  <Section position="7" start_page="22" end_page="22" type="concl">
    <SectionTitle>
5 Conclusions and Future Work
</SectionTitle>
    <Paragraph position="0"> The reference interview has been implemented as an interactive dialogue between the system and the user, and the full system is near completion. We are currently working on two types of evaluation of our interactive QA capabilities: a system-based evaluation in the form of unit tests and a user-based evaluation. The unit tests are designed to verify that each module works correctly and that changes to the system do not adversely affect results or performance. Crafting unit tests for complex questions has proved challenging, as no gold standard for this type of question has yet been created. As the data becomes available, this type of evaluation will be ongoing and part of regular system development.</Paragraph>
    <Paragraph position="1"> As appropriate for this evolutionary work within specific domains for which there are no gold-standard test sets, our evaluation of the QA systems has focused on qualitative assessments.</Paragraph>
    <Paragraph position="2"> A particularly interesting outcome is what we have learned through elicitation from graduate students using the NASA QA system: they evaluate a QA system along multiple dimensions, not just traditional recall and precision (Liddy et al., 2004). The high-level dimensions identified include system performance, answers, database content, display, and expectations. We therefore believe that the evaluation criteria appropriate for IQA systems center on the display (UI) category as described in Liddy et al. (2004). We will evaluate aspects of the UI input subcategory, including question understanding, information need understanding, querying style, and question formulation assistance. Based on this user evaluation, the system will be improved and retested.</Paragraph>
  </Section>
</Paper>