File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/00/a00-1025_intro.xml
Size: 1,677 bytes
Last Modified: 2025-10-06 14:00:43
<?xml version="1.0" standalone="yes"?> <Paper uid="A00-1025"> <Title>Examining the Role of Statistical and Linguistic Knowledge Sources in a General-Knowledge Question-Answering System</Title> <Section position="3" start_page="180" end_page="181" type="intro"> <SectionTitle> 2 System Architecture </SectionTitle> <Paragraph position="0"> The basic architecture of the question-answering system is depicted in Figure 1. It contains two main components: the IR subsystem and the linguistic filters. As a preliminary, offline step, the IR subsystem first indexes the text collection from which answers are to be extracted. Given a question, the goal of the IR component is then to return a ranked list of those text chunks (e.g., documents, sentences, or paragraphs) from the indexed collection that are most relevant to the query and from which answer hypotheses can be extracted. Next, the QA system optionally applies one or more linguistic filters to the text chunks to extract an ordered list of answer hypotheses. The top hypotheses are concatenated to form five 50-byte guesses, as allowed by the TREC8 guidelines. Note that many of these guesses may be difficult to read and may be judged incorrect by the TREC8 assessors; we therefore also describe the results of generating single phrases as guesses wherever possible.</Paragraph> <Paragraph position="1"> In the sections below, we present and evaluate a series of instantiations of this general architecture, each of which makes different assumptions regarding the type of information that will best support the QA task. The next section begins by describing the baseline QA system.</Paragraph> </Section> </Paper>