File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/02/w02-1906_abstr.xml
Size: 2,980 bytes
Last Modified: 2025-10-06 13:42:42
<?xml version="1.0" standalone="yes"?> <Paper uid="W02-1906"> <Title>Passage Selection to Improve Question Answering</Title> <Section position="2" start_page="0" end_page="1" type="abstr"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Information Retrieval (IR) systems receive a user's query as input and return a set of documents sorted by their relevance to that query. There are different techniques for carrying out the document extraction process, but most of them are based on pattern matching modules that depend on the number of times that a query term appears in each document, as well as on the importance or discrimination value of each term in the document collection. Question Answering (QA) systems try to improve the output generated by IR systems by returning only small pieces of text that are expected to contain the answer. Usually, QA systems combine IR and Natural Language Processing (NLP) techniques to perform their task. This combination allows the text to be understood to the minimum level needed for precise answer detection and extraction. Nevertheless, since NLP techniques are computationally expensive, QA systems need to reduce the amount of text to which these techniques are applied. For this reason, they usually work on the output of IR systems [10], which select the documents most relevant to the query on the assumption that these will contain the required answer. The most widely applied IR systems are based on one of three models: the cosine model [15], the pivoted cosine model [17], and the probabilistic model (OKAPI [18]).</Paragraph> <Paragraph position="1"> Moreover, IR systems usually employ query expansion techniques that frequently improve their precision. 
These techniques can be based on a thesaurus [21] or on the incorporation of the most frequent terms in the top M relevant documents [7].</Paragraph> <Paragraph position="2"> Currently, several Passage Retrieval (PR) systems have also been proposed for this task [2][5][8][9]. PR systems deal with fragments of text in order to determine the relevance of a document to a query, as well as to detect document extracts (instead of full documents) that are likely to contain the expected answer.</Paragraph> <Paragraph position="3"> Although PR systems apply IR-based techniques to perform their work, they have proved to be more effective than IR systems for QA tasks.</Paragraph> <Paragraph position="4"> In this paper, we analyse the importance of the IR-n PR system for QA [11] as it was used in the last TREC-10 Conference [19]. The following section briefly presents the background in IR, PR and QA. Section 3 shows the architecture of IR-n. Section 4 presents the evaluation carried out and, finally, Section 5 details conclusions and work in progress.</Paragraph> <Paragraph position="5"> The pivoted cosine model is a modification of the cosine model that tries to reduce the preference for longer documents.</Paragraph> </Section> </Paper>