<?xml version="1.0" standalone="yes"?>
<Paper uid="E06-1050">
<Title>A Probabilistic Answer Type Model</Title>
<Section position="3" start_page="393" end_page="393" type="relat">
<SectionTitle>
2 Related Work
</SectionTitle>
<Paragraph position="0"> Light et al. (2001) analyzed the effect of multiple answer type occurrences in a sentence. When multiple words of the same type appear in a sentence, answer typing with fixed types must assign each of them the same score. Light et al. found that even with perfect answer sentence identification, question typing, and semantic tagging, a system could achieve only 59% accuracy on the TREC-9 questions when using their set of 24 non-overlapping answer types. By directly computing the probability of an answer candidate occurring in the question contexts, we avoid having multiple candidates with the same level of appropriateness as answers.</Paragraph>
<Paragraph position="1"> There have been a variety of approaches to determining answer types, which are also known as Qtargets (Echihabi et al., 2003). Most previous approaches classify the answer type of a question as one of a set of predefined types.</Paragraph>
<Paragraph position="2"> Many systems construct the classification rules manually (Cui et al., 2004; Greenwood, 2004; Hermjakob, 2001). The rules are usually triggered by the presence of certain words in the question.</Paragraph>
<Paragraph position="3"> For example, if a question contains &quot;author&quot;, then the expected answer type is Person.</Paragraph>
<Paragraph position="4"> The number of answer types as well as the number of rules can vary a great deal. For example, Hermjakob (2001) used 276 rules for 122 answer types, whereas Greenwood (2004) used 46 answer types with an unspecified number of rules. A sketch of such keyword-triggered rules appears below.</Paragraph>
<Paragraph position="5"> The classification rules can also be acquired with supervised learning. Ittycheriah et al. (2001) describe a maximum entropy-based question classification scheme that classifies each question as having one of the MUC answer types. In a similar experiment, Li & Roth (2002) train a question classifier based on a modified version of SNoW, using a richer set of answer types than Ittycheriah et al.</Paragraph>
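The keyword-triggered rules described above are straightforward to express in code. The following is a minimal sketch; the rule table and type inventory are illustrative inventions, not the actual rule sets of Cui et al. (2004), Greenwood (2004), or Hermjakob (2001).

```python
# Minimal sketch of a keyword-triggered answer type classifier, in the
# style of the manually constructed rule systems discussed above.
# The rule table is hypothetical; rules are checked in order, first match wins.
RULES = [
    ({"author", "wrote", "writer"}, "Person"),
    ({"when", "year", "date"}, "Date"),
    ({"where", "city", "country"}, "Location"),
    ({"many", "much"}, "Quantity"),
]

def classify_question(question):
    """Return the answer type of the first rule with a trigger word in the question."""
    words = set(question.lower().split())
    for triggers, answer_type in RULES:
        if triggers & words:  # at least one trigger word is present
            return answer_type
    return "Miscellaneous"  # fall-back type, which often exhibits poor performance

print(classify_question("Who is the author of Hamlet?"))  # -> Person
```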
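The supervised alternative can likewise be sketched with an off-the-shelf learner. Below, scikit-learn's logistic regression (a maximum entropy model) over bag-of-words features stands in for the classifiers of Ittycheriah et al. (2001) and Li & Roth (2002); the training questions and answer types are hypothetical, and neither cited system is reproduced exactly.

```python
# Sketch of supervised question classification over hypothetical data.
# Logistic regression on bag-of-words features is a stand-in for the
# maximum entropy and SNoW classifiers cited above, not either system.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_questions = [
    "Who wrote War and Peace?",
    "When was the Eiffel Tower built?",
    "Where is Mount Kilimanjaro?",
    "How many moons does Mars have?",
]
train_types = ["Person", "Date", "Location", "Quantity"]

# Vectorize the question words, then fit a multiclass classifier.
clf = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(train_questions, train_types)

print(clf.predict(["Who painted the Mona Lisa?"]))  # e.g. ['Person']
```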
<Paragraph position="6"> The LCC system (Harabagiu et al., 2003) combines fixed types with a novel loop-back strategy. In the event that a question cannot be classified as one of the fixed entity types or semantic concepts derived from WordNet (Fellbaum, 1998), the answer type model backs off to a logic prover that uses axioms derived from WordNet, along with logic rules, to justify phrases as answers. Thus, the LCC system is able to avoid the use of a miscellaneous type, which often exhibits poor performance. However, the logic prover must have sufficient evidence to link the question to the answer, and general knowledge must be encoded as axioms in the system. In contrast, our answer type model derives all of its information automatically from unannotated text.</Paragraph>
<Paragraph position="7"> Answer types are often used as filters. Radev et al. (2002) noted that a wrong guess about the answer type reduces the chance of the system answering the question correctly by as much as a factor of 17. The approach presented here is less brittle. Even if the correct candidate does not have the highest likelihood according to the model, it may still be selected when the answer extraction module takes into account other factors, such as proximity to the matched keywords.</Paragraph>
<Paragraph position="8"> Furthermore, a probabilistic model makes it easier to integrate the answer type scores with the scores computed by other components of a question answering system in a principled fashion.</Paragraph>
</Section>
</Paper>