File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/03/n03-2029_concl.xml
Size: 1,894 bytes
Last Modified: 2025-10-06 13:53:30
<?xml version="1.0" standalone="yes"?> <Paper uid="N03-2029"> <Title>Automatic Derivation of Surface Text Patterns for a Maximum Entropy Based Question Answering System</Title> <Section position="7" start_page="0" end_page="0" type="concl"> <SectionTitle> 6 Conclusion and Future Work </SectionTitle> <Paragraph position="0"> Not surprisingly, the PAT ONLY system shows only average performance compared to other TREC-10 systems. This is because the system has no information about the question other than its expected answer type. Hence, the PAT ONLY system would answer all questions involving TIME, such as &quot;When was A born?&quot;, &quot;When did A die?&quot;, &quot;Which year did A start attending college?&quot;, and &quot;When did A author book B?&quot;, with the same answer! Nonetheless, the ME PAT results show that surface text patterns are useful for a Question Answering System.</Paragraph> <Paragraph position="1"> Although in these experiments a feature set of 22,353 patterns was trained on approximately 210,000 instances, only 1,500 patterns with a count of at least 8 instances were actually found in the final training data. This suggests that the approach used here to train weights suffers from having very little training data compared to the number of features. A much better approach would be to train the weights of the patterns from the unsupervised collection itself. However, the effect of the noise introduced by such unsupervised training is unclear. The above technique represents a very clean approach to integrating the use of patterns into a QA system. Most rule-based systems take years to engineer and are very difficult to duplicate. However, a good statistical system can be duplicated to give good performance in a relatively short amount of time.</Paragraph> </Section> </Paper>