<?xml version="1.0" standalone="yes"?>
<Paper uid="P04-1016">
<Title>Convolution Kernels with Feature Selection for Natural Language Processing Tasks</Title>
<Section position="9" start_page="0" end_page="0" type="evalu">
<SectionTitle> 7 Results and Discussion </SectionTitle>
<Paragraph position="0"> Tables 3 and 4 show the results of Japanese and English question classification, respectively, and Table 5 shows the results of sentence modality identification. In each table, n indicates the threshold on sub-sequence size; n = ∞ means that all possible sub-sequences are used.</Paragraph>
<Paragraph position="1"> First, SK was consistently superior to BOW-K. This indicates that structural features are highly effective for these tasks. In general, structural features can improve performance on NLP tasks that depend on the detailed content of the input.</Paragraph>
<Paragraph position="2"> Most of the results showed that SK achieved its maximum performance at n = 2, and that performance deteriorated considerably once n exceeded 4. This implies that the larger sub-structures used by SK degrade classification performance, the same tendency observed in the previous studies discussed in Section 3. Table 6 shows the precision and recall of SK at n = ∞: the classifier offered high precision but low recall, which is evidence of over-fitting during learning.</Paragraph>
<Paragraph position="3"> As the above experiments show, FSSK provided consistently better performance than the conventional methods. Moreover, the experiments confirmed one important fact: in some cases, maximum performance was achieved at n = ∞. This indicates that sub-sequences created from very large structures can be extremely effective. Of course, a larger feature space subsumes the smaller ones, since every sub-sequence available at threshold n is also available at threshold n + 1; if performance improves with a larger n, significant features must exist among the larger sub-structures. Thus, we can improve performance on some classification problems by handling larger sub-structures. Even when optimum performance was not achieved at n = ∞, the differences in performance across the smaller values of n were quite small compared to those of SK. This indicates that our method is very robust with respect to sub-structure size, so it is unnecessary to tune the sub-structure size carefully. Our approach of exploiting large sub-structures is therefore preferable to the conventional approach of eliminating sub-sequences by size.</Paragraph>
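<Paragraph position="4"> To make the role of the threshold n concrete, the following is a minimal sketch of one standard dynamic-programming formulation of a gap-weighted subsequence kernel (Lodhi et al., 2002), given here only as an illustration; the function name, the decay parameter lam, and the choice of Python are our assumptions, not the implementation used in these experiments. The function returns the contribution of each sub-sequence size p = 1, ..., n to the kernel value, so summing the first n entries corresponds to thresholding at n, and letting n reach the full sequence length corresponds to n = ∞.

    def subsequence_kernel(s, t, n, lam=0.5):
        """Gap-weighted subsequence kernel (Lodhi et al., 2002).

        Returns [K_1(s, t), ..., K_n(s, t)], where K_p counts the
        common sub-sequences of size p, each matched occurrence
        weighted by lam raised to the total span it covers in s and t.
        """
        # Kprime[i][j] holds K'_{p-1}(s[:i], t[:j]); the base case
        # K'_0 is 1 everywhere.
        Kprime = [[1.0] * (len(t) + 1) for _ in range(len(s) + 1)]
        kernels = []
        for p in range(1, n + 1):
            K = 0.0
            new_Kprime = [[0.0] * (len(t) + 1) for _ in range(len(s) + 1)]
            for i in range(1, len(s) + 1):
                # Ksum accumulates K''_p(s[:i], t[:j]) incrementally over j.
                Ksum = 0.0
                for j in range(1, len(t) + 1):
                    Ksum = lam * Ksum
                    if s[i - 1] == t[j - 1]:
                        Ksum += lam * lam * Kprime[i - 1][j - 1]
                        K += lam * lam * Kprime[i - 1][j - 1]
                    new_Kprime[i][j] = lam * new_Kprime[i - 1][j] + Ksum
            Kprime = new_Kprime
            kernels.append(K)
        return kernels

    # Example: per-size contributions for two symbol sequences. A small
    # threshold keeps only short sub-sequences, mirroring SK at n = 2.
    contributions = subsequence_kernel("abcd", "abd", n=3)
    sk_n2 = sum(contributions[:2])  # kernel value under threshold n = 2

On top of such a kernel, FSSK selects significant sub-sequence features rather than truncating by size, which is the contrast drawn above. </Paragraph>
</Section>
</Paper>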