File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/evalu/04/c04-1201_evalu.xml
Size: 6,081 bytes
Last Modified: 2025-10-06 13:59:09
<?xml version="1.0" standalone="yes"?> <Paper uid="C04-1201"> <Title>A Language Independent Method for Question Classification</Title> <Section position="5" start_page="0" end_page="0" type="evalu"> <SectionTitle> 4 Experimental Evaluation </SectionTitle> <Paragraph position="0"/>
<Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.1 Data sets </SectionTitle>
<Paragraph position="0"> The data set used in this work consists of the questions provided in the DISEQuA Corpus (Magnini et al., 2003). This corpus is made up of simple, mostly short, straightforward and factual queries that sound naturally spontaneous and arise from a real desire to know something about a particular event or situation.</Paragraph>
<Paragraph position="1"> The DISEQuA Corpus contains 450 questions, each one formulated in four languages: Dutch, English, Italian and Spanish. The questions are classified into seven categories: Person, Organization, Measure, Date, Object, Other and Place. The experiments performed in this work used the English, Italian and Spanish versions of these questions.</Paragraph> </Section>
<Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.2 Experiments </SectionTitle>
<Paragraph position="0"> In the experiments performed in this work we used 10-fold cross-validation, which consists of randomly dividing the data into 10 equally sized subgroups and performing 10 different runs: nine groups, together with their original classes, form the training set, while the remaining group is used as the test set. Each experiment consists of ten such runs, and the results reported here are the overall averages.</Paragraph>
<Paragraph position="1"> In our experiments we used the WEKA implementation of SVM (Witten and Frank, 1999).</Paragraph>
<Paragraph position="2"> In this setting, multi-class problems are solved using pairwise classification. The optimization algorithm used for training the support vector classifier is an implementation of Platt's sequential minimal optimization algorithm (Platt, 1999). The kernel function used for mapping the input space was a polynomial of exponent one, i.e., effectively a linear kernel.</Paragraph>
<Paragraph position="3"> The most common approach to question classification is bag-of-words, so we decided to compare the results of using bag-of-words attributes against using just prefixes of the words in the questions. In order to choose an appropriate prefix size we computed the average word length in the three languages used in this work: 4.62 for English, 4.8 for Italian and 4.75 for Spanish. We therefore decided to experiment with prefixes of size 4 and 5. Table 3 compares the classification accuracy obtained by training the SVM using all the words in the questions, using prefixes of size 4 and 5, and using only the Internet-based attributes. For English the best results were obtained when using words as attributes, although the difference between using just prefixes and using whole words is not large. For Spanish, however, the best results were achieved when using prefixes of size 5. This may be due to the fact that some interrogative words that by themselves can define the semantic class of a question in this language, such as Cuándo (When) and Cuánto (How much), are collapsed into the same prefix of size 4, i.e., Cuán. If we instead consider prefixes of size 5, these two words yield two different prefixes, Cuánd and Cuánt, thus reducing the loss of information compared to prefixes of size 4. For Italian the best results were obtained using prefixes of size 4. For all three languages the Internet-based attributes alone gave rather low accuracies, the lowest being for Italian. When we analyzed the results computed for Italian using our Internet-based attributes, we realized that in many cases we could not obtain any results for the queries. One plausible explanation for this lack of information is that the number of Italian documents available on the Internet is much smaller than for English and Spanish.</Paragraph>
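The setup just described (word-prefix attributes, an SVM with a polynomial kernel of exponent one trained via pairwise classification, and 10-fold cross-validation) can be sketched roughly as follows. This is only a minimal illustration, not the authors' implementation: the paper uses WEKA's SMO, whereas the sketch substitutes scikit-learn, and load_disequa_questions is a hypothetical helper assumed to return the questions of one language together with their category labels.

    # Minimal sketch of the setup described above (assumptions noted in the
    # lead-in); scikit-learn stands in for the WEKA SMO implementation.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import SVC

    def prefix_tokenizer(question, size=5):
        # Truncate every word to its first `size` characters; at size 4 the
        # Spanish interrogatives Cuándo and Cuánto collapse to the same
        # prefix (Cuán), while at size 5 they remain distinct.
        return [word[:size] for word in question.split()]

    # Hypothetical loader for one language of the DISEQuA questions.
    questions, labels = load_disequa_questions("disequa_es.txt")

    model = make_pipeline(
        # Bag-of-prefixes representation; swapping in a plain word tokenizer
        # gives the bag-of-words baseline.
        CountVectorizer(tokenizer=lambda q: prefix_tokenizer(q, size=5),
                        lowercase=True),
        # Polynomial kernel of exponent one (effectively linear); multi-class
        # problems handled by pairwise (one-vs-one) classification.
        SVC(kernel="poly", degree=1, decision_function_shape="ovo"),
    )

    # 10-fold cross-validation: ten runs, each holding out one tenth of the
    # data for testing; the reported figure is the average accuracy.
    scores = cross_val_score(model, questions, labels, cv=10)
    print("mean accuracy: %.3f" % scores.mean())

In WEKA itself, the corresponding configuration would be the SMO classifier with a PolyKernel of exponent 1.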
<Paragraph position="4"> Estimates reported in (Kilgarriff and Grefenstette, 2003) show that the web size in words for Italian is 1,845,026,000, while for English and Spanish the web sizes are 76,598,718,000 and 2,658,631,000 respectively. Thus our method was not able to extract as much information for Italian as for the other two languages.</Paragraph> </Section>
<Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.3 Combining Internet-based Attributes with Lexical Features </SectionTitle>
<Paragraph position="0"> The results presented in the previous subsection show that by using just lexical information we can train an SVM and achieve high accuracies in the three languages. However, our goal is to assess the usefulness of the Internet as a source of attributes for question classification. We therefore performed further experiments combining the lexical attributes with the Internet-based information in order to determine whether accuracy can be improved further. Table 4 shows the experimental results of this attribute combination and Figure 1 shows a graphical representation of these results.</Paragraph>
<Paragraph position="1"> It is interesting to note that for English and Spanish we gained accuracy when using the Internet features in all cases. In contrast, for Italian the classification accuracy decreased when Internet-based attributes were added to words and to prefixes of size 5. We believe this drop in accuracy for Italian may be due to the weakly supported information extracted from the Internet: Table 3 shows that the SVM trained only on the coefficients obtained from the Internet performed worse for Italian. It is therefore not surprising that adding this rather sparse information to the lexical attributes for Italian did not improve the classifiers' performance.</Paragraph> </Section> </Section> </Paper>
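As a rough illustration of the attribute combination evaluated in Section 4.3, the sketch below simply concatenates the lexical prefix attributes with the Internet-based coefficients before training the same classifier. It is not the authors' implementation: scikit-learn again stands in for WEKA, compute_internet_coefficients is a hypothetical placeholder for the web-derived attributes described in the paper, and fitting the vectorizer on the full data set is a simplification of the cross-validation protocol.

    # Sketch of combining lexical and Internet-based attributes (Section 4.3),
    # under the assumptions stated in the lead-in above.
    import numpy as np
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    # Hypothetical loader, as in the previous sketch.
    questions, labels = load_disequa_questions("disequa_es.txt")

    # Lexical part: bag of word prefixes of size 5, as a dense matrix.
    lexical = CountVectorizer(
        tokenizer=lambda q: [w[:5] for w in q.split()], lowercase=True
    ).fit_transform(questions).toarray()

    # Internet part: one row of web-derived coefficients per question
    # (hypothetical placeholder for the attributes described in the paper).
    internet = np.array([compute_internet_coefficients(q) for q in questions])

    # Combined representation: lexical columns followed by Internet columns.
    combined = np.hstack([lexical, internet])

    clf = SVC(kernel="poly", degree=1, decision_function_shape="ovo")
    scores = cross_val_score(clf, combined, labels, cv=10)
    print("combined attributes, 10-fold CV accuracy: %.3f" % scores.mean())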