<?xml version="1.0" standalone="yes"?> <Paper uid="W04-0828"> <Title>TALP System for the English Lexical Sample Task</Title> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 Experimental Setting </SectionTitle> <Paragraph position="0"> For each binarization approach, we performed a feature selection process consisting of two consecutive steps: POS feature selection: Using the Senseval-2 corpus, an exhaustive selection of the best set of features for each particular Part-of-Speech was performed. These feature sets were taken as the initial sets in the feature selection process of Senseval-3.</Paragraph> <Paragraph position="1"> Word feature selection: We applied a forward(selection)-backward(deletion) two-step procedure to obtain the best feature selection per word. For each word, the process starts with the best feature set obtained in the previous step according to its Part-of-Speech. During selection, we consider the features not selected in the POS step, adding every feature that produces some improvement. During deletion, we consider only the features selected in the POS step, removing every feature whose removal produces some improvement. Although this addition-deletion procedure could be iterated until no further improvement is achieved, we performed only a single iteration because of the computational overhead. In one brief experiment (not reported here) with the one-vs-all approach, accuracy increased by 2.63% after the first iteration and by 0.52% after a second one. 
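A minimal sketch of the addition-deletion pass described above; the `evaluate` scorer and the feature names are hypothetical stand-ins for the per-word cross-validation accuracy and the actual feature inventory:

```python
def select_features(pos_best, all_features, evaluate):
    """One addition-deletion pass starting from the POS-level best set.

    pos_best: feature set selected at the Part-of-Speech level.
    all_features: full feature inventory for the word.
    evaluate: callable returning an accuracy score for a feature set
              (a hypothetical stand-in for 5-fold CV on the word).
    """
    selected = set(pos_best)
    # Selection step: try features NOT chosen at the POS level,
    # keeping each one that improves the score.
    for f in sorted(all_features - selected):
        if evaluate(selected | {f}) > evaluate(selected):
            selected.add(f)
    # Deletion step: try removing features chosen at the POS level,
    # dropping each one whose removal improves the score.
    for f in sorted(set(pos_best)):
        if f in selected and evaluate(selected - {f}) > evaluate(selected):
            selected.remove(f)
    return selected
```

A second iteration of the same pass would simply call `select_features` again with the returned set as the new starting point.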
The first iteration improved the accuracy of 53 words, while the second improved only 15.</Paragraph> <Paragraph position="2"> For these 15 words, the increase in accuracy was 2.06% for the first iteration and 1.68% for the second.</Paragraph> <Paragraph position="3"> These results suggest that accuracy could be further increased by iterating this procedure.</Paragraph> <Paragraph position="4"> The result of this process is the selection of the best binarization approach and the best feature set for each individual word.</Paragraph> <Paragraph position="5"> Regarding feature selection, we inspected the attributes selected for all the words and observed that features of all four types appear among them. The most frequently selected features are the local ones, in particular 'first noun/adjective on the left/right'; among topical features, 'comb' and, to a lesser extent, 'topic'; among knowledge-based features, 'sumo' and 'domains labels'; and among syntactic features, 'Yarowsky's patterns'. All the features mentioned above were selected for at least 50 of the 57 Senseval-3 words. Even so, keeping all features available is useful when a selection procedure is applied: these generally useful features do not work well for every word, and some words rely on the less frequently selected ones; in short, every word is a different problem. Regarding the implementation details of the system, we used SVMlight (Joachims, 2002), a very robust and complete implementation of Support Vector Machine learning algorithms, which is freely available for research purposes. A simple linear kernel with a regularization value of C = 0.1 was applied. This parameter was chosen empirically on the basis of our previous experiments on the Senseval-2 corpus; additional tests using non-linear kernels did not provide better results. 
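The setup just described can be sketched as follows, using scikit-learn's LinearSVC as a stand-in for SVMlight (the paper used SVMlight itself; LinearSVC trains one binary linear classifier per class in one-vs-rest fashion, matching the one-vs-all binarization):

```python
from sklearn.svm import LinearSVC

def train_ova(X, y):
    """Train a one-vs-all linear SVM with the regularization value
    reported above (C = 0.1). X: feature vectors, y: sense labels."""
    clf = LinearSVC(C=0.1)  # linear kernel, one-vs-rest by default
    clf.fit(X, y)
    return clf
```

In practice the feature vectors here would be the sparse binary encodings of the selected local, topical, knowledge-based, and syntactic features for each word.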
The per-word selection of the best feature set and binarization scheme described above was performed using a 5-fold cross-validation procedure on the Senseval-3 training set. The five partitions of the training set were built so as to preserve, as much as possible, the initial distribution of examples per sense.</Paragraph> <Paragraph position="6"> After several experiments treating the 'U' label as an additional regular class, we found that we obtained better results by simply ignoring it. Thus, if a training example was tagged only with this label, it was removed from the training set; if it was tagged with this label and others, only the 'U' label was removed from the learning example.</Paragraph> <Paragraph position="7"> As a consequence, the TALP system does not assign 'U' labels to the test examples.</Paragraph> <Paragraph position="8"> Due to lack of time, the TALP system presented at competition time did not include a complete model selection for the constraint classification binarization setting. More precisely, 14 words were processed within the complete model selection framework, and 43 were adjusted with a fixed one-vs-all approach but with complete feature selection. After the competition was closed, we implemented the constraint classification setting more efficiently and reprocessed the data. Section 5 shows the results of both variants.</Paragraph> <Paragraph position="9"> A rough estimate of the complete model selection time for both approaches is the following: training took about 12 hours (OVA setting) and 5 days (CC setting) to complete, suggesting that the main drawback of these approaches is their computational overhead. Fortunately, the processing time can easily be reduced: the CC layer could be ported from Perl to C++, and the model selection could easily be parallelized, since each word is treated independently.</Paragraph> </Section></Paper>
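The sense-stratified partitioning used for the 5-fold cross validation can be sketched as below (a simple round-robin scheme, assuming the goal stated in the text: each fold approximately preserves the original distribution of examples per sense):

```python
from collections import defaultdict

def stratified_folds(labels, k=5):
    """Split example indices into k folds, dealing the examples of each
    sense round-robin so every fold keeps roughly the same sense
    distribution as the full training set."""
    by_sense = defaultdict(list)
    for idx, sense in enumerate(labels):
        by_sense[sense].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_sense.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds
```

For example, with 5 examples of one sense and 10 of another, each of the 5 folds receives exactly 1 example of the first sense and 2 of the second.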