<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-0813">
  <Title>The Basque Country University system: English and Basque tasks</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Our group (BCU, Basque Country University) participated in the Basque and English lexical sample tasks in Senseval-3. We applied four different learning algorithms (Decision Lists, Naive Bayes, Vector Space Model, and Support Vector Machines), as well as a method that combined their outputs. These algorithms had previously been tested and tuned on the Senseval-2 data for English. Before submission, the performance of the methods was tested for each task on the Senseval-3 training data using 10-fold cross-validation. Finally, two systems were submitted for each language: the best single algorithm and the best ensemble in cross-validation.</Paragraph>
    <Paragraph position="1"> The main difference between the Basque and English systems was the feature set. A rich set of features was used for English, including syntactic dependencies and domain information, extracted with different tools and from external resources such as WordNet Domains (Magnini and Cavaglià, 2000). The features for Basque were different, because Basque is an agglutinative language in which syntactic information is conveyed by inflectional suffixes. We tried to represent this information in local features, relying on the output of a deep morphological analyzer developed in our group (Aduriz et al., 2000).</Paragraph>
    <Paragraph position="2"> To improve the performance of the algorithms, different smoothing techniques were tested on the English Senseval-2 lexical sample data (Agirre and Martinez, 2004) and then applied to Senseval-3. These methods helped to obtain better estimates for the features and to avoid the problem of zero counts in Decision Lists and Naive Bayes.</Paragraph>
    <Paragraph position="3"> This paper is organized as follows. Section 2 introduces the learning algorithms, and Section 3 describes the features applied to each task. Section 4 presents the experiments performed on training data before submission, together with the final configuration of each algorithm and the performance obtained on training data. Finally, the official Senseval-3 results are presented and discussed in Section 5.</Paragraph>
  </Section>
</Paper>