<?xml version="1.0" standalone="yes"?>
<Paper uid="W00-1322">
  <Title>An Empirical Study of the Domain Dependence of Supervised Word Sense Disambiguation Systems*</Title>
  <Section position="8" start_page="177" end_page="178" type="concl">
    <SectionTitle>
5 Conclusions and Further Work
</SectionTitle>
    <Paragraph position="0"> This work has pointed out some difficulties regarding the portability of supervised WSD systems, an important issue that has received little attention to date.</Paragraph>
    <Paragraph position="1"> According to our experiments, it seems that the performance of supervised sense taggers is not guaranteed when moving from one domain to another (e.g. from a balanced corpus, such as BC, to an economic domain, such as WSJ).</Paragraph>
    <Paragraph position="2"> These results imply that some kind of adaptation is required for cross-corpus application. Furthermore, these results contradict the idea of &quot;robust broad-coverage WSD&quot; introduced by (Ng, 1997b), in which a supervised system trained on a large enough corpus (say, a thousand examples per word) should provide accurate disambiguation on any corpus (or, at least, significantly better than MFS).</Paragraph>
    <Paragraph position="3"> Consequently, it is our belief that a number of issues regarding portability, tuning, knowledge acquisition, etc., should be thoroughly studied before stating that the supervised ML paradigm is able to solve a realistic WSD problem.</Paragraph>
    <Paragraph position="4"> Regarding the ML algorithms tested, the contribution of this work consists of empirically demonstrating that the LazyBoosting algorithm outperforms three other state-of-the-art supervised ML methods for WSD. Furthermore, this algorithm is shown to have better properties when applied to new domains.</Paragraph>
    <Paragraph position="5"> Further work is planned in the following directions: * Extensively evaluate LazyBoosting on the WSD task. This would include taking into account additional/alternative attributes and testing the algorithm on other corpora, especially on sense-tagged corpora automatically obtained from the Internet or from large text collections using non-supervised methods (Leacock et al., 1998; Mihalcea and Moldovan, 1999).</Paragraph>
    <Paragraph position="6"> * Since most of the knowledge learned from a domain is not useful when changing to a new domain, further investigation is needed on tuning strategies, especially on those using non-supervised algorithms.</Paragraph>
    <Paragraph position="7"> * It is known that mislabelled examples resulting from annotation errors tend to be hard to classify correctly and, therefore, tend to have large weights in the final distribution. This observation makes it possible both to identify the noisy examples and to use LazyBoosting as a way to improve data quality. Preliminary experiments have already been carried out in this direction on the DSO corpus.</Paragraph>
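The noise-detection idea above can be illustrated with a minimal sketch. It is not the paper's LazyBoosting implementation: it uses plain AdaBoost-style reweighting on a hypothetical 1-D toy dataset (true concept y = +1 iff x &gt; 5, with one label deliberately flipped), and for clarity the weak hypothesis is fixed to the true concept rather than learned. The point is only that an example whose label disagrees with what the weak hypotheses predict accumulates a large weight, flagging it as a noise candidate.

```python
import math

# Hypothetical toy data: true concept y = +1 iff x > 5.
xs = list(range(1, 11))
ys = [1 if x > 5 else -1 for x in xs]
noisy = 2      # index of x = 3
ys[noisy] = 1  # deliberately mislabelled example

def weak(x):
    # Weak hypothesis fixed to the true concept (a simplification;
    # a real booster would select a weak rule each round).
    return 1 if x > 5 else -1

# AdaBoost-style weight updates over a few rounds.
w = [1.0 / len(xs)] * len(xs)
for _ in range(10):
    preds = [weak(x) for x in xs]
    err = sum(wi for wi, p, y in zip(w, preds, ys) if p != y)
    alpha = 0.5 * math.log((1 - err) / err)
    w = [wi * math.exp(-alpha * y * p)
         for wi, y, p in zip(w, ys, preds)]
    z = sum(w)
    w = [wi / z for wi in w]  # renormalise the distribution

# The mislabelled example ends up carrying the largest weight,
# so inspecting the heaviest examples surfaces annotation errors.
suspect = max(range(len(w)), key=lambda i: w[i])
assert suspect == noisy
```

In this toy run the flipped example absorbs half of the total weight mass after the first round, while every correctly labelled example shrinks toward zero, which is the signal the paragraph above proposes to exploit for data cleaning.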
    <Paragraph position="8"> * Moreover, the inspection of the rules learned by LazyBoosting could provide evidence about similar behaviour of a priori different senses. This type of knowledge could be useful for clustering too fine-grained or artificial senses.</Paragraph>
  </Section>
</Paper>