<?xml version="1.0" standalone="yes"?> <Paper uid="N06-2020"> <Title>Evaluation of Utility of LSA for Word Sense Discrimination</Title> <Section position="5" start_page="78" end_page="79" type="concl"> <SectionTitle> 4 Summary and Discussion </SectionTitle> <Paragraph position="0"> In this paper we reported on the first in a series of experiments aimed at examining the sense discrimination utility of LSA-based vector representations of ambiguous words' contexts. Our evaluation of average silhouette values indicates that sense-based clusters in the latent semantic space are not very tight (their silhouette values are mostly positive, but close to zero). However, they are separated enough to result in sense discrimination accuracy significantly higher than the baseline. We also found that the cosine distance measure outperforms L1 and L2, and that dimensionality reduction for sense-based clusters does not improve the sense discrimination accuracy.</Paragraph> <Paragraph position="1"> Figure 2: Average silhouette values. The clustering examined in this paper is based on pre-established word sense labels, and the measured accuracy constitutes an upper bound on the sense discrimination accuracy that can be obtained by unsupervised clustering such as EM or segmental K-means. In the next phase of this investigation we plan to do a similar evaluation for clustering obtained without supervision by running the K-means algorithm on the same corpus. Since such clustering is based on geometric properties of word vectors, we expect it to be tighter, as measured by the average silhouette value, but at the same time to yield lower sense discrimination accuracy.</Paragraph> <Paragraph position="2"> The experiments reported here are based on an LSA representation computed using the whole document as the context for the ambiguous word. 
In the future we plan to investigate the influence of the context size on sense discrimination performance.</Paragraph> </Section> </Paper>