<?xml version="1.0" standalone="yes"?>
<Paper uid="N06-2020">
  <Title>Evaluation of Utility of LSA for Word Sense Discrimination</Title>
  <Section position="4" start_page="77" end_page="78" type="evalu">
    <SectionTitle>
3 Results
</SectionTitle>
    <Paragraph position="0"> We used the line-hard-serve-interest corpus(Leacock et al, 1993), with 1151 instances for 3 noun senses of word &amp;quot;Line&amp;quot;: cord - 373, division 374, and text - 404; 752 instances for 2 adjective senses of word &amp;quot;Hard&amp;quot;: difficult - 376, not yielding to pressure or easily penetrated - 376; 1292 instances for 2 verb senses of word &amp;quot;Serve&amp;quot;: serving a purpose, role or function or acting as - 853, and providing service 439; and 2113 instances for 3 noun senses of word &amp;quot;Interest&amp;quot;: readiness to give attention - 361, a share in a company or business 500, money paid for the use of money -1252.</Paragraph>
    <Paragraph position="1"> For all instances of an ambiguous word in the corpus we computed the corresponding LSA context vectors, and grouped them into clusters according to the sense label given in the corpus. To evaluate the inter-cluster tightness and intra-cluster separation for variable-dimensionality LSA representation we used the following measures:  1. Sense discrimination accuracy. To compute  sense discrimination accuracy the centroid of each sense cluster was computed using 90% of the data.</Paragraph>
    <Paragraph position="2"> We evaluated the sense discrimination accuracy using the remaining 10% of the data reserved for testing by computing for each test context vector the closest cluster centroid and comparing their sense labels. To increase the robustness of this evaluation we repeated this computation 10 times, each time using a different 10% chunk for test data, round-robin style. The sense discrimination accuracy estimated in this way constitutes an upper bound on the sense discrimination performance of unsupervised clustering such as K-means or EM: The sense-based centroids, by definition, are the points with minimal average distance to all the same-sense points in the training set, while the centroids found by unsupervised clustering are based on geometric properties of all context vectors, regardless of their sense label.</Paragraph>
    <Paragraph position="3"> 2. Average Silhouette Value. The silhouette value (Rousseeuw, 1987) for each point is a measure of how similar that point is to points in its own cluster vs. points in other clusters. This measure ranges from +1, indicating points that are very distant from neighboring clusters, through 0, indicating points that are not distinctly in one cluster or another, to -1, indicating points that are probably assigned to the wrong cluster. To construct the silhouette value for each vector i, S(i), the following formula is used:</Paragraph>
    <Paragraph position="5"> where a(i) is an average distance of i-object to all other objects in the same cluster and b(i) is a minimum of average distance of i-object to all objects in other cluster (in other words, it is the average distance to the points in closest cluster among the other clusters). The overall average silhouette value is simply the average of the S(i) for  as a function of LSA dimensionality for different distance/similarity measures, namely L2, L1 and cosine, for the 4 ambiguous words in the corpus.</Paragraph>
    <Paragraph position="6"> Note that the distance measure choice affects not only the classification of a point to the cluster, but also the computation of cluster centroid. For L2 and cosine measures the centroid is simply the average of vectors in the cluster, while for L1 it is the median, i.e., the value of i-th dimension of the cluster centroid vector is the median of values of the i-th dimension of all the vectors in the cluster. As can be seen from the sense discrimination results in Fig. 1, cosine distance, the most frequently used distance measure in LSA applications, has the best performance in for 3 out of 4 words in the corpus. Only for &amp;quot;Hard&amp;quot; does L1 outperforms cosine for low values of LSA dimension. As to the influence of dimensionality reduction on sense discrimination accuracy, our results show that (at least for the cosine distance) the accuracy does not peak at any reduced dimension, rather it increases monotonically, first rapidly and then reaching saturation as the dimension is increased from its lowest value (50 in our experiments) to the full dimension that corresponds to the number of contexts in the corpus.</Paragraph>
    <Paragraph position="7"> These results suggest that the value of dimensionality reduction is not in increasing the sense discrimination power of LSA representation, but in making the subsequent computations more efficient and perhaps enabling working with much larger corpora. For every number of dimensions examined, the average sense discrimination accuracy is significantly better than the baseline that was computed as the relative percentage of the most frequent sense of each ambiguous word in the corpus.</Paragraph>
    <Paragraph position="8"> Figure 2 shows the average silhouette values for the sense-based clusters as a function of the dimensionality of the underlying LSA-based vector representation for the 3 different distance metrics and for the 4 words in the corpus. The average silhouette value is close to zero, not varying significantly for the different number of dimensions and distance measures. Although the measured silhouette values indicate that the sense-based clusters are not very tight, the sense-discrimination accuracy results suggest that they are sufficiently far from each other to guarantee relatively high accuracy.</Paragraph>
  </Section>
class="xml-element"></Paper>
Download Original XML