<?xml version="1.0" standalone="yes"?> <Paper uid="W03-1812"> <Title>An Empirical Model of Multiword Expression Decomposability</Title> <Section position="5" start_page="0" end_page="0" type="evalu"> <SectionTitle> 4 Evaluation </SectionTitle> <Paragraph position="0"> LSA was used to build models in which MWEs could be compared with their constituent words.</Paragraph> <Paragraph position="1"> Two models were built, one from the WSJ corpus (indexing NN compounds) and one from the BNC (indexing verb-particles). After removing stopwords, the 50,000 most frequent terms were indexed in each model. From the WSJ, these 50,000 terms included 1,710 NN compounds (with corpus frequency of at least 13), and from the BNC, 461 verb-particles (with corpus frequency of at least 49). We used these models to compare different words and to find their neighbours. For example, the neighbours of the simplex verb cut and the verb-particles cut out and cut off (from the BNC model) are shown in Table 2. As can be seen, several of the neighbours of cut out come from semantic areas similar to those of cut, whereas those of cut off are quite different.</Paragraph> <Paragraph position="2"> Table 2: Nearest neighbours (with cosine similarities) of cut, cut out and cut off:

cut (verb)                  | cut out (verb)             | cut off (verb)
cut verb           1.000000 | cut out verb      1.000000 | cut off verb       1.000000
trim verb          0.529886 | fondant nn        0.516956 | knot nn            0.448871
slash verb         0.522370 | fondant jj        0.501266 | choke verb         0.440587
cut nns            0.520345 | strip nns         0.475293 | vigorously rb      0.438071
cut nn             0.502100 | piece nns         0.449555 | suck verb          0.413003
reduce verb        0.465364 | roll nnp          0.440769 | crush verb         0.412301
cut out verb       0.433465 | stick jj          0.434082 | ministry nn        0.408702
pull verb          0.431929 | cut verb          0.433465 | glycerol nn        0.395148
fall verb          0.426111 | icing nn          0.432307 | tap verb           0.383932
hook verb          0.419564 | piece nn          0.418780 | shake verb         0.381581
recycle verb       0.413206 | paste nn          0.416581 | jerk verb          0.381284
project verb       0.401246 | tip nn            0.413603 | put down verb      0.380368
recycled jj        0.396315 | hole nns          0.412813 | circumference nn   0.378097
prune verb         0.395656 | straw nn          0.411617 | jn nnp             0.375634
pare verb          0.394991 | hook nn           0.402947 | pump verb          0.373984
tie verb           0.392964 | strip nn          0.399974 | nell nnp           0.373768

This reflects the fact that in most of its instances the verb cut off is used to mean &quot;forcibly isolate&quot;. In order to measure this effect quantitatively, we can simply take the cosine similarities between these verbs, finding that sim(cut, cut out) = 0.433 and sim(cut, cut off) = 0.183, from which we infer directly that, relative to the sense of cut, cut out is a clearer case of a simple decomposable MWE than cut off.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.1 Statistical analysis </SectionTitle> <Paragraph position="0"> In order to get an initial feel for how well the LSA-based similarities for MWEs and their head words correlate with the WordNet-based similarities over those same word pairs, we performed a linear regression and Pearson's correlation analysis of the paired data (i.e. the pairing ⟨sim_LSA(word_i, mwe), sim_WN(word_i, mwe)⟩ for each WordNet similarity measure sim_WN). For both tests, values closer to 0 indicate a random distribution of the data, whereas values closer to 1 indicate a strong correlation. The correlation results for NN compounds and verb-particles are presented in Table 3, where R^2 refers to the output of the linear regression test and HSO refers to the Hirst and St-Onge similarity measure. In the case of NN compounds, the correlation with LSA is very low for all tests; that is, LSA is unable to reproduce the relative similarity values derived from WordNet with any reliability.
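The cosine comparison used for all of these similarity scores can be sketched as follows. This is a minimal illustration in Python with invented toy vectors; the actual models index 50,000 terms, and the values below are not the real LSA vectors for these verbs.

```python
import math

def cosine(u, v):
    # Cosine similarity between two dense vectors:
    # dot(u, v) / (|u| * |v|), with 0.0 for degenerate (zero) vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical 4-dimensional "LSA" vectors, invented for illustration only.
vec = {
    "cut":     [0.8, 0.4, 0.1, 0.2],
    "cut_out": [0.7, 0.5, 0.2, 0.1],  # close to "cut" -> high similarity
    "cut_off": [0.1, 0.2, 0.9, 0.3],  # distant from "cut" -> low similarity
}

print(round(cosine(vec["cut"], vec["cut_out"]), 3))
print(round(cosine(vec["cut"], vec["cut_off"]), 3))
```

On vectors shaped like these, the decomposable verb-particle scores much higher against its head verb than the non-decomposable one, mirroring the sim(cut, cut out) vs. sim(cut, cut off) contrast above.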
With verb-particles, correlation is notably higher than for NN compounds (recall that HSO is the only similarity measure which operates over verbs), but still at a low level.</Paragraph> <Paragraph position="1"> Based on these results, LSA would appear to correlate poorly with WordNet-based similarities.</Paragraph> <Paragraph position="2"> However, our main interest is not in similarity per se, but in how reflective LSA similarities are of the decomposability of the MWE in question. While taking note of the low correlation with WordNet similarities, therefore, we move straight on to the hyponymy test.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.2 Hyponymy-based analysis </SectionTitle> <Paragraph position="0"> We next turn to an analysis of the correlation between LSA similarities and hyponymy values. Our expectation is that for constituent word-MWE pairs with higher LSA similarities, there is a greater likelihood of the MWE being a hyponym of the constituent word. We test this hypothesis by ranking the constituent word-MWE pairs in decreasing order of LSA similarity,</Paragraph> <Paragraph position="1"> and partitioning the ranking into m partitions of equal size. We then calculate the average number of hyponyms per partition. If our hypothesis is correct, the earlier partitions (with higher LSA similarities) will have higher occurrences of hyponyms than the later partitions.</Paragraph> <Paragraph position="2"> Figure 1 presents the mean hyponymy values across partitions of the NN compound data and verb-particle data, with m set to 3 in each case. For the NN compounds, we derive two separate rankings, based on the similarity between the head noun and the NN compound (NN(head)) and between the modifier noun and the NN compound (NN(mod)).
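The ranking-and-partitioning procedure can be sketched as follows. This is an illustrative Python sketch over hypothetical data, where each pair carries an LSA similarity and a binary WordNet hyponymy judgement; the function name and the sample values are our own, not taken from the paper.

```python
def mean_hyponymy_by_partition(pairs, m=3):
    # pairs: (lsa_similarity, is_hyponym) tuples for constituent word-MWE pairs.
    # Rank by decreasing LSA similarity, split into m equal-sized partitions,
    # and return the mean hyponymy value per partition.
    ranked = sorted(pairs, key=lambda p: -p[0])
    size = len(ranked) // m
    means = []
    for i in range(m):
        part = ranked[i * size:(i + 1) * size]
        means.append(sum(h for _, h in part) / len(part))
    return means

# Hypothetical data in which hyponymy co-occurs with higher LSA similarity.
data = [(0.9, 1), (0.8, 1), (0.7, 1), (0.6, 0), (0.5, 1), (0.4, 0),
        (0.3, 0), (0.2, 1), (0.1, 0)]
print(mean_hyponymy_by_partition(data, m=3))
```

Under the hypothesis, the returned list should be (roughly) monotonically decreasing, as observed for the low-frequency items in Figure 1.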
In the case of the verb-particle data, WordNet has no classification of prepositions or particles, so we can only calculate the similarity between the head verb and the verb-particle (VPC(head)). Looking at the curves for these three rankings, we see that they are all fairly flat and nondescript. If we partition the data into low- and high-frequency MWEs, as defined by a threshold of 100 corpus occurrences, we find that the graphs for the low-frequency data (NN(head)LOW and VPC(head)LOW) are both monotonically decreasing, whereas those for the high-frequency data (NN(head)HIGH and VPC(head)HIGH) are more haphazard in nature. Our hypothesis of fewer instances of hyponymy at lower similarities is thus supported for low-frequency items but not for high-frequency items, suggesting that LSA similarities are more brittle over high-frequency items for this particular task. The results for the low-frequency items are particularly encouraging given that the LSA-based similarities were found to correlate poorly with WordNet-derived similarities. The results for NN(mod) are more erratic for both low- and high-frequency terms; that is, the modifier noun is not as strong a predictor of decomposability as the head noun. This is partially supported by the statistics on the relative occurrence of NN compounds in WordNet subsumed by their head noun (71.4%) as compared to NN compounds subsumed by their modifier (13.7%).</Paragraph> <Paragraph position="3"> In an ideal world, we would hope that the mean hyponymy values were nearly 1 for the first partition and nearly 0 for the last. Naturally, this presumes perfect correlation of the LSA similarities with decomposability, but classificational inconsistencies in WordNet also work against us. For example, vice chairman is an immediate hyponym of both chairman and president, but vice president is not a hyponym of president.
According to LSA, however, sim(chairman, vice chairman) = 0.508 and sim(president, vice president) = 0.551.</Paragraph> <Paragraph position="4"> It remains to be determined why LSA should perform better over low-frequency items, although the higher polysemy of high-frequency items is one potential cause. We intend to investigate this matter further in future research.</Paragraph> </Section> </Section> </Paper>