File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/06/w06-2910_metho.xml

Size: 22,468 bytes

Last Modified: 2025-10-06 14:10:51

<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-2910">
  <Title>Can Human Verb Associations Help Identify Salient Features for Semantic Verb Classification?</Title>
  <Section position="4" start_page="69" end_page="70" type="metho">
    <SectionTitle>
2 Verb Association Data
</SectionTitle>
    <Paragraph position="0"> We obtained human associations to German verbs from native speakers in a web experiment (Schulte im Walde and Melinger, 2005). 330 verbs were selected for the experiment (henceforth: experiment verbs), from different semantic categories, and different corpus frequency bands. Participants were given 55 verbs each, and had 30 seconds per verb to type as many associations as they could. 299 native German speakers participated in the experiment, between 44 and 54 for each verb. In total, we collected 81,373 associations from 16,445 trials; each trial elicited an average of 5.16 responses with a range of 0-16.</Paragraph>
    <Paragraph position="1"> All data sets were pre-processed in the following way: For each target verb, we quanti ed over all responses in the experiment. Table 1 lists the 10 most frequent response types for the verb klagen 'complain, moan, sue'. The responses were not distinguished according to polysemic senses of the verbs. klagen 'complain, moan, sue'  In the clustering experiments to follow, the verb associations are considered as verb features. The underlying assumption is that verbs which are semantically similar tend to have similar associations, and are therefore assigned to common classes. Table 2 illustrates the overlap of associations for the polysemous klagen with a near-synonym of one of its senses, jammern 'moan'. The table lists those associations which were given at least twice for each verb; the total overlap was 35 association types.</Paragraph>
  </Section>
  <Section position="5" start_page="70" end_page="70" type="metho">
    <SectionTitle>
3 Association-based Verb Classes
</SectionTitle>
    <Paragraph position="0"> We performed a standard clustering on the 330 experiment target verbs: The verbs and their features were taken as input to agglomerative (bottom-up) hierarchical clustering. As similarity measure in the clustering procedure (i.e. to determine the distance/similarity for two verbs), we used the skew divergence, a smoothed variant of the Kullback-Leibler divergence (Lee, 2001). The goal of these experiments was not to explore the optimal feature combination; thus, we rely on previous experiments and parameter settings, cf. Schulte im Walde (2003).</Paragraph>
    <Paragraph position="1"> Our claim is that the hierarchical verb classes and their underlying features (i.e. the verb associations) represent a useful basis for a theory-independent semantic classi cation of the German verbs. To support this claim, we validated the assoc-classes against standard approaches to semantic verb classes, i.e. GermaNet as the German Word-Net (Kunze, 2000), and the German counterpart of FrameNet in the Salsa project (Erk et al., 2003). Details of the validation can be found in (Schulte im Walde, 2006); the main issues are as follows.</Paragraph>
    <Paragraph position="2"> We did not directly compare the assoc-classes against the GermaNet/FrameNet classes, since not all of our 330 experiments verbs were covered by the two resources. Instead, we replicated the above cluster experiment for a reduced number of verbs: We extracted those classes from the resources which contain association verbs; light verbs, nonassociation verbs, other classes as well as singletons were disregarded. This left us with 33 classes from GermaNet, and 38 classes from FrameNet. These remaining classi cations are polysemous: The 33 GermaNet classes contain 71 verb senses which distribute over 56 verbs, and the 38 FrameNet classes contain 145 verb senses which distribute over 91 verbs. Based on the 56/91 verbs in the two gold standard resources, we performed two cluster analyses, one for the GermaNet verbs, and one for the FrameNet verbs. As for the complete set of experiments verbs, we performed a hierarchical clustering on the respective subsets of the experiment verbs, with their associations as verb features. The actual validation procedure then used the reduced classi cations: The resulting analyses were evaluated against the resource classes on each level in the hierarchies, i.e. from 56/91 classes to 1 class.</Paragraph>
    <Paragraph position="3"> As evaluation measure, we used a pair-wise measure which calculates precision, recall and a harmonic f-score as follows: Each verb pair in the cluster analysis was compared to the verb pairs in the gold standard classes, and evaluated as true or false positive (Hatzivassiloglou and McKeown, 1993).</Paragraph>
    <Paragraph position="4"> The association-based clusters show overlap with the lexical resource classes of an f-score of 62.69% (for 32 verb classes) when comparing to GermaNet, and 34.68% (for 10 verb classes) when comparing to FrameNet. The corresponding upper bounds are 82.35% for GermaNet and 60.31% for FrameNet.1 The comparison therefore demonstrates considerable overlap between association-based classes and existing semantic classes. The different results for the two resources are due to their semantic background (i.e. capturing synonymy vs. situation-based agreement), the numbers of verbs, and the degrees of ambiguity (an average of 1.6 senses per verb in FrameNet, as compared to 1.3 senses in GermaNet).</Paragraph>
    <Paragraph position="5"> The purpose of the validation against semantic resources was to demonstrate that a clustering as based on the verb associations and a standard clustering setting compares well with existing semantic classes. We take the positive validation results as justi cation to use the assoc-classes as source for cluster information: The clustering de nes the verbs in a common association-based class, and the features which are relevant for the respective class. For example, the 100-class analysis contains a class with the verbs bedauern 'regret', heulen 'cry', jammern 'moan', klagen 'complain, moan, sue', verzweifeln 'become desperate', and weinen 'cry', with the most distinctive features Trauer 'mourning', weinen 'cry', traurig 'sad', Tr&amp;quot;anen 'tears', jammern 'moan', Angst 'fear', Mitleid 'pity', Schmerz 'pain'.</Paragraph>
  </Section>
  <Section position="6" start_page="70" end_page="72" type="metho">
    <SectionTitle>
4 Exploring Semantic Class Features
</SectionTitle>
    <Paragraph position="0"> Our claim is that the features underlying the association-based classes help us guide the feature selection process in future clustering experiments, because we know which semantic classes are based 1The upper bounds are below 100%, because the hierarchical clustering assigns a verb to only one cluster, but the lexical resources contain polysemy. We created a hard version of the lexical resource classes where we randomly chose one sense of each polysemous verb, to calculate the upper bounds.</Paragraph>
    <Paragraph position="1">  on which associations/features. We rely on the assoc-classes in the 100-class analysis of the hierarchical clustering2 and features which exist for at least two verbs in a common class (and therefore hint to a minimum of verb similarity), and compare the associations underlying the assoc-classes with standard corpus-based feature types: We check on how many of the associations we nd among the corpus-based features, such as adverbs, direct object nouns, etc. There are various possibilities to determine corpus-based features that potentially cover the associations; we decided in favour of feature types which have been suggested in related work: a) Grammar-based relations: Previous work on distributional similarity has focused either on a speci c word-word relation (such as Pereira et al. (1993) and Rooth et al. (1999) referring to a direct object noun for describing verbs), or used any syntactic relationship detected by a chunker or a parser (such as Lin (1998) and McCarthy et al. (2003)). We used a statistical grammar (Schulte im Walde, 2003) to lter all verb-noun pairs where the nouns represent nominal heads in NPs or PPs in syntactic relation to the verb (subject, object, adverbial function, etc.), and to lter all verb-adverb pairs where the adverbs modify the verbs.</Paragraph>
    <Paragraph position="2"> b) Co-occurrence window: In previous work (Schulte im Walde and Melinger, 2005), we showed that only 28% of all noun associates were identied by the above statistical grammar as subcategorised nouns, but 69% were captured by a 20-word co-occurrence window in a 200-million word newspaper corpus. This nding suggests to use a co-occurrence window as alternative source for verb features, as compared to speci c syntactic relations.</Paragraph>
    <Paragraph position="3"> We therefore determined the co-occurring words for all experiment verbs in a 20-word window (i.e. 20 words preceding and following the verb), irrespective of the part-of-speech of the co-occurring words. Relying on the verb information extracted for a) and b), we checked for each verb-association pair whether it occurred among the grammar or window pairs. Table 3 illustrates which proportions of the associations we found in the two resource types.</Paragraph>
    <Paragraph position="4"> For the grammar-based relations, we checked argu2The exact number of classes or the verb-per-class ratio are not relevant for investigating the use of associations.</Paragraph>
    <Paragraph position="5"> ment NPs and PPs (as separate sets and together), and in addition we checked verb-noun pairs in the most common speci c NP functions: n refers to the (nominative) intransitive subject, na to the transitive subject, and na to the transitive (accusative) object. For the windows, all checks on co-occurrence of verbs and associations in the whole 200-million word corpus. cut also checks the whole corpus, but disregards the most and least frequent co-occurring words: verb-word pairs were only considered if the co-occurrence frequency of the word over all verbs was above 100 (disregarding low frequency pairs) and below 200,000 (disregarding high frequency pairs). Using the cut-offs, we can distinguish the relevance of high- and low-frequency features. Finally, ADJ, ADV, N, V perform co-occurrence checks for the whole corpus, but breaks down the all results with respect to the association part-of-speech.</Paragraph>
    <Paragraph position="6"> As one would have expected, most of the associations (66%) were found in the 20-word co-occurrence window, because the window is neither restricted to a certain part-of-speech, nor to a certain grammar relation; in addition, the window is potentially larger than a sentence. Applying the frequency cut-offs reduces the overlap of association types and co-occurring words to 58%. Specifying the window results for the part-of-speech types illustrates that the nouns play the most important role in describing verb meaning (39% of the verb association types in the assoc-classes were found among the nouns in the corpus windows, 16% among the verbs, 9% among the adjectives, and 2% among the adverbs).3 The proportions of the nouns with a speci c grammar relationship to the verbs show that we nd more associations among direct objects than intransitive/transitive subjects. This insight con rms the assumption in previous work where only direct object nouns were used as salient features in distributional verb similarity, such as Pereira et al. (1993). However, the proportions are all below 10%. Considering all NPs and/or PPs, we nd that the proportions increase for the NPs, and that the NPs play a more important role than the PPs. This insight con rms work on distributional similarity where not only direct object nouns, but all functional nouns 3Caveat: These numbers correlate with the part-of-speech types of all associate responses: 62% of the responses were nouns, 25% verbs, 11% adjectives, and 2% adverbs.</Paragraph>
    <Paragraph position="7">  were considered as verb features, such as Lin (1998) and McCarthy et al. (2003). Of the adverb associations, we nd only a small proportion among the parsed adverbs. All in all, the proportions of association types among the nouns/adverbs with a syntactic relationship to the verbs are rather low. Comparing the NP/PP proportions with the window noun proportions shows that salient verb features are not restricted to certain syntactic relationships, but also appear in a less restricted context window.</Paragraph>
  </Section>
  <Section position="7" start_page="72" end_page="74" type="metho">
    <SectionTitle>
5 Inducing Verb Classes with
Corpus-based Features
</SectionTitle>
    <Paragraph position="0"> In the nal step, we applied the corpus-based feature types to clusterings. The goal of this step was to determine whether the feature exploration helped to identify salient verb features, and whether we can outperform previous clustering results. The clustering experiments were as follows: The 330 experiment verbs were instantiated by the feature types we explored in Section 4. As for the assoc-classes, we then performed an agglomerative hierarchical clustering. We cut the hierarchy at a level of 100 clusters, and evaluated the clustering against the 100-class analysis of the original assoc-classes. We expect that feature types with a stronger overlap with the association types result in a better clustering result. The assumption is that the associations are salient feature for verb clustering, and the better we model the associations with grammar-based or window-based features, the better the clustering.</Paragraph>
    <Paragraph position="1"> For checking the clusterings with respect to the semantic class type, we also applied the corpus-based features to GermaNet and FrameNet classes.</Paragraph>
    <Paragraph position="2"> * GermaNet: We randomly extracted 100 verb classes from all GermaNet synsets, and created a hard classi cation for these classes, by randomly deleting additional senses of a verb so as to leave only one sense for each verb. This selection made the GermaNet classes comparable to the assoc-classes in size and polysemy.</Paragraph>
    <Paragraph position="3"> The 100 classes contain 233 verbs. Again, we performed an agglomerative hierarchical clustering on the verbs (as modelled by the different feature types). We cut the hierarchy at a level of 100 clusters, which corresponds to the number of GermaNet classes, and evaluated against the GermaNet classes.</Paragraph>
    <Paragraph position="4"> * FrameNet: In a pre-release version from May 2005, there were 484 verbs in 214 German FrameNet classes. We disregarded the high-frequency verbs gehen, geben, sehen, kommen, bringen which were assigned to classes mostly on the basis of multi-word expressions they are part of. In addition, we disregarded two large classes which contained mostly support verbs, and we disregarded singletons. Finally, we created a hard classi cation of the classes, by randomly deleting additional senses of a verb so as to leave only one sense for each verb. The classi cation then contained 77 classes with 406 verbs. Again, we performed an agglomerative hierarchical clustering on the verbs (as modelled by the different feature types). We cut the hierarchy at a level of 77 clusters, which corresponds to the number of FrameNet classes, and evaluated against the FrameNet classes.</Paragraph>
    <Paragraph position="5"> For the evaluation of the clustering results, we calculated the accuracy of the clusters, a cluster similarity measure that has been applied before, cf. (Stevenson and Joanis, 2003; Korhonen et al., 2003).4 Accuracy is determined in two steps: 4Note that we can use accuracy for the evaluation because we have a xed cut in the hierarchy as based on the gold standard, as opposed to the evaluation in Section 3 where we explored the optimal cut level.</Paragraph>
    <Paragraph position="6">  standard class with the largest intersection of verbs is determined. The number of verbs in the intersection ranges from one verb only (in case all clustered verbs are in different classes in the gold standard) to the total number of verbs in a cluster (in case all clustered verbs are in the same gold standard class).</Paragraph>
    <Paragraph position="7"> 2. Accuracy is calculated as the proportion of the verbs in the clusters covered by the same gold standard classes, divided by the total number of verbs in the clusters. The upper bound of the accuracy measure is 1.</Paragraph>
    <Paragraph position="8"> Table 4 shows the accuracy results for the three types of classi cations (assoc-classes, GermaNet, FrameNet), and the grammar-based and window-based features. We added frame-based features, as to compare with earlier work: The frame-based features provide a feature description over 183 syntactic frame types including PP type speci cation (fpp), and the same information plus coarse selectional preferences for selected frame slots, as obtained from GermaNet top-level synsets (f-pp-pref), cf. (Schulte im Walde, 2003). The following questions are addressed with respect to the result table.  1. Do the results of the clusterings with respect to the underlying feature types correspond to the overlap of associations and feature types, cf. Table 3? 2. Do the corpus-based feature types which were identi ed on the basis of the associations out-perform previous clustering results? 3. Do the results generalise over the semantic  class type? First of all, there is no correlation between the overlap of associations and feature types on the one hand and the clustering results as based on the feature types on the other hand (Pearson's correlation, p&gt;.1), neither for the assoc-classes or the GermaNet or FrameNet classes. The human associations therefore did not contribute to identify salient feature types, as we had hoped. In some speci c cases, we nd corresponding patterns; for example, the clustering results for the intransitive and transitive sub-ject and the transitive object correspond to the overlap values for the assoc-classes and FrameNet: n &lt; na &lt; na. Interestingly, the GermaNet clusterings behave in the opposite direction.</Paragraph>
    <Paragraph position="9"> Comparing the grammar-based relations with each other shows that for the assoc-classes using all NPs is better than restricting the NPs to (subject) functions, and using both NPs and PPs is best; similarly for the FrameNet classes where using all NPs is the second best results (but adverbs). Differently, for the GermaNet classes the speci c function of intransitive subjects outperforms the more general feature types, and the PPs are still better than the NPs. We conclude that not only there is no correlation between the association overlap and feature types, but in addition the most successful feature types vary strongly with respect to the gold standard. None of the differences within the feature groups (n/na/na and NP/PP/NP&amp;PP) are signi cant</Paragraph>
    <Paragraph position="11"> are surprisingly successful in all three clusterings, in some cases outperforming the noun-based features.</Paragraph>
    <Paragraph position="12"> Comparing the grammar-based clustering results with previous results, the grammar-based features outperform the frame-based features in all clusterings for the GermaNet verbs. For the FrameNet  verbs and the experiment verbs, they outperform the frame-based features only in speci c cases. The adverbial features outperform the frame-based features in any clustering. However, none of the differences between the frame-based clusterings and the grammar-based clusterings are signi cant (kh2,df = 1,a = 0.05).</Paragraph>
    <Paragraph position="13"> For all gold standards, the best window-based clustering results are below the best grammar-based results. Especially the all results demonstrate once more the missing correlation between association/feature overlap and clustering results. However, it is interesting that the clusterings based on window co-occurrence are not signi cantly worse (and in some cases even better) than the clusterings based on selected grammar-based functions. This means that a careful choice and extraction of speci c relationships for verb features does not have a signi cant impact on semantic classes.</Paragraph>
    <Paragraph position="14"> Comparing the window-based features against each other shows that even though we discovered a much larger proportion of association types in an unrestricted window all than elsewhere, the results in the clusterings do not differ accordingly. Applying the frequency cut-offs has almost no impact on the clustering results, which means that it does no harm to leave away the rather unpredictable features. Somehow expected but nevertheless impressive is the fact that only considering nouns as co-occurring words is as successful as considering all words independent of the part-of-speech.</Paragraph>
    <Paragraph position="15"> Finally, the overall accuracy values are much better for the GermaNet clusterings than for the experiment-based and the FrameNet clusterings.</Paragraph>
    <Paragraph position="16"> The differences are all signi cant (kh2,df = 1,a = 0.05). The reason for these large differences could be either (a) that the clustering task was easier for the GermaNet verbs, or (b) that the differences are caused by the underlying semantics. We argue against case (a) since we deliberately chose the same number of classes (100) as for the association-based gold standard; however, the verbs-per-class ratio for GermaNet vs. the assoc-classes and the FrameNet classes is different (2.33 vs. 3.30/5.27) and we cannot be sure about this in uence. In addition, the average verb frequencies in the GermaNet classes (calculated in a 35 million word newspaper corpus) are clearly below those in the other two classi cations (1,040 as compared to 2,465 and 1,876), and there are more low-frequency verbs (98 out of 233 verbs (42%) have a corpus frequency below 50, as compared to 41 out of 330 (12%) and 54 out of 406 (13%)). In the case of (b), the difference in the semantic class types is modelling synonyms with GermaNet as opposed to situation-based agreement in FrameNet. The association-based class semantics is similar to FrameNet, because the associations are unrestricted in their semantic relation to the experiment verb (Schulte im Walde and Melinger, 2005).</Paragraph>
  </Section>
class="xml-element"></Paper>
Download Original XML