<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-0108">
  <Title>Domain-Specific Semantic Class Disambiguation Using WordNet</Title>
  <Section position="4" start_page="57" end_page="1023" type="evalu">
    <SectionTitle>
3 Evaluation
</SectionTitle>
    <Paragraph position="0"> The domain we worked on is the MUC-4 (1992) terrorism domaln. Nouns are extracted from the first 18  passages (dev-muc4-0001 to dev-muc4-0018) of the corpus of news wire art.ides to form our test corpus. The nouns extracted are the head nouns within noun phrases which are recognised by WordNet, including proper nouns such as &amp;quot;United States&amp;quot;. These 1023 nouns are hand-tagged with their sense and semantic class in the particular context to form the answer keys for subsequent experiments.</Paragraph>
    <Section position="1" start_page="59" end_page="59" type="sub_section">
      <SectionTitle>
3.1 Mapping domain-specific hierarchy onto WordNet
</SectionTitle>
      <Paragraph position="0"> The domain-specific hierarchy used in our work is that crafWd by researchers fzom the University of Massachusetts for their information extraction system, which was one of the participants at MUC-4 (Riloff, 1994).</Paragraph>
      <Paragraph position="1"> Mapping from the dom~,~-specific hierarchy to WordNer ~3rpically requires only the assignment of senses to the classes. For instance, the semantic class &amp;quot;human&amp;quot; is mapped onto its sense I node in WordNet, the uhuman:l&amp;quot; concept node. Classes can also be mapped onto more than one concept node in WordNet. The semantic class &amp;quot;attack&amp;quot;, for e~ample, is mapped onZo both senses I and 5.</Paragraph>
      <Paragraph position="2"> There are cases where the exact wording of a semantic class in the domain-specific hierarchy is not pre~mt in WordNet. Take for instance the semantic class ~goveroment..ot~.cia/&amp;quot; in the domain-specific hiermx:hy. Since the collocation is not in Word-Net, we mapped it to the concept node ~government.agent:l&amp;quot; which we felt is closest in meaning. The set of mapped semantic cl~-~Ses in WordNet is shown in Figure 3 s.</Paragraph>
    </Section>
    <Section position="2" start_page="59" end_page="59" type="sub_section">
      <SectionTitle>
3.2 Word Sense Disambiguation
</SectionTitle>
      <Paragraph position="0"> We ran our two/mplementstions of word sense disambiguation algorithms, the information content algorithm and the conceptual density method, on our domain-specific rest set. For the information content algorithm, a window size of 10, i.e. 5 nouns to the lefz and right, was found to yield the best results; w~1~t for the conceptual density algorithm, the optimum window size was found to be 30. For both algorithm% only the nouns Of the same passage are incorporated into the context window. If the noun to be disambiguated is the first noun of the passage, the window will include the subsequent .N nouns of the same passage.</Paragraph>
      <Paragraph position="1"> The probability statistics required for Resuik's tneematton C/~umC/ algoctchm were eonecmd sAs this hie~zchy is adopted, and not created by us, occasionally, &amp;quot;we can only furnish guesses as to the exact meaning of the semantic classes.</Paragraph>
      <Paragraph position="3"> Figure 3 : MUC-4 semantic class hierarchy as mapped onto WordNet.</Paragraph>
      <Paragraph position="4"> 777,857 noun occurrences of the entire Brown corpus and Wall Street Jottrnal corpus.</Paragraph>
      <Paragraph position="5"> The results are shown in Table I. The most frequent baseline is obtained by following the stxategy of always picking sense 1 of WordNet, since Word-Net orders its senses such that sense I is the most likely sense.</Paragraph>
      <Paragraph position="6"> As both algofithm.q performed below the most frequent baseline, it prompted us to evaluate the indicativeness of surrounding nouns for word sense disambiguation. We hence provided 2 h,m~ judges with a randomly selected sample of 80 ex~wples from the 734 polysemic nouns of our test corpus of 1023 e~'~ples. The human judges are provided with the 10 nouns surrounding the word to be disambiguated.</Paragraph>
      <Paragraph position="7"> Based only on these clues, they have to select a single sense of the word in the particular sentence context. Their responses are then tallied with the seusetagged test corpus.</Paragraph>
      <Paragraph position="8"> Table 2 shows the accuracies attained by the human judges. Both judges are able C/o perform sub-scantially better than the most frequent heuristic baseline, despite the seeming)y impoverished knowledge source. Feedback from the)udges reveal possible leverage for future improvements. Firstly, judges reflect that frequently, just one indicative surrounding noun is enough to provide clear evidence for sense disambig~tion. The other nouns will just be glossed over and do not contribute to the decision.</Paragraph>
      <Paragraph position="9"> ALso, indicative nouns may not just hold is-a relationships, which are the only relationships exploited by both algorithms. Rather, they are simply related in some m~-ner to the noun to be disambiguated.</Paragraph>
      <Paragraph position="10"> For instance, a surzounding context including the word &amp;quot;church ~ will indicate a strong support for the &amp;quot;pastor&amp;quot; sense of ~m;~i~ter ~ as opposed 1;o its other se~.ses. These reflections of the human judges seem to point towards the need for an effective method for selecting only particular nouns in the surrounding context as evidence. Use of other relatiouships besides is-a may also help in disambi~tion, as is already expounded by (Sussna, lg93).</Paragraph>
    </Section>
    <Section position="3" start_page="59" end_page="1023" type="sub_section">
      <SectionTitle>
3.3 Semantic Distance Metrics
</SectionTitle>
      <Paragraph position="0"> To evaluate the semantic distance metrics, we feed the se~tic distance mod~e with the correct senses of the entire test corpus and observe the resultant semantic c!~ss disambiguation accuracy.</Paragraph>
      <Paragraph position="1"> The conceptual distance, link probability and de-SCend~mt coverage metrics all require trAversal of 11~1~ from one node to another. However, all of the metrics are commutative, i.e. distance from concept a to b is the same as chat from b to ~ In semantic class disambi~tion, a distinction is necessary since the taxonomic links indicate membership relationships which are not commutative (&amp;quot;aircraft:l&amp;quot; is a &amp;quot;vehicle:l ~ but &amp;quot;vehicle:l ~ need not be an &amp;quot;aircraft:l'). We hence associate different weights to the upwards and downwards traversal of links, with the 25 unique be~ers of Word.Net being the top-most nodes. Upward traversal of links towards the unique beginners are weighted consistently at 0.3 whilst downward traversal of links towards the leaves</Paragraph>
      <Paragraph position="3"> are weighted at 1.7 s.</Paragraph>
      <Paragraph position="4"> .Also, different thresholds axe used for different levels of the domain-specific hierarchy. Since higher level classes, such as the level 0 &amp;quot;human&amp;quot; class, encompasses at wider range of words, it is evident that the thresholds for higher level classes-r~n-ot be stricter than that of lower level classes. For fair comparison of each metric, the best thresholds are arrived through exhaustive searching of a reasonable space 7. The results are detailed in Table 3.</Paragraph>
      <Paragraph position="5"> Accuracy on specific se~,mtic classes refers to an exact match of the pcogram's response with the corpus answer. The general ~n~t;ic class disambiguation accuracy, on the other hand, considers a respouse correct as long as the response class is in the sub-hierarchy which originated, fz'om the same level 0 class as the answer. For example, if the program's reeponse is class =politi~&amp;quot;, whilst the answer is class =lawyer&amp;quot;, since both e\]~qses originated from the same level 0 class =b-m~ ~, this response is considered correct when calculating the general semantic class accuracy. The specific se~tic class disambiguation accuracy is hence the stricter measure.</Paragraph>
      <Paragraph position="6"> It may seem puzzling that semantic class disambiguation does not achieve 100% accuracy even when supplied with the correct senses, i.e. even when the word sense d;~mhiguation module is able m attain 100~0 accuracy, the overall semantic class disambiguation accuracy still lags behind the ideal. Since SThese weights are found to be optimum for all three znetric$.</Paragraph>
      <Paragraph position="7"> ~Integral thresholds are searched for the conceptual distance meetri~ whilst the thresholds of the other mettics are searched in steps of 0.01.</Paragraph>
      <Paragraph position="8">  the taxonomic 1~nlc~ in Word.Net are designed to capture membership of words in classes, it may senn odd that the correct identification of the word sense coupled with the IS-A taxonomic 1~ still do not guarantee correct semantic class disambiguation.</Paragraph>
      <Paragraph position="9"> The reason for this paradox is perceptive di~erences; that between the designers of the MUC-4 domain-specific hierarchy we adopted and the Word-Net hierarchy, and that between the an-orator of the answer corpus and the WordNet designers.</Paragraph>
      <Paragraph position="10"> Take for example the monosemic word &amp;quot;kidnapping&amp;quot;. Its correct semantic class is =a~ack:5 s'. However, it is not a descendant of =attack:Y in Word.Net. The hypemyms of &amp;quot;kidnapping&amp;quot; axe \[capture ~ felony --~ crime --&gt; evil-doing -+ wrong-doing --&gt; activity .-+ act\] and thatt of =attack:5&amp;quot; are \[bakery --~ crime ~ evil.doing ~ wrong-doing ~ activity act\]. Both perceptions of =kidnatpping&amp;quot; are correct. &amp;quot;kidnapping&amp;quot; can be viewed as a form of =attack:Y and ~m+\]~dy, it can be viewed as a form of =capt~re ~ .</Paragraph>
      <Paragraph position="11"> An effective semantic distance metric is hence needed here. The semantic distance module should infer the close distance between the two concept nodes &amp;quot;kidnapping&amp;quot; aud &amp;quot;attack:5&amp;quot; and thus col rectly classify &amp;quot;lddz~ppin~.</Paragraph>
    </Section>
    <Section position="4" start_page="1023" end_page="1023" type="sub_section">
      <SectionTitle>
3.4 Semantic Class Disambiguation
</SectionTitle>
      <Paragraph position="0"> After evaluation of the separate phases, we cornblued the best algorithms of the two phases and evaluated the performance of our semantic class disambiguattion approach. Hence, the most ftequent S=attack:5&amp;quot; refers to an assault on someone whilst '%track:l&amp;quot; refers to the be~n~g of an o~m~rve.</Paragraph>
      <Paragraph position="1">  (Ass-m;~g perfect word sense dls=n'~higuation) ~Format :- (t~o, t~z, t~, t~s), where tz~ is the threshold that is applied to the ith level of the hierarchy. sense heuristic is used for the word sense disambiguation module and the conceptual distance metric is adopted for the semantic distance module It should be emphasized, however, that our al&gt;proach to s~m~-tic class disambiguation need not be coupled with any specific word sense disambiguation algorithm. The most frequent Word.Net sense is chosen simply because current word sense disambiguation algofithm~ still cannot beat the most frequent baseline consistently for all words. Our approach, in effect, allows domain-specific s~-~ic class dis~mBiguation tO latch onto the improvements in the active research area of word sense disambiguation.</Paragraph>
      <Paragraph position="2"> As a baseline, we again sought the most frequent heuristic, which is the occurrence probability of the most frequent senantic class &amp;quot;entity&amp;quot;. 9 We compared our approach with supervised methods C/o contrast their reliance on annotated corpora with our r~nce on WordNet. One of the foremost semantic e.l~C/,S disambiguation system which employs machine learning is the Kenwore framework (Cardie, 1993). Huwever, as we are unable to report comparative tests with K~ore zdeg, we adapted cwo other supervised algorithm% both successfully applied to general word sense di~mhiguation, to the task of semantic class disambiguation.</Paragraph>
      <Paragraph position="3"> The first is the LBXAS algorithm which uses an exemplar-based learning framework s;mil~- to the case-based reasoning foundation of Kenmore (Ng, 1997; Ng and Lee, 1996). L~ was shown to achieve high accuracy as compared to other word sense disambiguation algorithms.</Paragraph>
      <Paragraph position="4"> We also applied Teo et al's Bayesian word sense disambiguation algorithm to the task (Teo et al., 1996). The approach compares favourably with other methods in word sense disambiguation when tested on a common data set of the word &amp;quot;interest&amp;quot;. 9This baseline is also used to evaluate the performance of K~ore (Cardie, 1993).</Paragraph>
      <Paragraph position="5"> ZdegAs work on one of the important input sources, the conceptu~ parser, is underway, per~.___ce results of Kenm~e on S~m~t~ic class dL~higuation cannot yet be reportecL The features used for both supervised algorithms are the local collocations of the surrounding 4 words zz. Local collocation was shown to be the most indicative knowledge source for LBxA8 and these 7 features are the common features used in both LF~X.AS and Teo et al's Bayesian algorithm. Both algorithmg are used for learning the specific sema-tic class of words.</Paragraph>
      <Paragraph position="6"> For both algorithmg, the 1023-sentence test set is randomly partitioned into a 90% training set and a 10% testing set, in proportion with the overall class distribution. The algorithms are trained on the tr~;ng set and then used to dis~tdguate the distinct testing set. This was averaged over 10 runs.</Paragraph>
      <Paragraph position="7"> As with K~more, the tr~-~g set contains features of all the words in the training sentences, and the algorithms are to pick one s~-tic class for each word in the testing set. A word in the testing set need not have occurred in the training set. This is --fflce word sense disambiguation, whereby the training set cont~-~ features of one word, and the algorithm picks one sense for each occurence of this word in the testing set.</Paragraph>
      <Paragraph position="8"> To obtain a g~uge of human performance on this task, we sourced two independent human judgements. Two human judges are presented with a set of 80 sentences randomly selected from the 1023example test corpus, each with a noun to be disambiguated. Based on their understanding of the sentence, each noun is assigned a specific semantic cla.~ of the dom~n-specific hierarchy. Their responses are then compared ag~t the tagged answers of the test corpus.</Paragraph>
      <Paragraph position="9"> The s,~ic class disambiguation results are compiled and tabulated in Table 4. The definitions of general and specific semantic class disambigttation accuracy are detailed in Section 3.3.</Paragraph>
      <Paragraph position="10"> As is evident, our approach outperforms the most frequent heuristic substantially. Also, the perforzZGiven a word win the following sentence segment :12 12 w rz ~'=, the 7features used are 12-h, lz..rl, rl..r2,12, l~, r2 and ~'2, whereby the first 3 features are concatenations off the words.</Paragraph>
      <Paragraph position="12"> mance of both supervised algorithms lag b-hl-d that of our approach. Comparable performance with the two human judges is also achieved.</Paragraph>
      <Paragraph position="13"> It should be noted, though, that the amount of training data available to the supervised algorithms may not be sufficient. Ng and Lee (1996) found that train/rig sets of 1000-1500 e~mples per word are necessary for sense dJ-~mhiguation of one highly ambiguous word. The amount of Er~ining data needed for a supervised learning algorithm to achieve good performance on semantic class disambiguation may be larger than what we have used. Cardie (1993), for instance, used a larger 2056-instance case base in the evaluation of K~ore.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>