<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-3205">
  <Title>VERBOCEAN: Mining the Web for Fine-Grained Semantic Verb Relations</Title>
  <Section position="4" start_page="0" end_page="2" type="metho">
    <SectionTitle>
3 Semantic relations among verbs
</SectionTitle>
    <Paragraph position="0"> In this section, we introduce and motivate the specific relations that we extract. Whilst the natural language literature is rich in theories of semantics (Barwise and Perry 1985; Schank and Abelson 1977), large-coverage manually created semantic resources typically only organize verbs into a flat or shallow hierarchy of classes (such as those described in Section 2.2). WordNet identifies synonymy, antonymy, troponymy, and cause. As summarized in Figure 1, Fellbaum (1998) discusses a finer-grained analysis of entailment, while the WordNet database does not distinguish between, e.g., backward presupposition (forget :: know, where know must have happened before forget) from proper temporal inclusion (walk :: step). In formulating our set of relations, we have relied on the finer-grained analysis, explicitly breaking out the temporal precedence between entities.</Paragraph>
    <Paragraph position="1"> In selecting the relations to identify, we aimed at both covering the relations described in WordNet and covering the relations present in our collection  of strongly associated verb pairs. We relied on the strongly associated verb pairs, described in Section 4.4, for computational efficiency. The relations we identify were experimentally found to cover 99 out of 100 randomly selected verb pairs.</Paragraph>
    <Paragraph position="2"> Our algorithm identifies six semantic relations between verbs. These are summarized in Table 1 along with their closest corresponding WordNet category and the symmetry of the relation (whether  Similarity. As Fellbaum (1998) and the tradition of organizing verbs into similarity classes indicate, verbs do not neatly fit into a unified is-a (troponymy) hierarchy. Rather, verbs are often similar or related. Similarity between action verbs, for example, can arise when they differ in connotations about manner or degree of action. Examples extracted by our system include maximize :: enhance, produce :: create, reduce :: restrict.</Paragraph>
    <Paragraph position="3"> Strength. When two verbs are similar, one may denote a more intense, thorough, comprehensive or absolute action. In the case of change-of-state verbs, one may denote a more complete change.</Paragraph>
    <Paragraph position="4"> We identify this as the strength relation. Sample verb pairs extracted by our system, in the order weak to strong, are: taint :: poison, permit :: authorize, surprise :: startle, startle :: shock. Some instances of strength sometimes map to WordNets troponymy relation.</Paragraph>
    <Paragraph position="5"> Strength, a subclass of similarity, has not been identified in broad-coverage networks of verbs, but may be of particular use in natural language generation and summarization applications.</Paragraph>
    <Paragraph position="6"> Antonymy. Also known as semantic opposition, antonymy between verbs has several distinct subtypes. As discussed by Fellbaum (1998), it can arise from switching thematic roles associated with the verb (as in buy :: sell, lend :: borrow). There is also antonymy between stative verbs (live :: die, differ :: equal) and antonymy between sibling verbs which share a parent (walk :: run) or an entailed verb (fail :: succeed both entail try).</Paragraph>
    <Paragraph position="7"> Antonymy also systematically interacts with the happens-before relation in the case of restitutive opposition (Cruse 1986). This subtype is exemplified by damage :: repair, wrap :: unwrap. In terms of the relations we recognize, it can be stated that  ). Examples of antonymy extracted by our system include: assemble :: dismantle; ban :: allow; regard :: condemn, roast :: fry.</Paragraph>
    <Paragraph position="8"> Enablement. This relation holds between two  . Enablement is classified as a type of causal relation by Barker and Szpakowicz (1995). Examples of enablement extracted by our system include: assess :: review and accomplish :: complete.</Paragraph>
    <Paragraph position="9"> Happens-before. This relation indicates that the two verbs refer to two temporally disjoint intervals or instances. WordNets cause relation, between a causative and a resultative verb (as in buy :: own), would be tagged as instances of happens-before by our system. Examples of the happens-before relation identified by our system include marry :: divorce, detain :: prosecute, enroll :: graduate, schedule :: reschedule, tie :: untie.</Paragraph>
  </Section>
  <Section position="5" start_page="2" end_page="21" type="metho">
    <SectionTitle>
4 Approach
</SectionTitle>
    <Paragraph position="0"> We discover the semantic relations described above by querying the Web with Google for lexico-syntactic patterns indicative of each relation. Our approach has two stages. First, we identify pairs of highly associated verbs co-occurring on the Web with sufficient frequency using previous work by Lin and Pantel (2001), as described in Section 4.4. Next, for each verb pair, we tested lexico-syntactic patterns, calculating a score for each possible semantic relation as described in Section 4.2. Finally, as described in Section 4.3, we compare the strengths of the individual semantic relations and, preferring the most specific and then strongest relations, output a consistent set as the final output. As a guide to consistency, we use a simple theory of semantics indicating which semantic relations are subtypes of other ones, and which are compatible and which are mutually exclusive. null</Paragraph>
    <Section position="1" start_page="2" end_page="2" type="sub_section">
      <SectionTitle>
4.1 Lexico-syntactic patterns
</SectionTitle>
      <Paragraph position="0"> The lexico-syntactic patterns were manually selected by examining pairs of verbs in known semantic relations. They were refined to decrease capturing wrong parts of speech or incorrect semantic relations. We used 50 verb pairs and the overall process took about 25 hours.</Paragraph>
      <Paragraph position="1"> We use a total of 35 patterns, which are listed in  lings in the WordNet column refers to terms with the same troponymic parent, e.g. swim and fly.</Paragraph>
      <Paragraph position="2">  Note that our patterns specify the tense of the verbs they accept. When instantiating these patterns, we conjugate as needed. For example, both Xed and Yed instantiates on sing and dance as both sung and danced.</Paragraph>
    </Section>
    <Section position="2" start_page="2" end_page="21" type="sub_section">
      <SectionTitle>
4.2 Testing for a semantic relation
</SectionTitle>
      <Paragraph position="0"> In this section, we describe how the presence of a semantic relation is detected. We test the relations with patterns exemplified in Table 2. We adopt an approach inspired by mutual information to measure the strength of association, denoted</Paragraph>
      <Paragraph position="2"> The probabilities in the denominator are difficult to calculate directly from search engine results. For a given lexico-syntactic pattern, we need to estimate the frequency of the pattern instantiated with appropriately conjugated verbs. For verbs, we need to estimate the frequency of the verbs, but avoid counting other parts-of-speech (e.g. chair as a noun or painted as an adjective). Another issue is that some relations are symmetric (similarity and antonymy), while others are not (strength, enablement, happens-before). For symmetric relations only, the verbs can fill the lexico-syntactic pattern in either order. To address these issues, we esti- null for symmetric relations.</Paragraph>
      <Paragraph position="3"> Here, hits(S) denotes the number of documents containing the string S, as returned by Google. N is the number of words indexed by the search engine</Paragraph>
      <Paragraph position="5"> is a correction factor to obtain the frequency of the verb V in all tenses from the frequency of the pattern to V. Based on several verbs, we have estimated C v = 8.5. Because pattern counts, when instantiated with verbs, could not be estimated directly, we have computed the frequencies of the patterns in a part-of-speech tagged 500M word corpus and used it to estimate the expected number of hits hits est (p) for each pattern.</Paragraph>
      <Paragraph position="6"> We estimated the N with a similar method. We say that the semantic relation S</Paragraph>
    </Section>
    <Section position="3" start_page="21" end_page="21" type="sub_section">
      <SectionTitle>
4.3 Pruning identified semantic relations
</SectionTitle>
      <Paragraph position="0"> Given a pair of semantic relations from the set we identify, one of three cases can arise: (i) one Table 2. Semantic relations and the 35 surface patterns used to identify them. Total number of patterns for that relation is shown in parentheses. In patterns, * matches any single word. Punctuation does not count as words by the search</Paragraph>
      <Paragraph position="2"> not only Xed but Yed not just Xed but Yed</Paragraph>
      <Paragraph position="4"> to X and then Y to X * and then Y Xed and then Yed Xed * and then Yed to X and later Y Xed and later Yed to X and subsequently Y Xed and subsequently Yed to X and eventually Y</Paragraph>
      <Paragraph position="6"> narrow- and broad- similarity overlap in their coverage and are treated as a single category, similarity, when postprocessed. Narrow similarity tests for rare patterns and hits est for it had to be approximated rather than estimated from the smaller corpus.</Paragraph>
      <Paragraph position="7"> relation is more specific (strength is more specific than similarity, enablement is more specific than happens-before), (ii) the relations are compatible (antonymy and happens-before), where presence of one does not imply or rule out presence of the other, and (iii) the relations are incompatible (similarity and antonymy).</Paragraph>
      <Paragraph position="8"> It is not uncommon for our algorithm to identify presence of several relations, with different strengths. To produce the most likely output, we use semantics of compatibility of the relations to output the most likely one(s). The rules are as follows: If the frequency was too low (less than 10 on the pattern X * Y OR Y * X OR X * * Y OR Y * * X), output that the statements are unrelated and stop.</Paragraph>
      <Paragraph position="9"> If happens-before is detected, output presence of happens-before (additional relation may still be output, if detected).</Paragraph>
      <Paragraph position="10"> If happens-before is not detected, ignore detection of enablement (because enablement is more specific than happens-before, but is sometimes falsely detected in the absence of happens-before). If strength is detected, score of similarity is ignored (because strength is more specific than similarity). null Of the relations strength, similarity, opposition and enablement which were detected (and not ignored), output the one with highest S p .</Paragraph>
      <Paragraph position="11"> If nothing has been output to this point, output unrelated.</Paragraph>
    </Section>
    <Section position="4" start_page="21" end_page="21" type="sub_section">
      <SectionTitle>
4.4 Extracting highly associated verb pairs
</SectionTitle>
      <Paragraph position="0"> To exhaustively test the more than 64 million unordered verb pairs for WordNets more than 11,000 verbs would be computationally intractable.</Paragraph>
      <Paragraph position="1"> Instead, we use a set of highly associated verb pairs output by a paraphrasing algorithm called DIRT (Lin and Pantel 2001). Since we are able to test up to 4000 verb pairs per day on a single machine (we issue at most 40 queries per test and each query takes approximately 0.5 seconds), we are able to test several dozen associated verbs for each verb in WordNet in a matter of weeks.</Paragraph>
      <Paragraph position="2"> Lin and Pantel (2001) describe an algorithm called DIRT (Discovery of Inference Rules from Text) that automatically learns paraphrase expressions from text. It is a generalization of previous algorithms that use the distributional hypothesis (Harris 1985) for finding similar words. Instead of applying the hypothesis to words, Lin and Pantel applied it to paths in dependency trees. Essentially, if two paths tend to link the same sets of words, they hypothesized that the meanings of the corresponding paths are similar. It is from paths of the form subject-verb-object that we extract our set of associated verb pairs. Hence, this paper is concerned only with relations between transitive verbs.</Paragraph>
      <Paragraph position="3"> A path, extracted from a parse tree, is an expression that represents a binary relation between two nouns. A set of paraphrases was generated for each pair of associated paths. For example, using a 1.5GB newspaper corpus, here are the 20 most associated paths to X solves Y generated by DIRT: Y is solved by X, X resolves Y, X finds a solution to Y, X tries to solve Y, X deals with Y, Y is resolved by X, X addresses Y, X seeks a solution to Y, X does something about Y, X solution to Y, Y is resolved in X, Y is solved through X, X rectifies Y, X copes with Y, X overcomes Y, X eases Y, X tackles Y, X alleviates Y, X corrects Y, X is a solution to Y, X makes Y worse, X irons out Y This list of associated paths looks tantalizingly close to the kind of axioms that would prove useful in an inference system. However, DIRT only outputs pairs of paths that have some semantic relation. We used these as our set to extract finer-grained relations.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="21" end_page="21" type="metho">
    <SectionTitle>
5 Experimental results
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="21" end_page="21" type="sub_section">
      <SectionTitle>
5.1 Experimental setup
</SectionTitle>
      <Paragraph position="0"> We studied 29,165 pairs of verbs. Applying DIRT to a 1.5GB newspaper corpus  , we extracted 4000 paths that consisted of single verbs in the relation subject-verb-object (i.e. paths of the form X verb Y) whose verbs occurred in at least 150 documents on the Web. For example, from the 20 most associated paths to X solves Y shown in Section 4.4, the following verb pairs were ex-</Paragraph>
    </Section>
    <Section position="2" start_page="21" end_page="21" type="sub_section">
      <SectionTitle>
5.2 Accuracy
</SectionTitle>
      <Paragraph position="0"> We classified each verb pair according to the semantic relations described in Section 2. If the system does not identify any semantic relation for a verb pair, then the system tags the pair as having  The 1.5GB corpus consists of San Jose Mercury, Wall Street Journal and AP Newswire articles from the TREC-9 collection.</Paragraph>
      <Paragraph position="1"> no relation. To evaluate the accuracy of the system, we randomly sampled 100 of these verb pairs, and presented the classifications to two human judges. The adjudicators were asked to judge whether or not the system classification was acceptable (i.e. whether or not the relations output by the system were correct). Since the semantic relations are not disjoint (e.g. mop is both stronger than and similar to sweep), multiple relations may be appropriately acceptable for a given verb pair. The judges were also asked to identify their preferred semantic relations (i.e. those relations which seem most plausible). Table 3 shows five randomly selected pairs along with the judges responses.</Paragraph>
      <Paragraph position="2"> The Appendix shows sample relationships discovered by the system.</Paragraph>
      <Paragraph position="3"> Table 4 shows the accuracy of the system. The baseline system consists of labeling each pair with the most common semantic relation, similarity, which occurs 33 times. The Tags Correct column represents the percentage of verb pairs whose system output relations were deemed correct. The Preferred Tags Correct column gives the percentage of verb pairs whose system output relations matched exactly the humans preferred relations.</Paragraph>
      <Paragraph position="4"> The Kappa statistic (Siegel and Castellan 1988) for the task of judging system tags as correct and incorrect is k = 0.78 whereas the task of identifying the preferred semantic relation has k = 0.72. For the latter task, the two judges agreed on 73 of the 100 semantic relations. 73% gives an idea of an upper bound for humans on this task. On these 73 relations, the system achieved a higher accuracy of 70.0%. The system is allowed to output the happens-before relation in combination with other relations. On the 17 happens-before relations output by the system, 67.6% were judged correct.</Paragraph>
      <Paragraph position="5"> Ignoring the happens-before relations, we achieved a Tags Correct precision of 68%.</Paragraph>
      <Paragraph position="6"> Table 5 shows the accuracy of the system on each of the relations. The stronger-than relation is a subset of the similarity relation. Considering a coarser extraction where stronger-than relations are merged with similarity, the task of judging system tags and the task of identifying the preferred semantic relation both jump to 68.2% accuracy. Also, the overall accuracy of the system climbs to 68.5%.</Paragraph>
      <Paragraph position="7"> As described in Section 2, WordNet contains verb semantic relations. A significant percentage of our discovered relations are not covered by WordNets coarser classifications. Of the 40 verb pairs whose system relation was tagged as correct by both judges in our accuracy experiments and whose tag was not no relation, only 22.5% of them existed in a WordNet relation.</Paragraph>
    </Section>
    <Section position="3" start_page="21" end_page="21" type="sub_section">
      <SectionTitle>
5.3 Discussion
</SectionTitle>
      <Paragraph position="0"> The experience of extracting these semantic relations has clarified certain important challenges.</Paragraph>
      <Paragraph position="1"> While relying on a search engine allows us to query a corpus of nearly a trillion words, some issues arise: (i) the number of instances has to be approximated by the number of hits (documents); (ii) the number of hits for the same query may fluctuate over time; and (iii) some needed counts are not directly available. We addressed the latter issue by approximating these counts using a smaller corpus.</Paragraph>
      <Paragraph position="2">  We do not detect entailment with lexico-syntactic patterns. In fact, we propose that whether the entailment relation holds between V  . For example, given the relation marry happens-before divorce, we can conclude that divorce entails marry. But, given the relation buy happens-before sell, we cannot conclude entailment since manufacture can also happen before sell. This also applies to the enablement and strength relations.</Paragraph>
      <Paragraph position="3"> Corpus-based methods, including ours, hold the promise of wide coverage but are weak on discriminating senses. While we hope that applications will benefit from this resource as is, an interesting next step would be to augment it with sense information.</Paragraph>
    </Section>
  </Section>
  <Section position="7" start_page="21" end_page="21" type="metho">
    <SectionTitle>
6 Future work
</SectionTitle>
    <Paragraph position="0"> There are several ways to improve the accuracy of the current algorithm and to detect relations between low frequency verb pairs. One avenue would be to automatically learn or manually craft more patterns and to extend the pattern vocabulary (when developing the system, we have noticed that different registers and verb types require different patterns). Another possibility would be to use more relaxed patterns when the part of speech confusion is not likely (e.g. eat is a common verb which does not have a noun sense, and patterns need not protect against noun senses when testing such verbs).</Paragraph>
    <Paragraph position="1"> Our approach can potentially be extended to multiword paths. DIRT actually provides two orders of magnitude more relations than the 29,165 single verb relations (subject-verb-object) we extracted. On the same 1GB corpus described in Section 5.1, DIRT extracted over 200K paths and 6M unique paraphrases. These provide an opportunity to create a much larger corpus of semantic relations, or to construct smaller, in-depth resources for selected subdomains. For example, we could extract that take a trip to is similar to travel to, and that board a plane happens before deplane.</Paragraph>
    <Paragraph position="2"> If the entire database is viewed as a graph, we currently leverage and enforce only local consistency. It would be useful to enforce global consistency, e.g. V  Finally, as discussed in Section 5.3, entailment relations may be derivable by processing the complete graph of the identified semantic relation.</Paragraph>
  </Section>
class="xml-element"></Paper>