<?xml version="1.0" standalone="yes"?>
<Paper uid="N06-1007">
  <Title>Acquisition of Verb Entailment from Text</Title>
  <Section position="4" start_page="49" end_page="50" type="metho">
    <SectionTitle>
3 Verb Entailment
</SectionTitle>
    <Paragraph position="0"> Verb entailment relations have been traditionally attracting a lot of interest from lexical semantics research and their various typologies have been proposed (see, e.g., (Fellbaum, 1998)). In this study, with the view of potential practical applications, we adopt an operational definition of entailment. We define it to be a semantic relation between verbs where one verb, termed premise P, refers to event Ep and at the same time implies event Eq, typically denoted by the other verb, termed consequence Q.</Paragraph>
    <Paragraph position="1"> The goal of verb entailment acquisition is then to find two linguistic templates each consisting of a verb and slots for its syntactic arguments. In the pair, (1) the verbs are related in accordance with our definition of entailment above, (2) there is a mapping between the slots of the two templates and  (3) the direction of entailment is indicated explic null itly. For example, in the template pair &amp;quot;buy(obj:X) = belong(subj:X)&amp;quot; the operator = specifies that the premise buy entails the consequence belong, and X indicates a mapping between the object of buy and the subject of belong, as in The company bought shares. - The shares belong to the company.</Paragraph>
    <Paragraph position="2"> As opposed to logical entailment, we do not require that verb entailment holds in all conceivable contexts and view it as a relation that may be more plausible in some contexts than others. For each verb pair, we therefore wish to assign a score quantifying the likelihood of its satisfying entailment in some random context.</Paragraph>
  </Section>
  <Section position="5" start_page="50" end_page="51" type="metho">
    <SectionTitle>
4 Approach
</SectionTitle>
    <Paragraph position="0"> The key assumption behind our approach is that the ability of a verb to imply an event typically denoted by a different verb manifests itself in the regular co-occurrence of the two verbs inside locally coherent text. This assumption is not arbitrary: as discourse investigations show (Asher and Lascarides, 2003), (Hobbs, 1985), lexical entailment plays an important role in determining the local structure of discourse. We expect this co-occurrence regularity to be equally characteristic of any pair of verbs related by entailment, regardless of is type and the syntactic behavior of verbs.</Paragraph>
    <Paragraph position="1"> The method consists of three major steps. First, it identifies pairs of clauses that are related in the local discourse. From related clauses, it then creates templates by extracting pairs of verbs along with relevant information as to their syntactic behavior. Third, the method scores each verb pair in terms of plausibility of entailment by measuring how strongly the premise signals the appearance of the consequence inside the text segment at hand. In the following sections, we describe these steps in more detail.</Paragraph>
    <Section position="1" start_page="50" end_page="50" type="sub_section">
      <SectionTitle>
4.1 Identifying discourse-related clauses
</SectionTitle>
      <Paragraph position="0"> We attempt to capture local discourse relatedness between clauses by a combination of several surface cues. In doing so, we do not build a full discourse representation of text, nor do we try to identify the type of particular rhetorical relations between sentences, but rather identify pairs of clauses that are likely to be discourse-related.</Paragraph>
      <Paragraph position="1"> Textual proximity. We start by parsing the corpus with a dependency parser (we use Connexor's FDG (Tapanainen and J&amp;quot;arvinen, 1997)), treating every verb with its dependent constituents as a clause. For two clauses to be discourse-related, we require that they appear close to each other in the text. Adjacency of sentences has been previously used to model local coherence (Lapata, 2003). To capture related clauses within larger text fragments, we experiment with windows of text of various sizes around a clause.</Paragraph>
      <Paragraph position="2"> Paragraph boundaries. Since locally related sentences tend to be grouped into paragraphs, we further require that the two clauses appear within the same paragraph.</Paragraph>
      <Paragraph position="3"> Common event participant. Entity-based theories of discourse (e.g., (Grosz et al., 1995)) claim that a coherent text segment tends to focus on a specific entity. This intuition has been formalized by (Barzilay and Lapata, 2005), who developed an entity-based statistical representation of local discourse and showed its usefulness for estimating coherence between sentences. We also impose this as a criterion for two clauses to be discourse-related: their arguments need to refer to the same participant, henceforth, anchor. We identify the anchor as the same noun lemma appearing as an argument to the verbs in both clauses, considering only subject, object, and prepositional object arguments. The anchor must not be a pronoun, since identical pronouns may refer to different entities and making use of such correspondences is likely to introduce noise.</Paragraph>
    </Section>
    <Section position="2" start_page="50" end_page="51" type="sub_section">
      <SectionTitle>
4.2 Creating templates
</SectionTitle>
      <Paragraph position="0"> Once relevant clauses have been identified, we create pairs of syntactic templates, each consisting of a verb and the label specifying the syntactic role the anchor occupies near the verb. For example, given a pair of clauses Mary bought a house. and The house belongs to Mary., the method will extract two pairs of templates: {buy(obj:X), belong(subj:X)} and {buy(subj:X), belong(to:X).} Before templates are constructed, we automatically convert complex sentence parses to simpler, but semantically equivalent ones so as to increase the amount of usable data and reduce noise: * Passive constructions are turned into active  turned into predicate structures: the group led by A - A leads the group; the group leading the market - the group leads the market.</Paragraph>
      <Paragraph position="1"> The output of this step is V [?] P xQ, a set of pairs of templates {p,q}, where p [?] P is the premise, consisting of the verb vp and rp - the syntactic relation between vp and the anchor, and q [?] Q is the consequence, consisting of the verb vq and rq - its syntactic relation to the anchor.</Paragraph>
    </Section>
    <Section position="3" start_page="51" end_page="51" type="sub_section">
      <SectionTitle>
4.3 Measuring asymmetric association
</SectionTitle>
      <Paragraph position="0"> To score the pairs for asymmetric association, we use a procedure similar to the method by (Resnik, 1993) for learning selectional preferences of verbs.</Paragraph>
      <Paragraph position="1"> Each template in a pair is tried as both a premise and a consequence. We quantify the 'preference' of the premise p for the consequence q as the contribution of q to the amount of information p contains about its consequences seen in the data. First, we calculate Kullback-Leibler Divergence (Cover.</Paragraph>
      <Paragraph position="2"> and Thomas, 1991) between two probability distributions, u - the prior distribution of all consequences in the data and w - their posterior distribution given p, thus measuring the information p contains about its consequences:</Paragraph>
      <Paragraph position="4"> over all consequences in the data. Then, the score for template {p,q} expressing the association of q with p is calculated as the proportion of q's contribution</Paragraph>
      <Paragraph position="6"> In each pair we compare the scores in both directions, taking the direction with the greater score to indicate the most likely premise and consequence and thus the direction of entailment.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="51" end_page="52" type="metho">
    <SectionTitle>
5 Evaluation Design
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="51" end_page="51" type="sub_section">
      <SectionTitle>
5.1 Task
</SectionTitle>
      <Paragraph position="0"> To evaluate the algorithm, we designed a recognition task similar to that of pseudo-word disambiguation (Sch&amp;quot;utze, 1992), (Dagan et al., 1999). The task was, given a certain premise, to select its correct consequence out of a pool with several artificially created incorrect alternatives.</Paragraph>
      <Paragraph position="1"> The advantages of this evaluation technique are twofold. On the one hand, the task mimics many possible practical applications of the entailment resource, such as sentence ordering, where, given a sentence, it is necessary to identify among several alternatives another sentence that either entails or is entailed by the given sentence. On the other hand, in comparison with manual evaluation of the direct output of the system, it requires minimal human involvement and makes it possible to conduct large-scale experiments.</Paragraph>
    </Section>
    <Section position="2" start_page="51" end_page="52" type="sub_section">
      <SectionTitle>
5.2 Data
</SectionTitle>
      <Paragraph position="0"> The experimental material was created from the BLLIP corpus, a collection of texts from the Wall Street Journal (years 1987-89). We chose 15 transitive verbs with the greatest corpus frequency and used a pilot run of our method to extract 1000 highest-scoring template pairs involving these verbs as a premise. From them, we manually selected 129 template pairs that satisfied entailment.</Paragraph>
      <Paragraph position="1"> For each of the 129 template pairs, four false consequences were created. This was done by randomly picking verbs with frequency comparable to that of the verb of the correct consequence. A list of parsed clauses from the BLLIP corpus was consulted to select the most typical syntactic configuration of each of the four false verbs. The resulting five template pairs, presented in a random order, constituted a test item. Figure 1 illustrates such a test item.</Paragraph>
      <Paragraph position="2"> The entailment acquisition method was evaluated on entailment templates acquired from the British National Corpus. Even though the two corpora are quite different in style, we assume that the evaluation allows conclusions to be drawn as to the relative quality of performance of the methods under consideration. null  plate pair with the correct consequence is marked by an asterisk.</Paragraph>
    </Section>
    <Section position="3" start_page="52" end_page="52" type="sub_section">
      <SectionTitle>
5.3 Recognition algorithm
</SectionTitle>
      <Paragraph position="0"> During evaluation, we tested the ability of the method to select the correct consequence among the five alternatives. Our entailment acquisition method generates association scores for one-slot templates.</Paragraph>
      <Paragraph position="1"> In order to score the double-slot templates in the evaluation material, we used the following procedure. null Given a double-slot template, we divide it into two single-slot ones such that matching arguments of the two verbs along with the verbs themselves constitute a separate template. For example, &amp;quot;buy (subj:X, obj:Y) = own (subj:X, obj:Y)&amp;quot; will be decomposed into &amp;quot;buy (subj:X) = own (subj:X)&amp;quot; and &amp;quot;buy (obj:Y) = own (obj:Y)&amp;quot;. The scores of these two templates are then looked up in the generated database and averaged. In each test item, the five alternatives are scored in this manner and the one with the highest score was chosen as containing the correct consequence.</Paragraph>
      <Paragraph position="2"> The performance was measured in terms of accuracy, i.e. as the ratio of correct choices to the total number of test items. Ties, i.e. cases when the correct consequence was assigned the same score as one or more incorrect ones, contributed to the final accuracy measure proportionate to the number of tying alternatives.</Paragraph>
      <Paragraph position="3"> This experimental design corresponds to a random baseline of 0.2, i.e. the expected accuracy when selecting a consequence template randomly out of 5 alternatives.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>