<?xml version="1.0" standalone="yes"?> <Paper uid="N06-1005"> <Title>Effectively Using Syntax for Recognizing False Entailment</Title> <Section position="3" start_page="33" end_page="34" type="metho"> <SectionTitle> 2 System Description </SectionTitle> <Paragraph position="0"> Similar to most other syntax-based approaches to recognizing textual entailment, we begin by representing each text and hypothesis sentence pair in logical forms. These logical forms are generated using NLPWIN3, a robust system for natural language parsing and generation (Heidorn, 2000).</Paragraph> <Paragraph position="1"> Our logical form representation may be considered equivalently as a set of triples of the form RELATION(nodei,nodej), or as a graph of syntactic dependencies; we use both terminologies interchangeably. Our algorithm proceeds as follows: 1. Parse each sentence with the NLPWIN parser, resulting in syntactic dependency graphs for the text and hypothesis sentences.</Paragraph> <Paragraph position="2"> 2. Attempt an alignment of each content node in the dependency graph of the hypothesis sentence to some node in the graph of the text sentence, using a set of heuristics for alignment (described in Section 3).</Paragraph> <Paragraph position="3"> 3. Using the alignment, apply a set of syntactic heuristics for recognizing false entailment (described in Section 4); if any match, predict that the entailment is false.</Paragraph> <Paragraph position="4"> 2(Vanderwende and Dolan, 2006) suggest that the truth or falsehood of 48% of the entailment examples in the RTE test set could be correctly identified via syntax and a thesaurus alone; thus by random guessing on the rest of the examples one might hope for an accuracy level of 0.48+ 0.522 = 74%.</Paragraph> <Paragraph position="5"> the sentence &quot;Six hostages in Iraq were freed.&quot; 4. If no syntactic heuristic matches, back off to a lexical similarity model (described in section 5.1), with an attempt to align detected paraphrases (described in section 5.2).</Paragraph> <Paragraph position="6"> In addition to the typical syntactic information provided by a dependency parser, the NLPWIN parser provides an extensive number of semantic features obtained from various linguistic resources, creating a rich environment for feature engineering. For example, Figure 1 (from Dev Ex. #616) illustrates the dependency graph representation we use, demonstrating the stemming, part-of-speech tagging, syntactic relationship identification, and semantic feature tagging capabilities of NLPWIN.</Paragraph> <Paragraph position="7"> We define a content node to be any node whose lemma is not on a small stoplist of common stop words. In addition to content vs. non-content nodes, among content nodes we distinguish between entities and nonentities: an entity node is any node classified by the NLPWIN parser as being a proper noun, quantity, or time.</Paragraph> <Paragraph position="8"> Each of the features of our system were developed from inspection of sentence pairs from the RTE development data set, and used in the final system only if they improved the system's accuracy on the development set (or improved F-score if accuracy was unchanged); sentence pairs in the RTE test set were left uninspected and used for testing purposes only. 3 Linguistic cues for node alignment Our syntactic heuristics for recognizing false entailment rely heavily on the correct alignment of words and multiword units between the text and hypothesis logical forms. 
<SectionTitle> 3 Linguistic cues for node alignment </SectionTitle> <Paragraph position="9"> Our syntactic heuristics for recognizing false entailment rely heavily on the correct alignment of words and multiword units between the text and hypothesis logical forms. In the notation below, we will consider h and t to be nodes in the hypothesis H and text T logical forms, respectively. [Figure 2: Derivational form alignment heuristics, Dev Ex. #767. Hypothesis: 'Hepburn, who won four Oscars..' Text: 'Hepburn, a four-time Academy Award winner..'] To accomplish the task of node alignment we rely on the following heuristics:</Paragraph> <Section position="1" start_page="34" end_page="34" type="sub_section"> <SectionTitle> 3.1 WordNet synonym match </SectionTitle> <Paragraph position="0"> As in (Herrera et al., 2005) and others, we align a node h ∈ H to any node t ∈ T that has both the same part of speech and belongs to the same synset in WordNet. Our alignment considers multiword units, including compound nouns (e.g., we align &quot;Oscar&quot; to &quot;Academy Award&quot; as in Figure 2), as well as verb-particle constructions such as &quot;set off&quot; (aligned to &quot;trigger&quot; in Test Ex. #1983).</Paragraph> </Section> <Section position="2" start_page="34" end_page="34" type="sub_section"> <SectionTitle> 3.2 Numeric value match </SectionTitle> <Paragraph position="0"> The NLPWIN parser assigns a normalized numeric value feature to each piece of text inferred to correspond to a numeric value; this allows us to align &quot;6th&quot; to &quot;sixth&quot; in Test Ex. #1175, and to align &quot;a dozen&quot; to &quot;twelve&quot; in Test Ex. #1231.</Paragraph> </Section> <Section position="3" start_page="34" end_page="34" type="sub_section"> <SectionTitle> 3.3 Acronym match </SectionTitle> <Paragraph position="0"> Many acronyms are recognized using the synonym match described above; nonetheless, many acronyms are not yet in WordNet. For these cases we have a specialized acronym match heuristic which aligns pairs of nodes with the following properties: if the lemma for some node h consists only of capitalized letters (with possible interceding periods), and the letters correspond to the first characters of some multiword lemma for some t ∈ T, then we consider h and t to be aligned. This heuristic allows us to align &quot;UNDP&quot; to &quot;United Nations Development Programme&quot; in Dev Ex. #357 and &quot;ANC&quot; to &quot;African National Congress&quot; in Test Ex. #1300.</Paragraph> </Section> <Section position="4" start_page="34" end_page="34" type="sub_section"> <SectionTitle> 3.4 Derivational form match </SectionTitle> <Paragraph position="0"> We would like to align words which have the same root form (or have a synonym with the same root form) and which possess similar semantic meaning, but which may belong to different syntactic categories. We accomplish this using a combination of the synonym and derivationally-related form information contained within WordNet. Explicitly, our procedure for constructing the set of derivationally-related forms for a node h is to take the union of all derivationally-related forms of all the synonyms of h (including h itself), i.e.:</Paragraph> <Paragraph position="1"> DERIV(h) = the union of WN-DERIV(s) over all s ∈ SYNS(h) ∪ {h} </Paragraph> <Paragraph position="2"> In addition to the noun/verb derivationally-related forms, we detect adjective/adverb derivationally-related forms that differ only by the suffix 'ly'. Unlike the previous alignment heuristics, we do not expect that two nodes aligned via derivationally-related forms will play the same syntactic role in their respective sentences. Thus we consider two nodes aligned in this way to be only soft-aligned, and we do not attempt to apply our false entailment recognition heuristics to such nodes.</Paragraph>
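As an illustration of the derivational-form construction just described, the sketch below uses NLTK's WordNet interface as a stand-in for the paper's WordNet access; the function names are our own, not the paper's.

```python
from nltk.corpus import wordnet as wn

def derivational_forms(lemma: str, pos=None) -> set:
    """DERIV(h): the union of derivationally related forms over all
    WordNet synonyms of `lemma`, including `lemma` itself."""
    forms = {lemma}
    for synset in wn.synsets(lemma, pos=pos):
        for syn in synset.lemmas():
            forms.add(syn.name())
            forms.update(d.name() for d in syn.derivationally_related_forms())
    return forms

def ly_related(adjective: str, adverb: str) -> bool:
    """Adjective/adverb pair differing only by the suffix 'ly'."""
    return adverb == adjective + "ly"
```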
</Section> <Section position="5" start_page="34" end_page="34" type="sub_section"> <SectionTitle> 3.5 Country adjectival form / demonym match </SectionTitle> <Paragraph position="0"> As a special case of derivational form match, we soft-align matches from an explicit list of place names, adjectival forms, and demonyms; e.g., &quot;Sweden&quot; and &quot;Swedish&quot; in Test Ex. #1576.</Paragraph> </Section> <Section position="6" start_page="34" end_page="34" type="sub_section"> <SectionTitle> 3.6 Other heuristics for alignment </SectionTitle> <Paragraph position="0"> In addition to these heuristics, we implemented a hyponym match heuristic similar to that discussed in (Herrera et al., 2005), and a heuristic based on the string-edit distance of two lemmas; however, these heuristics yielded a decrease in our system's accuracy on the development set and were thus left out of our final system.</Paragraph> </Section> </Section> <Section position="4" start_page="34" end_page="38" type="metho"> <SectionTitle> 4 Recognizing false entailment </SectionTitle> <Paragraph position="0"> The bulk of our system focuses on heuristics for recognizing false entailment. For purposes of notation, we define binary functions for the existence of each semantic node feature recognized by NLPWIN; e.g., if h is negated, we state that NEG(h) = TRUE. Similarly, we assign binary functions for the existence of each syntactic relation defined over pairs of nodes. Finally, we define the function ALIGN(h,t) to be true if and only if the node h ∈ H has been 'hard-aligned' to the node t ∈ T using one of the heuristics in Section 3. Other notation is defined in the text as it is used. Table 1 summarizes all heuristics used in our final system to recognize false entailment.</Paragraph> <Section position="1" start_page="35" end_page="35" type="sub_section"> <SectionTitle> 4.1 Unaligned entity </SectionTitle> <Paragraph position="0"> If some node h has been recognized as an entity (i.e., as a proper noun, quantity, or time) but has not been aligned to any node t, we predict that the entailment is false. For example, we predict that Test Ex. #1863 is false because the entities &quot;Suwariya&quot;, &quot;20 miles&quot;, and &quot;35&quot; in H are unaligned.</Paragraph> </Section> <Section position="2" start_page="35" end_page="35" type="sub_section"> <SectionTitle> 4.2 Negation mismatch </SectionTitle> <Paragraph position="0"> If any two nodes (h,t) are aligned, and one (and only one) of them is negated, we predict that the entailment is false. Negation is conveyed by the NEG feature in NLPWIN. This heuristic allows us to predict false entailment between &quot;Pertussis is not very contagious&quot; and &quot;...pertussis, is a highly contagious bacterial infection&quot; in Test Ex. #1144.</Paragraph> </Section> <Section position="3" start_page="35" end_page="35" type="sub_section"> <SectionTitle> 4.3 Modal auxiliary verb mismatch </SectionTitle> <Paragraph position="0"> If any two nodes (h,t) are aligned, and t is modified by a modal auxiliary verb (e.g., can, might, should, etc.) but h is not similarly modified, we predict that the entailment is false. Modification by a modal auxiliary verb is conveyed by the MOD feature in NLPWIN. This heuristic allows us to predict false entailment between the text phrase &quot;would constitute a threat to democracy&quot; and the hypothesis phrase &quot;constitutes a democratic threat&quot; in Test Ex. #1203.</Paragraph> </Section>
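A minimal sketch of the negation and modal mismatch checks from Sections 4.2 and 4.3, assuming aligned node pairs whose NLPWIN NEG and MOD features are exposed as a simple feature set; the Node class here is our own illustration, not the paper's data structure.

```python
from dataclasses import dataclass

@dataclass
class Node:
    lemma: str
    feats: frozenset = frozenset()   # e.g. frozenset({"NEG"}) for a negated node

def negation_mismatch(aligned_pairs) -> bool:
    # Predict false entailment if exactly one node of some aligned pair
    # carries the NEG feature (Section 4.2).
    return any(("NEG" in h.feats) != ("NEG" in t.feats) for h, t in aligned_pairs)

def modal_mismatch(aligned_pairs) -> bool:
    # Predict false entailment if t carries a modal auxiliary (MOD)
    # but the aligned h does not (Section 4.3).
    return any("MOD" in t.feats and "MOD" not in h.feats for h, t in aligned_pairs)

# The Test Ex. #1144 pattern: "not very contagious" vs. "highly contagious".
pairs = [(Node("contagious", frozenset({"NEG"})), Node("contagious"))]
assert negation_mismatch(pairs)
```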
<Section position="4" start_page="35" end_page="35" type="sub_section"> <SectionTitle> 4.4 Antonym match </SectionTitle> <Paragraph position="0"> If two aligned noun nodes (h1,t1) are both subjects or both objects of verb nodes (h0,t0) in their respective sentences, i.e., REL(h0,h1) ∧ REL(t0,t1) for some REL ∈ {SUBJ, OBJ}, then we check for a verb antonym match between (h0,t0). We construct the set of verb antonyms using WordNet; we consider the antonyms of h0 to be the union of the antonyms of the first three senses of LEMMA(h0), or of the nearest antonym-possessing hypernyms if those senses do not themselves have antonyms in WordNet. Explicitly, our procedure for constructing the antonym set of a node h0 is as follows: 1. ANTONYMS(h0) = {} 2. For each of the first three listed senses s of LEMMA(h0) in WordNet: (a) While |WN-ANTONYMS(s)| = 0: s ← WN-HYPERNYM(s) (b) ANTONYMS(h0) ← ANTONYMS(h0) ∪ WN-ANTONYMS(s) 3. Return ANTONYMS(h0) In addition to the verb antonyms in WordNet, we detect the prepositional antonym pairs (before/after, to/from, and over/under). This heuristic allows us to predict false entailment between &quot;Black holes can lose mass...&quot; and &quot;Black holes can regain some of their mass...&quot; in Test Ex. #1445.</Paragraph> </Section> <Section position="5" start_page="35" end_page="36" type="sub_section"> <SectionTitle> 4.5 Argument movement </SectionTitle> <Paragraph position="0"> For any two aligned verb nodes (h1,t1), we consider each noun child h2 of h1 possessing any of the subject, object, or indirect object relations to h1, i.e., such that REL(h1,h2) holds for some REL ∈ {SUBJ, OBJ, IND}. If there is some node t2 such that ALIGN(h2,t2) but REL(t1,t2) ≠ REL(h1,h2), then we predict that the entailment is false.</Paragraph> <Paragraph position="1"> As an example, consider Figure 3, representing subgraphs from Dev Ex. #1916: T: ...U.N. officials are also dismayed that Aristide killed a conference called by Prime Minister Robert Malval... H: Aristide kills Prime Minister Robert Malval.</Paragraph> <Paragraph position="2"> Here let (h1,t1) correspond to the aligned verbs with lemma kill, where the object h2 of h1 has lemma Prime Minister Robert Malval, and the object of t1 has lemma conference. Since h2 is aligned to some node t2 in the text graph, but ¬OBJ(t1,t2), the sentence pair is rejected as a false entailment.</Paragraph> </Section> <Section position="6" start_page="36" end_page="36" type="sub_section"> <SectionTitle> 4.6 Superlative mismatch </SectionTitle> <Paragraph position="0"> If some adjective node h1 in the hypothesis is identified as a superlative, we check that all of the following conditions are satisfied: 1. h1 is aligned to some superlative t1 in the text sentence.</Paragraph> <Paragraph position="1"> 2. The noun phrase h2 modified by h1 is aligned to the noun phrase t2 modified by t1.</Paragraph> <Paragraph position="2"> 3. Any additional modifier t3 of the noun phrase t2 is aligned to some modifier h3 of h2 in the hypothesis sentence (reverse subset match). If any of these conditions is not satisfied, we predict that the entailment is false. This heuristic allows us to predict false entailment in (Dev Ex. #908): T: Time Warner is the world's largest media and Internet company. H: Time Warner is the world's largest company. Here &quot;largest media and Internet company&quot; in T fails the reverse subset match (condition 3) to &quot;largest company&quot; in H.</Paragraph> </Section>
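Returning to the antonym-set procedure of Section 4.4 above: the sketch below implements the same hypernym-climbing loop using NLTK's WordNet interface as a stand-in for the paper's WordNet access, with the prepositional antonym pairs added as a fixed table.

```python
from nltk.corpus import wordnet as wn

PREP_ANTONYMS = {("before", "after"), ("to", "from"), ("over", "under")}

def antonym_set(lemma: str, n_senses: int = 3) -> set:
    """ANTONYMS(h0): antonyms of the first `n_senses` verb senses of
    `lemma`; a sense with no antonyms is replaced by its nearest
    antonym-possessing hypernym, as in the Section 4.4 procedure."""
    result = set()
    for sense in wn.synsets(lemma, pos=wn.VERB)[:n_senses]:
        synset = sense
        while synset is not None:
            ants = {a.name() for lem in synset.lemmas() for a in lem.antonyms()}
            if ants:                      # found WN-ANTONYMS(s)
                result |= ants
                break
            hypers = synset.hypernyms()   # climb: s <- WN-HYPERNYM(s)
            synset = hypers[0] if hypers else None
    return result
```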
<Section position="7" start_page="36" end_page="36" type="sub_section"> <SectionTitle> 4.7 Conditional mismatch </SectionTitle> <Paragraph position="0"> For any pair of aligned nodes (h1,t1), if there exists a second pair of aligned nodes (h2,t2) such that the shortest path PATH(t1,t2) in the dependency graph T contains the conditional relation, then PATH(h1,h2) must also contain the conditional relation, or else we predict that the entailment is false. For example, consider the following false entailment (Dev Ex. #60): T: If a Mexican approaches the border, he's assumed to be trying to illegally cross.</Paragraph> <Paragraph position="1"> H: Mexicans continue to illegally cross border.</Paragraph> <Paragraph position="2"> Here, &quot;Mexican&quot; and &quot;cross&quot; are aligned, and the path between them in the text contains the conditional relation, but the path in the hypothesis does not; thus the entailment is predicted to be false.</Paragraph> </Section> <Section position="8" start_page="36" end_page="36" type="sub_section"> <SectionTitle> 4.8 Other heuristics for false entailment </SectionTitle> <Paragraph position="0"> In addition to these heuristics, we also implemented an IS-A mismatch heuristic, which attempted to discover when an IS-A relation in the hy-</Paragraph> </Section> <Section position="9" start_page="36" end_page="37" type="sub_section"> <SectionTitle> 5.1 Lexical similarity using MindNet </SectionTitle> <Paragraph position="0"> In case none of the preceding heuristics for rejection are applicable, we back off to a lexical similarity model similar to that described in (Glickman et al., 2005). For every content node h ∈ H not already aligned by one of the heuristics in Section 3, we obtain a similarity score MN(h,t) from a similarity database that is constructed automatically from the data contained in MindNet as described in (Richardson, 1997). Our similarity function is thus:</Paragraph> <Paragraph position="1"> sim(h,t) = max(MN(h,t), min) </Paragraph> <Paragraph position="2"> where the minimum score min is a parameter tuned for maximum accuracy on the development set; min = 0.00002 in our final system. We then compute the entailment score:</Paragraph> <Paragraph position="3"> score(T,H) = ( ∏_{h ∈ H} max_{t ∈ T} sim(h,t) )^(1/|H|) </Paragraph> <Paragraph position="4"> This approach is identical to that used in (Glickman et al., 2005), except that we use alignment heuristics and MindNet similarity scores in place of their web-based estimation of lexical entailment probabilities, and we take as our score the geometric mean of the component entailment scores rather than the unnormalized product of probabilities.</Paragraph> </Section>
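A sketch of this backoff computation under the formulas above (the similarity floor and the geometric-mean score are our reconstruction of equations lost in extraction, read off the surrounding prose); the MindNet-derived database is stood in for by a plain dictionary of pair scores.

```python
import math

MIN_SIM = 0.00002   # `min`, tuned on the development set (Section 5.1)

def sim(h: str, t: str, mn: dict) -> float:
    # MindNet-derived similarity floored at MIN_SIM (assumed reconstruction).
    return max(mn.get((h, t), 0.0), MIN_SIM)

def entailment_score(h_nodes, t_nodes, mn: dict, aligned: set) -> float:
    """Geometric mean, over hypothesis content nodes, of the best
    similarity to any text node; pairs hard-aligned by the Section 3
    heuristics contribute a full score of 1."""
    parts = [1.0 if any((h, t) in aligned for t in t_nodes)
             else max(sim(h, t, mn) for t in t_nodes)
             for h in h_nodes]
    return math.exp(sum(math.log(p) for p in parts) / len(parts))
```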
<Section position="10" start_page="37" end_page="38" type="sub_section"> <SectionTitle> 5.2 Measuring phrasal similarity using the web </SectionTitle> <Paragraph position="0"> The methods discussed so far for alignment are limited to aligning pairs of single words or multiple-word units constituting single syntactic categories; these are insufficient for the problem of detecting more complicated paraphrases. For example, consider the following true entailment (Dev Ex. #496): T: ...Muslims believe there is only one God.</Paragraph> <Paragraph position="1"> H: Muslims are monotheistic.</Paragraph> <Paragraph position="2"> Here we would like to align the hypothesis phrase &quot;are monotheistic&quot; to the text phrase &quot;believe there is only one God&quot;; unfortunately, single-node alignment aligns only the nodes with lemma &quot;Muslim&quot;. In this section we describe the approach used in our system to approximate phrasal similarity via distributional information obtained using the MSN Search search engine.</Paragraph> <Paragraph position="3"> We propose a metric for measuring phrasal similarity based on a phrasal version of the distributional hypothesis: a phrase template Ph (e.g., 'xh are monotheistic') has high semantic similarity to a template Pt (e.g., 'xt believe there is only one God'), with possible &quot;slot-fillers&quot; xh and xt, respectively, if the overlap Xh ∩ Xt of the sets of observed slot-fillers for those phrase templates is high in some sufficiently large corpus (e.g., the Web).</Paragraph> <Paragraph position="4"> To measure phrasal similarity we issue the surface text form of each candidate phrase template as a query to a web-based search engine, and parse the returned sentences in which the candidate phrase occurs to determine the appropriate slot-fillers. For example, in the above example, we observe the set of slot-fillers Xt = {Muslims, Christians, Jews, Saivites, Sikhs, Caodaists, People}, and Xh ∩ Xt = {Muslims, Christians, Jews, Sikhs, People}.</Paragraph> <Paragraph position="5"> Explicitly, given the text and hypothesis logical forms, our algorithm proceeds as follows to compute the phrasal similarity between all phrase templates in H and T: 1. For each pair (t1,t2) consisting of an aligned single node and an unaligned leaf node, or of two aligned nodes, in the text T: (a) Use NLPWIN to generate a surface text string S from the underlying logical form PATH(t1,t2).</Paragraph> <Paragraph position="6"> (b) Create the surface string template phrase Pt by removing from S the lemmas corresponding to t1 (and t2, if the path is between aligned nodes).</Paragraph> <Paragraph position="7"> (c) Perform a web search for the string Pt.</Paragraph> <Paragraph position="8"> (d) Parse the resulting sentences containing Pt and extract all non-pronoun slot fillers xt ∈ Xt that satisfy the same syntactic roles as t1 in the original sentence.</Paragraph> <Paragraph position="9"> 2. Similarly, extract the slot fillers Xh for each discovered phrase template Ph in H.</Paragraph> <Paragraph position="10"> 3. Calculate paraphrase similarity as a function of the overlap between the slot-filler sets Xt and Xh, i.e., score(Ph,Pt) = |Xh ∩ Xt| / |Xt|.</Paragraph> <Paragraph position="11"> We then incorporate paraphrase similarity within the lexical similarity model by allowing, for some unaligned node h ∈ Ph, where t ∈ Pt:</Paragraph> <Paragraph position="12"> sim(h,t) = score(Ph,Pt) </Paragraph>
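A sketch of the slot-filler overlap score, checked against the numbers from Dev Ex. #496; in the paper the sets are harvested from parsed web search results, so the literal sets below are only the fillers quoted above.

```python
def phrase_similarity(x_h: set, x_t: set) -> float:
    """score(Ph, Pt) = |Xh intersect Xt| / |Xt|"""
    return len(x_h & x_t) / len(x_t) if x_t else 0.0

# Dev Ex. #496: |Xt| = 7; the five fillers below are exactly Xh intersect Xt,
# so the score is 5/7, roughly 0.71.
x_t = {"Muslims", "Christians", "Jews", "Saivites", "Sikhs", "Caodaists", "People"}
x_h = {"Muslims", "Christians", "Jews", "Sikhs", "People"}   # overlap only
print(phrase_similarity(x_h, x_t))   # 0.7142...
```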
<Paragraph position="13"> Our approach to paraphrase detection is most similar to the TE/ASE algorithm (Szpektor et al., 2004), and bears similarity to both DIRT (Lin and Pantel, 2001) and KnowItAll (Etzioni et al., 2004). The chief difference in our algorithm is that we generate the surface text search strings from the parsed logical forms using the generation capabilities of NLPWIN (Aikawa et al., 2001), and we verify that the syntactic relations in each discovered web snippet are isomorphic to those in the original candidate paraphrase template.</Paragraph> </Section> </Section> </Paper>