<?xml version="1.0" standalone="yes"?> <Paper uid="C02-1042"> <Title>Using Knowledge to Facilitate Factoid Answer Pinpointing</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2. Webclopedia Architecture </SectionTitle> <Paragraph position="0"> As shown in Figure 1, Webclopedia adopts the Use-Knowledge architecture. Its modules are described in more detail in (Hovy et al., 2001; Hovy et al., 1999): * Question parsing: Using BBN's IdentiFinder (Bikel et al., 1999), the CONTEX parser (Hermjakob, 1997) produces a syntactic-semantic analysis of the question and determines the QA type.</Paragraph> <Paragraph position="1"> * Query formation: Single- and multi-word units (content words) are extracted from the analysis, and WordNet synsets (Fellbaum, 1998) are used for query expansion. A series of Boolean queries of decreasing specificity is formed.</Paragraph> <Paragraph position="2"> * IR: The publicly available IR engine MG (Witten et al., 1994) returns the top-ranked N documents.</Paragraph> <Paragraph position="3"> * Selecting and ranking sentences: For each document, the most promising K sentences are located and scored using a formula that rewards word and phrase overlap with the question and its expanded query words. Results are ranked.</Paragraph> <Paragraph position="4"> * Parsing candidates: CONTEX parses the top-ranked 300 sentences.</Paragraph> <Paragraph position="5"> * Pinpointing: As described in Section 3, a number of knowledge resources are used to perform filtering/pinpointing operations. * Ranking of answers: The candidate answers' scores are compared and the winner(s) are output.</Paragraph> <Paragraph position="6"> 3. Knowledge Used for Pinpointing</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.1 Type 1: Question Word Matching </SectionTitle> <Paragraph position="0"> Unlike (Prager et al., 1999), we do not first annotate the source corpus, but perform IR directly on the source text, using MG (Witten et al., 1994). To determine goodness, we assign an initial base score to each retrieved sentence. We then compare the sentence to the question and adapt this score as follows: * exact matches of proper names double the base score.</Paragraph> <Paragraph position="1"> * matching an upper-cased term adds a 60% bonus of the base score for multi-words terms and 30% for single words (matching &quot;United States&quot; is better than just &quot;United&quot;). * matching a WordNet synonym of a term discounts by 10% (lower case) and 50% (upper case). (When &quot;Cage&quot; matches &quot;cage&quot;, the former may be the last name of a person and the latter an object; the case mismatch signals less reliability.) 
* lower-case term matches after Porter stemming are discounted 30%; upper-case matches 70% (Porter stemming is more aggressive than WordNet stemming).</Paragraph> <Paragraph position="2"> * Porter stemmer matches of both question and sentence words with lower case are discounted 60%; with upper case, 80%.</Paragraph> <Paragraph position="3"> * if CONTEX marks a term as qsubsumed (see Section 3.9), the term is discounted 90% (in &quot;Which country manufactures weapons of mass destruction?&quot;, &quot;country&quot; will be marked as qsubsumed).</Paragraph> <Paragraph position="4"> The top-scoring 300 sentences are passed on for further filtering.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.2 Type 2: Qtargets, the QA Typology, and the Semantic Ontology </SectionTitle> <Paragraph position="0"> We classify desired answers by their semantic type; these types have been taxonomized in the QA Typology (http://www.isi.edu/natural-language/projects/webclopedia/Taxonomy/taxonomy_toplevel.html). The approximately 180 current classes, which we call qtargets, were developed after an analysis of over 17,000 questions (downloaded in 1999 from answers.com) and through later enhancements to Webclopedia. They are of several types: * common semantic classes such as PROPER-PERSON, EMAIL-ADDRESS, LOCATION, PROPER-ORGANIZATION; * classes particular to QA such as YES:NO, ABBREVIATION-EXPANSION, and WHY-FAMOUS; * syntactic classes such as NP and NOUN, when no semantic type can be determined (e.g., &quot;What does Peugeot manufacture?&quot;); * roles and slots, such as REASON and TITLE-P respectively, to indicate a desired relation with an anchoring concept.</Paragraph> <Paragraph position="1"> Given a question, the CONTEX parser uses a set of 276 hand-built rules to identify its most likely qtarget(s), and records them in a backoff scheme (allowing more general qtarget nodes to apply when more specific ones fail to find a match). The generalizations are captured in a typical concept ontology, a 10,000-node extract of WordNet.</Paragraph> <Paragraph position="2"> The recursive part of pattern matching is driven mostly by interrogative phrases. For example, the rule that determines the applicability of the qtarget WHY-FAMOUS requires the question word &quot;who&quot;, followed by the copula, followed by a proper name. When there is no match at the current level, the system examines any interrogative constituent, or words in special relations to it. For example, the qtarget TEMPERATURE-QUANTITY (as in &quot;What is the melting point of X?&quot;) requires as its syntactic object something that in the ontology is subordinate to TEMP-QUANTIFIABLE-ABSTRACT, as well as either the word &quot;how&quot; paired with &quot;warm&quot;, &quot;cold&quot;, &quot;hot&quot;, etc., or the phrase &quot;how many degrees&quot; and a TEMPERATURE-UNIT (as defined in the ontology).</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.3 Type 3: Surface Pattern Matching </SectionTitle> <Paragraph position="0"> Often qtarget answers are expressed using rather stereotypical words or phrases. For example, the year of birth of a person is typically expressed using one of these phrases: <name> was born in <birthyear>, or <name> (<birthyear>-<deathyear>). We have developed a method to learn such patterns automatically from text on the web (Ravichandran and Hovy, 2002). We have added the patterns for appropriate qtargets into the QA Typology (qtargets with closed-list answers, such as PLANETS, require no patterns).</Paragraph>
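<Paragraph position="1"> As a concrete illustration of how learned surface patterns of this kind can be applied, the following is a minimal sketch, not Webclopedia's actual implementation; the pattern templates, the BIRTHYEAR-style qtarget framing, and all function names are invented for illustration.

import re

# Hypothetical learned surface patterns for a BIRTHYEAR-style qtarget.
# Each template is instantiated with the name from the question and
# captures the answer (the birth year).
BIRTHYEAR_PATTERNS = [
    r"{name} was born in (\d{{4}})",         # "<name> was born in <birthyear>"
    r"{name} \((\d{{4}})\s*-\s*\d{{4}}\)",   # "<name> (<birthyear>-<deathyear>)"
]

def match_surface_patterns(name, sentence):
    """Return birth-year candidates found by instantiating each pattern with the name."""
    candidates = []
    for template in BIRTHYEAR_PATTERNS:
        pattern = template.format(name=re.escape(name))
        for m in re.finditer(pattern, sentence):
            candidates.append(m.group(1))
    return candidates

# Both stereotypical phrasings yield the same answer candidate:
print(match_surface_patterns("Mozart", "Mozart was born in 1756 in Salzburg."))
print(match_surface_patterns("Mozart", "Wolfgang Amadeus Mozart (1756-1791) was a composer."))

In practice, each matched pattern would contribute evidence to a candidate's score rather than decide the answer outright, since the patterns serve only as an additional source of evidence. </Paragraph>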
<Paragraph position="2"> Where some QA systems use such patterns exclusively (Soubbotin and Soubbotin, 2001) or partially (Wang et al., 2001; Lee et al., 2001), we employ them as an additional source of evidence for the answer. Preliminary results have been obtained for a range of qtargets, using the TREC-10 questions and the TREC corpus.</Paragraph> <Paragraph position="3"> Some questions are underspecified and rely on culturally shared cooperativeness rules and/or world knowledge: Q: How many people live in Chile? S1: &quot;From our correspondent comes good news about the nine people living in Chile...&quot; A1: nine While nine people certainly do live in Chile, we know that this is not what the questioner intends. We have hand-implemented a rule that provides default range assumptions for POPULATION questions and biases quantity questions accordingly.</Paragraph> </Section> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 3.5 Type 5: Abbreviation Expansion </SectionTitle> <Paragraph position="0"> Abbreviations often follow a pattern: Q: What does NAFTA stand for? S1: &quot;This range of topics includes the North American Free Trade Agreement, NAFTA, and the world trade agreement GATT.&quot; S2: &quot;The interview now changed to the subject of trade and pending economic issues, such as the issue of opening the rice market, NAFTA, and the issue of Russia repaying economic cooperation funds.&quot; After Webclopedia identifies the qtarget as ABBREVIATION-EXPANSION, it extracts possible answer candidates, including &quot;North American Free Trade Agreement&quot; from S1 and &quot;the rice market&quot; from S2. Rules for acronym matching easily prefer the former.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.6 Type 6: Semantic Type Matching </SectionTitle> <Paragraph position="0"> Phone numbers, zip codes, email addresses, URLs, and different types of quantities obey lexicographic patterns that can be exploited for matching, as in Q: What is the zip code for Fremont, CA? S1: &quot;...from Everex Systems Inc., 48431 Milmont Drive, Fremont, CA 94538.&quot; and Q: How hot is the core of the earth? S1: &quot;The temperature of Earth's inner core may be as high as 9,000 degrees Fahrenheit (5,000 degrees Celsius).&quot; Webclopedia identifies the qtargets respectively as ZIP-CODE and TEMPERATURE-QUANTITY.</Paragraph> <Paragraph position="1"> Approx. 30 heuristics (cascaded) apply to the input before parsing to mark up numbers and other orthographically recognizable units of all kinds, including (likely) zip codes, quotations, year ranges, phone numbers, dates, times, scores, cardinal and ordinal numbers, etc.</Paragraph> <Paragraph position="2"> Similar work is reported in (Kwok et al., 2001).</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.7 Type 7: Definitions from WordNet </SectionTitle> <Paragraph position="0"> We have found a 10% increase in accuracy in answering definition questions by using external glosses obtained from WordNet. For Q: What is the Milky Way? Webclopedia identified two leading answer candidates: A1: outer regions A2: the galaxy that contains the Earth Comparing these with the WordNet gloss: WordNet: &quot;Milky Way--the galaxy containing the solar system&quot; allows Webclopedia to straightforwardly match the candidate with the greater word overlap. Curiously, the system also needs to use WordNet to answer questions involving common knowledge, as in: Q: What is the capital of the United States?
because authors of the TREC collection do not find it necessary to explain what Washington is: Ex: &quot;Later in the day, the president returned to Washington, the capital of the United States.&quot;</Paragraph> <Paragraph position="1"> While WordNet's definition, WordNet: &quot;Washington--the capital of the United States&quot;, directly provides the answer to the matcher, it also allows the IR module to focus its search on passages containing &quot;Washington&quot;, &quot;capital&quot;, and &quot;United States&quot;, and the matcher to pick a good motivating passage in the source corpus.</Paragraph> <Paragraph position="2"> Clearly, this capability can be extended to include (definitional and other) information provided by other sources, including encyclopedias and the web (Lin, 2002).</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.8 Type 8: Semantic Relation Matching </SectionTitle> <Paragraph position="0"> So far, we have considered individual words and groups of words. But often this is insufficient to accurately score an answer. As also noted in (Buchholz, 2001), pinpointing can be improved significantly by matching semantic relations among constituents: Q: Who killed Lee Harvey Oswald?</Paragraph> <Paragraph position="2"> S1: &quot;Belli's clients have included Jack Ruby, who killed John F. Kennedy assassin Lee Harvey Oswald, and Jim and Tammy Bakker.&quot; S2: &quot;On Nov. 22, 1963, the building gained national notoriety when Lee Harvey Oswald allegedly shot and killed President John F. Kennedy from a sixth floor window as the presidential motorcade passed.&quot; The CONTEX parser (Hermjakob, 1997; 2001) provides the semantic relations. The parser uses machine learning techniques to build a robust grammar that produces semantically annotated syntax parses of English (and Korean and Chinese) sentences at approx. 90% accuracy (Hermjakob, 1999).</Paragraph> <Paragraph position="4"> The matcher compares the parse trees of S1 and S2 to that of the question. Both S1 and S2 receive credit for matching the question words &quot;Lee Harvey Oswald&quot; and &quot;kill&quot; (underlined), as well as for finding an answer (bold) of the proper qtarget type (PROPER-PERSON). However, is the answer &quot;Jack Ruby&quot; or &quot;President John F. Kennedy&quot;? The only way to determine this is to consider the semantic relationship between these candidates and the verb &quot;kill&quot; (parse trees simplified, and only portions shown here). Although the PREDs of both S1 and S2 match that of the question &quot;killed&quot;, only S1 matches &quot;Lee Harvey Oswald&quot; as the head of the logical OBJect. Thus for S1, the matcher awards additional credit to node [2] (Jack Ruby) for being the logical SUBJect of the killing (using anaphora resolution). In S2, the parse tree correctly records that node [13] (&quot;John F. Kennedy&quot;) is not the object of the killing. Thus, despite its being closer to &quot;killed&quot;, the candidate in S2 receives no extra credit from semantic relation matching.</Paragraph> <Paragraph position="6"> It is important to note that the matcher awards extra credit for each matching semantic relationship between two constituents, not only when everything matches. This granularity improves robustness in the case of partial matches.</Paragraph>
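<Paragraph> To make the idea concrete, here is a simplified sketch of semantic relation matching, not CONTEX's actual representation or the matcher's real scoring; reducing each parse to role-word pairs, the credit value, and all names below are assumptions made for illustration.

# Simplified sketch: each parse is reduced to (role, word) pairs such as
# ('PRED', 'kill') or ('SUBJ', 'jack ruby'); a candidate earns extra credit
# for every relation it shares with the question parse, so partial matches
# still receive partial credit. The bonus value is invented.
RELATION_BONUS = 0.5

def relation_pairs(parse):
    """Turn a flat parse dict, e.g. {'PRED': 'kill', 'OBJ': 'lee harvey oswald'}, into a set of pairs."""
    return {(role, word.lower()) for role, word in parse.items()}

def relation_match_credit(question_parse, candidate_parse):
    """Award credit per matching semantic relation between question and candidate."""
    shared = relation_pairs(question_parse).intersection(relation_pairs(candidate_parse))
    return RELATION_BONUS * len(shared)

# Question: "Who killed Lee Harvey Oswald?" (the logical SUBJect is what is asked for)
question = {"PRED": "kill", "OBJ": "lee harvey oswald"}
s1 = {"PRED": "kill", "SUBJ": "jack ruby", "OBJ": "lee harvey oswald"}        # simplified S1 parse
s2 = {"PRED": "kill", "SUBJ": "lee harvey oswald", "OBJ": "john f. kennedy"}  # simplified S2 parse

print(relation_match_credit(question, s1))  # 1.0: PRED and logical OBJect both match
print(relation_match_credit(question, s2))  # 0.5: only the PRED matches

Under this sketch, the candidate in S1 (Jack Ruby, the logical SUBJect of the matched killing) is the one that receives the extra relation credit, mirroring the behaviour described above. </Paragraph>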
<Paragraph position="7"> Semantic relation matching applies not only to logical subjects and objects, but also to all other roles such as location, time, reason, etc. (for additional examples see http://www.isi.edu/natural-language/projects/webclopedia/sem-relexamples.html). It also applies not only at the sentential level, but at all levels, such as post-modifying prepositional phrases and pre-modifying determiner phrases. Additionally, Webclopedia uses 10 lists of word variations with a total of 4029 entries for semantically related concepts such as &quot;to invent&quot;, &quot;invention&quot;, and &quot;inventor&quot;, and rules for handling them. For example, by coercing &quot;invention&quot; to &quot;invent&quot;, the system can give &quot;Johan Vaaler&quot; extra credit for being a likely logical subject of &quot;invention&quot;: Q: Who invented the paper clip? while &quot;David&quot; actually loses points for being outside of the clausal scope of the inventing: S2: &quot;'Like the guy who invented the safety pin, or the guy who invented the paper clip,' David added.&quot;</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.9 Type 9: Word Window Scoring </SectionTitle> <Paragraph position="0"> Webclopedia also includes a typical window-based scoring module that moves a window over the text and assigns a score to each window position depending on a variety of criteria (Hovy et al., 1999). Unlike (Clarke et al., 2001; Lee et al., 2001; Chen et al., 2001), we have not developed a very sophisticated scoring function, preferring to focus on the modules that employ information deeper than the word level.</Paragraph> <Paragraph position="1"> This method is applied only when no other method provides a sufficiently high-scoring answer. The window scoring function combines the following factors:
w: window width (modulated by gaps of various lengths: &quot;white house&quot; vs. &quot;white car and house&quot;),
r: rank of the qtarget in the list returned by CONTEX,
I: window word information content (the inverse log frequency score of each word, summed),
q: number of different question words matched, plus specific rewards (bonus q=3.0),
e: penalty if a word matches one of a question word's WordNet synset items (e=0.8),
b: bonus for matching the main verb, proper names, and certain target words (b=2.0),
u: (value 0 or 1) indicates whether a word has been qsubsumed (&quot;subsumed&quot; by the qtarget) and should not contribute (again) to the score.
For example, in &quot;In what year did Columbus discover America?&quot;, the qsubsumed words are &quot;what&quot; and &quot;year&quot;.</Paragraph>
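<Paragraph position="2"> The following is only a hedged sketch of a window scoring function that combines the factors listed above; the particular way they are combined here, the default values, and all names are assumptions made for illustration and are not the actual Webclopedia formula.

import math

def window_score(window_words, question_words, qsubsumed, word_freq,
                 qtarget_rank, gap_length=0):
    """Hedged sketch combining several of the factors above: window width w,
    qtarget rank r, information content I, question-word matches q, a b-style
    bonus, and the qsubsumption flag u (the synonym penalty e is omitted for
    brevity). The multiplicative combination below is an assumption."""
    w = len(window_words) + gap_length                       # w: window width, widened by gaps
    r = qtarget_rank                                         # r: rank of the qtarget from CONTEX
    score = 0.0
    for word in window_words:
        if word in qsubsumed:                                # u = 0: qsubsumed words add nothing
            continue
        info = 1.0 / math.log(word_freq.get(word, 2) + 1)    # I: inverse log frequency
        bonus = 2.0 if word in question_words else 1.0       # b-style bonus for direct matches
        score += info * bonus
    q = len(set(window_words).intersection(question_words))  # q: different question words matched
    return score * (1 + q) / (w * r)                         # penalize wide windows, low-ranked qtargets

# Toy usage with invented word frequencies:
freqs = {"columbus": 120, "discover": 300, "america": 500, "1492": 40, "in": 100000}
print(window_score(["in", "1492", "columbus", "discover", "america"],
                   question_words={"columbus", "discover", "america"},
                   qsubsumed={"what", "year"},
                   word_freq=freqs,
                   qtarget_rank=1))
</Paragraph> </Section> </Section> </Paper>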