<?xml version="1.0" standalone="yes"?> <Paper uid="W02-1111"> <Title>Fine-Grained Proper Noun Ontologies for Question Answering</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Ontologies for Question Answering </SectionTitle> <Paragraph position="0"> Modern question answering systems rely heavily on the fact that questions contain strong preferences for the types of answers they expect. Kupiec (1993) observes that the WH word itself provides preferences (e.g. &quot;Who&quot; questions prefer PERSON answers). [Table 1, sample questions: The 1974 film 'That's Entertainment!' was made from film clips from what Hollywood studio? What king of Babylonia reorganized the empire under the Code that bears his name? What rock 'n' roll musician was born Richard Penniman on Christmas Day? What is the oldest car company which still exists today? What was the name of the female Disco singer who scored with the tune 'Dim All the Lights' in 1979? What was the name of the first Russian astronaut to do a spacewalk? What was the name of the US helicopter pilot shot down over North Korea? Which astronaut did Tom Hanks play in 'Apollo 13'? Which former Klu Klux Klan member won an elected office in the U.S.? Who's the lead singer of the Led Zeppelin band? Who is the Greek goddess of retribution or vengeance? Who is the prophet of the religion of Islam? Who is the author of the book, &quot;The Iron Lady: A Biography of Margaret Thatcher&quot;? Who was the lead actress in the movie &quot;Sleepless in Seattle&quot;?]</Paragraph> <Paragraph position="1"> He further observes that questions also include type preferences in other parts of the question. Sometimes these preferences occur within the WH phrase (&quot;what color&quot;), and sometimes they are embedded elsewhere within the question (&quot;what is the color ...&quot;). 
In both cases, the question indicates a preference for colors as answers.</Paragraph> <Paragraph position="2"> Current question answering systems use ontologies when these type preferences are detected. One simple method is as follows: when a type preference is recognized, the preference is located within the WordNet ontology, and children of that synset are treated as potential answers. Given the question &quot;In pool, what color is the eight ball?&quot; and the ontology excerpt shown in Figure 1, the system can narrow down the range of choices. This approach has high precision: if the type preference can be located, and a candidate answer is found in a child node (in a suitable corpus context), then the candidate is likely to be the answer.</Paragraph> <Paragraph position="3"> Harabagiu et al. (2000) propose another method for using an ontology: WordNet subtrees are linked to types recognized by a named entity recognizer.</Paragraph> <Paragraph position="4"> Their system works as follows: given the question &quot;What is the wingspan of a condor?&quot;, it locates &quot;wingspan&quot; in the WordNet ontology. It then detects that &quot;wingspan&quot; falls into the MAGNITUDE subtree, which is linked to the QUANTITY type. This links words in the MAGNITUDE subtree to numbers.</Paragraph> <Paragraph position="5"> While the WordNet ontology is primarily composed of common nouns, it contains some proper nouns, typically those least likely to be ephemeral (e.g. countries, cities, and famous figures in history). These can be used in the same way as common nouns. Given the question &quot;Which composer wrote 'The Marriage of Figaro'?&quot;, the WordNet ontology will provide the fact that &quot;Wolfgang Amadeus Mozart&quot; is a composer.</Paragraph> <Paragraph position="6"> Table 1 lists sample questions where a proper noun ontology would be useful. Some of the proper noun types are relatively static (Greek gods, kings of Babylonia). 
Other categories are more ephemeral (lead singers, British actresses). WordNet enumerates 70 Greek gods and 80 kings, but no lead singers and no British actresses.</Paragraph> <Paragraph position="7"> Ravichandran and Hovy (2002) present an alternative ontology for type preference and describe a method for using this alternative ontology to extract particular answers using surface text patterns. Their proposed ontology is orders of magnitude smaller than WordNet and the ontologies considered here, having fewer than 200 nodes.</Paragraph> <Paragraph position="8"> 3 Building a Proper Noun Ontology In order to better answer the questions in Table 1, we built a proper noun ontology from approximately 1 gigabyte of AP news wire text. To do so, we tokenized and part-of-speech tagged the text, and then searched for instances of a common noun followed immediately by a proper noun. This pattern detects phrases of the form '[the] automaker Mercedes Benz', and is ideally suited for proper nouns. In AP news wire text this is a productive and high-precision pattern, generating nearly 200,000 unique descriptions, with 113,000 different proper nouns and 20,000 different descriptions. In comparison, the &quot;such as&quot; pattern (Section 5) occurs fewer than 50,000 times in a corpus of the same size. Table 2 shows the descriptions generated for a few proper nouns using this simple pattern.</Paragraph> <Paragraph position="9"> To assess the precision of the extractions, we took a sample of 100 patterns extracted from the AP news text. Of these 100, 79 of the items classified as named entities were in fact named entities, and out of those, 60 (75%) had legitimate descriptions.</Paragraph> <Paragraph position="10"> To build the complete ontology, each description and each proper noun first forms its own synset. Then, links are added from each description to each proper noun it appears with. 
Further links are put between descriptions &quot;X Y&quot; and &quot;Y&quot; (noun compounds and their heads). Clearly, this method is problematic in the cases of polysemous words or complex noun-noun constructions (&quot;slalom king&quot;), and integrating this ontology with the WordNet ontology requires further study.</Paragraph> <Paragraph position="11"> This proper noun ontology fills many of the holes in WordNet's world knowledge. While WordNet has no lead singer synset, the induced proper noun ontology detects 13 distinct lead singers (Figure 3).</Paragraph> <Paragraph position="12"> WordNet has 2 folk singers; the proper noun ontology has 20. In total, WordNet lists 53 proper nouns as singers, while the induced proper noun ontology has more than 900. While the induced ontology is not complete, it is more complete than what was previously available.</Paragraph> <Paragraph position="13"> As can be seen from the list of descriptions generated by this pattern, people are described in a variety of different ways, and this pattern detects many of them. Table 3 shows the descriptions generated for a common proper noun (&quot;Bill Gates&quot;). When the descriptions are grouped by WordNet synsets and senses manually resolved, the variety of descriptions decreases dramatically (Figure 4). &quot;Bill Gates&quot; can be described by a few distinct roles, and a distribution over these descriptions provides an informative understanding: leader (.48), businessperson (.27), worker (.05), originator (.05), expert (.05), and rich person (.02).</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> News Corpora </SectionTitle> <Paragraph position="0"> Steve Jobs, whose career path is similar to that of Bill Gates, has a similar but distinct signature: originator (.6), expert (.4).</Paragraph> <Paragraph position="1"> One immediate observation is that some of the descriptions may be more relevant than others. 
Is Gates' role as an 'office worker' as important as his role as a 'billionaire'? The current system makes no such decision: it treats all descriptions as equally relevant and stores all of them. There is no need to reject descriptions since there is no human cost in superfluous or distracting descriptions (unlike in summarization tasks). It is important, however, that no invalid descriptions are added.</Paragraph> <Paragraph position="2"> The previous examples have focused on proper nouns which are people's names. This method works for many organizations as well, as the data in Table 2 show. However, while description extraction for people is high quality (84% correct descriptions in a 100-example sample), for non-people proper names the quality of extraction is poorer (47% correct descriptions). This is a trend which requires further study.</Paragraph> <Paragraph position="3"> 4 Using a Proper Noun Ontology in a</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> Question Answering Task </SectionTitle> <Paragraph position="0"> We generated the above ontology and used it in a sentence comprehension task: given a question and a sentence which answers the question, extract the minimal short answer to the question from the sentence. The task is motivated by the observation that extracting short answers is more difficult than extracting full-sentence or passage-length ones. Furthermore, retrieving answers from smaller document spaces may be more difficult than retrieving answers from larger ones, if smaller spaces have less redundant coverage of potential answers. In this sentence comprehension task, there is virtually no redundancy. To generate data for this task, we used trivia games in which each question is accompanied by a full-sentence explanation (Mann, 2002). [Caption fragment: &quot;... Induced Proper Noun Ontology (IPNO) is combined with WordNet&quot;]</Paragraph> <Paragraph position="1"> Baseline experiments used the WordNet ontology alone. 
From a semantic type preference stated in the question, a word was selected from the sentence as an answer if it was a child of the type preference.</Paragraph> <Paragraph position="2"> 'Black' would be picked as an answer for a 'color' type preference (Figure 1).</Paragraph> <Paragraph position="3"> To utilize the induced proper noun ontology, we took the raw data and selected the trailing noun for each proper noun and for each description. Thus, for an extraction of the form &quot;computer mogul Bill Gates&quot;, we added a pattern of the form &quot;Gates mogul&quot;. We created an ontology from these instances completely separate from the WordNet ontology. We put this induced proper noun ontology into the pipeline as follows: if WordNet failed to find a match, we used the induced proper noun ontology. If that ontology failed to find a match, we ignored the question. In a full system, a named entity recognizer might be added to resolve the other questions.</Paragraph> <Paragraph position="4"> We selected 1000 trivia game questions at random to test the new two-ontology system. Table 4 shows the results of the experiments. The boost is clear: improved recall at slightly decreased precision. Gains made by inducing an ontology from an unrestricted text corpus (newstext) and applying it to an unmatched test set (trivia games) suggest that a broad-coverage general proper noun ontology may be useful.</Paragraph> <Paragraph position="5"> It is further surprising that this improvement comes at such a small cost. The proper noun ontology was not trimmed or filtered. The only disadvantage of this method is simply that its coverage is small. Coverage may be increased by using ever larger corpora. Alternatively, different patterns (for example, appositives) may increase the number of words which have descriptions. A rough error analysis suggests that most of the errors come from mistagging, while few come from correct relationships in the ontology. 
This suggests that attempts at noise reduction might lead to larger gains in performance. Another potential method for improving coverage is bootstrapping descriptions. Our test corpus contained a question whose answer was &quot;Mercedes-Benz&quot;, and whose type preference was &quot;car company&quot;. While our proper noun ontology contained a related link (Mercedes-Benz automaker), it did not contain the exact link (Mercedes-Benz car company). However, elsewhere there existed the links (Opel automaker) and (Opel car company). Potentially these descriptions could be combined to infer (Mercedes-Benz car company). Formally: (B Y) and (A Y) and (A Z) ⇒ (B Z). For example: (Mercedes-Benz automaker) and (Opel automaker) and (Opel car company) ⇒ (Mercedes-Benz car company). Expanding descriptions using a technique like this may improve coverage. Still, care must be taken to ensure that proper inferences are made, since this rule is not always appropriate. Bill Gates is a ten-billionaire; Steve Jobs is not.</Paragraph> </Section> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 5 Prior Work in Building Ontologies </SectionTitle> <Paragraph position="0"> There has been considerable work in the past decade on building ontologies from unrestricted text. Hearst (1992) used textual patterns (e.g. &quot;such as&quot;) to identify common class members. Caraballo and Charniak (1999) and Caraballo (1999) augmented these lexical patterns with more general lexical co-occurrence statistics (such as relative entropy). Berland and Charniak (1999) use Hearst-style techniques to learn meronym relationships (part-whole) from corpora. There has also been work in building ontologies from structured text, notably in the AQUILEX project (e.g. 
Copestake, 1990), which builds ontologies from machine-readable dictionaries.</Paragraph> <Paragraph position="1"> The most closely related work is Girju (2001), which describes a method for inducing a domain-specific ontology using some of the techniques described in the previous paragraph. This induced ontology is then potentially useful for a matched question domain. Our paper differs in that it targets proper nouns, in particular people, which are overlooked in prior work, have broad applicability, and can be used in a cross-domain fashion. Furthermore, we present initial results which attempt to gauge coverage improvement as a result of the induced ontology. Another related line of work is word clustering.</Paragraph> <Paragraph position="2"> In these experiments, an attempt is made to cluster similar nouns, without regard to forming a hierarchy. Pereira et al. (1993) presented initial work, clustering nouns using their noun-verb co-occurrence information. Riloff and Lehnert (1993) build semantic lexicons using extraction pattern co-occurrence. Lin and Pantel (2001) extend these methods by using many different types of relations and exploiting corpora of tremendous size.</Paragraph> <Paragraph position="3"> The important difference for this work between the hierarchical methods and the clustering methods is that clusters are unlabelled. The hierarchical methods can identify that a &quot;Jeep Cherokee&quot; is a type of car. In contrast, the clustering methods group together related nouns, but exactly what the connection is may be difficult to determine (e.g. the cluster &quot;Sierra Club&quot;, &quot;Environmental Defense Fund&quot;, &quot;Natural Resources Defense Council&quot;, &quot;Public Citizen&quot;, &quot;National Wildlife Federation&quot;). 
Generating labels for proper noun clusters may be another way to build a proper noun ontology.</Paragraph> <Paragraph position="4"> The method we use to build the fine-grained proper name ontology also resembles some of the work done in coarse-grained named entity recognition. In particular, Collins and Singer (1999) present a sophisticated method for using bootstrapping techniques to learn the coarse classification for a given proper noun. Riloff and Jones (1999) also present a method that uses bootstrapping to create semantic lexicons of proper nouns. These methods may be applicable to fine-grained proper noun ontology construction as well.</Paragraph> <Paragraph position="5"> Schiffman et al. (2001) describe work on producing biographical summaries. This work attempts to synthesize one description of a person from multiple mentions. The summary is an end in itself, rather than a collection of general knowledge. These descriptions also attempt to be parsimonious, in contrast to the rather free associations extracted by the method presented above.</Paragraph> </Section> </Paper>