File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/04/p04-1053_intro.xml
Size: 3,259 bytes
Last Modified: 2025-10-06 14:02:23
<?xml version="1.0" standalone="yes"?> <Paper uid="P04-1053"> <Title>Discovering Relations among Named Entities from Large Corpora</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 2 Prior Work </SectionTitle> <Paragraph position="0"> The concept of relation extraction was introduced as part of the Template Element Task, one of the information extraction tasks in the Sixth Message Understanding Conference (MUC-6) (Defense Advanced Research Projects Agency, 1995). MUC-7 added a Template Relation Task, with three relations. Following MUC, the Automatic Content Extraction (ACE) meetings (National Institute of Standards and Technology, 2000) are pursuing information extraction. In the ACE Program, Relation Detection and Characterization (RDC) was introduced as a task in 2002. Most approaches to the ACE RDC task have involved supervised learning, such as kernel methods (Zelenko et al., 2002), and need richly annotated corpora tagged with relation instances. The biggest problem with this approach is that it takes a great deal of time and effort to prepare annotated corpora large enough to apply supervised learning. In addition, the varieties of relations were limited to those defined by the ACE RDC task. In order to discover knowledge from diverse corpora, a broader range of relations would be necessary.</Paragraph> <Paragraph position="1"> Some previous work adopted a weakly supervised learning approach. This approach has the advantage of not needing large tagged corpora. Brin proposed a bootstrapping method for relation discovery (Brin, 1998). Brin's method acquired patterns and examples by bootstrapping from a small initial set of seeds for a particular relation. 
Brin used a few samples of book titles and authors, collected common patterns from the contexts containing those samples, and finally found new examples of book titles and authors whose contexts matched the common patterns. Agichtein improved Brin's method by adopting the constraint of using a named entity tagger (Agichtein and Gravano, 2000). Ravichandran also explored a similar method for question answering (Ravichandran and Hovy, 2002). These approaches, however, need a small set of initial seeds.</Paragraph> <Paragraph position="2"> It is also unclear how initial seeds should be selected and how many seeds are required. In addition, their methods were tried only on functional relations, and this was an important constraint on their bootstrapping.</Paragraph> <Paragraph position="3"> The variety of expressions conveying the same relation can be considered an example of paraphrases, and so some of the prior work on paraphrase acquisition is pertinent to relation discovery. Lin proposed another weakly supervised approach for discovering paraphrases (Lin and Pantel, 2001). Lin first focused on verb phrases and their fillers as subject or object. Lin's idea was that two verb phrases which have similar fillers might be regarded as paraphrases. This approach, however, also needs a sample verb phrase as an initial seed in order to find similar verb phrases.</Paragraph> </Section> </Paper>