<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-0405">
  <Title>Unsupervised Personal Name Disambiguation</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> One open problem in natural language ambiguity resolution is the task of proper noun disambiguation (see footnote 1). While a word sense or translation ambiguity typically involves 2-20 alternative meanings that must be resolved through context, a personal name such as &quot;Jim Clark&quot; may potentially refer to hundreds or thousands of distinct individuals. Each different referent typically has some distinct contextual characteristics. These characteristics can help distinguish, resolve and trace the referents when the surface names appear in online documents.</Paragraph>
    <Paragraph position="1"> Footnote 1: This has been recognized even by the popular press. Reuters (March 13, 2003) observed the problem of name ambiguity to be a major stumbling block in personal name web searches.</Paragraph>
    <Paragraph position="2"> A search of Google shows 76,000 web pages mentioning Jim Clark, of which the first 10 unique referents are:</Paragraph>
    <Paragraph position="3">
      1. Jim Clark - Race car driver from Scotland
      2. Jim Clark - Clockmaker from Colorado
      3. Jim Clark - Film Editor
      4. Jim Clark - Netscape Founder
      5. Jim Clark - Disaster Survivor
      6. Jim Clark - Car Salesman in Kansas
      7. Jim Clark - Fishing Instructor in Canada
      8. Jim Clark - Computer Science student in Hong Kong
      9. Jim Clark - Professor at McGill
      10. Jim Clark - Gun Dealer in Louisiana

      In this paper, we present a method for distinguishing the real-world referent of a given name in context. Related approaches include Wacholder et al. (1997), who focus on the variation of surface names for a given referent, and Smith and Crane (2002), who resolve geographic name ambiguity. We present a preliminary evaluation on pseudonames: conflations of multiple personal names, constructed in the same way pseudowords are used for word sense disambiguation (Gale et al., 1992). We then present corroborating evidence from real personal name polysemy to show that this technique works in practice.</Paragraph>
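    The pseudoname construction mentioned above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name make_pseudoname_corpus, the placeholder string PSEUDONAME, and the two sample names and sentences are all hypothetical. Every mention of each real name is replaced by one shared placeholder, and the original name is kept as the gold label against which a disambiguation method could later be scored.

```python
import re


def make_pseudoname_corpus(docs_by_name, pseudoname="PSEUDONAME"):
    """Conflate several names into one artificial ambiguous name.

    docs_by_name maps a real name to a list of texts mentioning it.
    Returns (masked_text, gold_name) pairs; the gold name is hidden
    from the text but retained for evaluation.
    """
    corpus = []
    for name, docs in docs_by_name.items():
        # Match the literal name, ignoring case (an assumption of this sketch).
        pattern = re.compile(re.escape(name), flags=re.IGNORECASE)
        for text in docs:
            corpus.append((pattern.sub(pseudoname, text), name))
    return corpus


# Hypothetical sample data, invented purely for illustration.
docs = {
    "Alice Example": ["Alice Example plays the trumpet."],
    "Bob Sample": ["The senator Bob Sample spoke today."],
}
corpus = make_pseudoname_corpus(docs)
for masked_text, gold_name in corpus:
    print(gold_name, "|", masked_text)
```

    Because the true referent of each masked mention is known by construction, accuracy can be measured without hand labeling, which is the same appeal pseudowords have for word sense disambiguation.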
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Miles Davis
</SectionTitle>
      <Paragraph position="0">
        birth day: May 26 (5), May 25 (5)
        birth year: 1926 (82), 1967 (18), 1969 (9)...
        occupation: trumpeter (38), artist (10), player (5)...
      </Paragraph>
    </Section>
  </Section>
</Paper>