File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/04/n04-1003_abstr.xml
Size: 1,425 bytes
Last Modified: 2025-10-06 13:43:24
<?xml version="1.0" standalone="yes"?> <Paper uid="N04-1003"> <Title>Robust Reading: Identification and Tracing of Ambiguous Names</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> A given entity, representing a person, a location or an organization, may be mentioned in text in multiple, ambiguous ways. Understanding natural language requires identifying whether different mentions of a name, within and across documents, represent the same entity.</Paragraph> <Paragraph position="1"> We develop an unsupervised learning approach that is shown to accurately resolve the name identification and tracing problem. At the heart of our approach is a generative model of how documents are generated and how names are &quot;sprinkled&quot; into them. In its most general form, our model assumes: (1) a joint distribution over entities, (2) an &quot;author&quot; model that assumes that at least one mention of an entity in a document is easily identifiable, and then generates other mentions via (3) an appearance model, governing how mentions are transformed from the &quot;representative&quot; mention. We show how to estimate the model and do inference with it, and how this resolves several aspects of the problem from the perspective of applications such as question answering.</Paragraph> </Section> </Paper>