<?xml version="1.0" standalone="yes"?>
<Paper uid="P04-1076">
  <Title>Weakly Supervised Learning for Cross-document Person Name Disambiguation Supported by Information Extraction</Title>
  <Section position="3" start_page="1" end_page="2" type="intro">
    <SectionTitle>
2 Task Definition and Algorithm Design
</SectionTitle>
    <Paragraph position="0"> Given n name mentions, we first introduce the following symbols.</Paragraph>
    <Paragraph position="1"> i C refers to the context of the</Paragraph>
    <Paragraph position="3"> Name refers to the name string of the i -th mention.</Paragraph>
    <Paragraph position="5"> refers to the context similarity between the i -th mention and the j -th mention, which is a subset of the predefined context similarity features.</Paragraph>
    <Paragraph position="6"> a f refers to thea -th predefined context similarity feature. So  The name disambiguation task is defined as hard clustering of the multiple mentions of the same name. Its final solution is represented as {}MK, where K refers to the number of distinct entities, and M represents the many-to-one mapping (from mentions to a cluster) such that () K]. [1,j n],[1,i j,iM [?][?]= One way of combining natural language IE results with traditional co-occurring words is to design a new context representation scheme and then define the context similarity measure based on the new scheme. The challenge to this approach lies in the lack of a proper weighting scheme for these high-dimensional heterogeneous features. In our research, the algorithm directly models the pairwise context similarity.</Paragraph>
    <Paragraph position="7"> For any given context pair, a set of predefined context similarity features are defined. Then with n mentions of a same name,  computed. The name disambiguation task is formulated as searching for {}MK, which maximizes the following conditional probability:  Eq. (1) contains a prior probability distribution of name disambiguation {}()MK,Pr . Because there is no prior knowledge available about what solution is preferred, it is reasonable to take an equal distribution as the prior probability distribution. So the name disambiguation is equivalent to searching for {}MK, which maximizes Expression (2).</Paragraph>
    <Paragraph position="9"> in Eq.</Paragraph>
    <Paragraph position="10"> (3), we use a machine learning scheme which only requires minimal supervision. Within this scheme, maximum entropy modeling is used to combine heterogeneous context features. With the learned conditional probabilities in Eq. (3), for a given {}MK, candidate, we can compute the conditional probability of Expression (2). In the final step, optimization is performed to search for {}MK, that maximizes the value of Expression (2).</Paragraph>
    <Paragraph position="11"> To summarize, there are three key elements in this learning scheme: (i) the use of automatically constructed corpora to estimate conditional probabilities of Eq. (3); (ii) maximum entropy modeling for combining heterogeneous context similarity features; and (iii) statistical annealing for optimization.</Paragraph>
  </Section>
class="xml-element"></Paper>