<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-1306">
  <Title>CogNIAC: High Precision Coreference with Limited Knowledge and Linguistic Resources</Title>
  <Section position="3" start_page="0" end_page="38" type="intro">
    <SectionTitle>
they VB NN
Mariana VBD PP Sarah TO VB herself PP DT AJD
NN
</SectionTitle>
    <Paragraph position="0"> Without lexical knowledge, a human attempting to resolve the pronouns is in much the same knowledge-impoverished position as the typical coreference algorithm. It is no surprise that texts with so little information provided in them tend to be more ambiguous than the same texts in fleshed-out form. The conclusion to draw from this example is that the limiting factor in CogNIAC is knowledge sources, not an artificial restriction on domains or kinds of coreference. This point will be resumed in the discussion section, where the consequences of fuller knowledge sources for CogNIAC are considered.</Paragraph>
    <Section position="1" start_page="38" end_page="38" type="sub_section">
      <SectionTitle>
2.1 Using limited world knowledge to find possible antecedents
</SectionTitle>
      <Paragraph position="0"> For noun phrase anaphora, gathering semantically possible antecedents amounts to running all the noun phrases in a text through various databases for number and gender, and perhaps then through a classifier that determines whether a noun phrase is a company, person or place (footnote 1).</Paragraph>
      <Paragraph position="1"> This set of candidate antecedents rarely has more than 5 members when some reasonable locality constraints are adhered to, and this set almost always contains the actual antecedent. The remainder of the coreference resolution process amounts to picking the right entity from this set.</Paragraph>
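The candidate-gathering step described above can be sketched as a simple agreement filter. This is only an illustrative sketch, not the paper's implementation (CogNIAC itself is written in Perl): the feature lexicon, the NounPhrase record, the function name and the two-sentence window are all assumptions introduced here, and sentence distance is used as a crude stand-in for "reasonable locality constraints".

```python
from dataclasses import dataclass

# Hypothetical toy lexicon of pronoun features (not taken from the paper).
PRONOUN_FEATURES = {
    "he": ("sing", "masc"),
    "she": ("sing", "fem"),
    "it": ("sing", "neut"),
    "they": ("plur", None),  # None means gender is left unspecified
}


@dataclass
class NounPhrase:
    text: str
    sentence: int  # index of the sentence containing the noun phrase
    number: str    # "sing" or "plur"
    gender: str    # "masc", "fem" or "neut"


def candidate_antecedents(pronoun, pronoun_sentence, noun_phrases, window=2):
    """Return noun phrases that agree with the pronoun in number and gender
    and occur at most `window` sentences before it (assumed locality rule).

    Simplification: word order inside the pronoun's own sentence is ignored.
    """
    number, gender = PRONOUN_FEATURES[pronoun.lower()]
    candidates = []
    for np in noun_phrases:
        distance = pronoun_sentence - np.sentence
        in_window = distance in range(window + 1)
        agrees = np.number == number and gender in (None, np.gender)
        if in_window and agrees:
            candidates.append(np)
    return candidates
```

In a fuller system the number and gender values would come from the databases and the company/person/place classifier mentioned above; here they are supplied by hand on each NounPhrase.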
      <Paragraph position="2"> For the kinds of data considered here (narratives and newspaper articles) there is rarely a need for general world knowledge in assembling the initial set of possible antecedents for pronouns. This does not address the issue of inferred antecedents, event reference, discourse deixis and many other sorts of referring phenomena which clearly require the use of world knowledge but are beyond the scope of this work. As it happens, recognizing the possible antecedents of these pronouns is within the capabilities of current knowledge sources.</Paragraph>
      <Paragraph position="3"> Better knowledge sources could be used to reduce the space of possible antecedents. For example, consider the well-known [Winograd 1972] alternation: The city council refused to give the women a permit because they {feared/advocated} violence.</Paragraph>
      <Paragraph position="4"> There are two semantically possible antecedents to they: The city council, and the women. The problem is picking the correct one. Depending on verb choice, they strongly prefers one antecedent over the other. Capturing this generalization requires a sophisticated theory of verb meaning as it relates to pronoun resolution.</Paragraph>
      <Paragraph position="5"> Speaking anecdotally, these kinds of resolutions happen quite often in text. CogNIAC recognizes knowledge intensive coreference and does not attempt to resolve such instances.</Paragraph>
    </Section>
    <Section position="2" start_page="38" end_page="38" type="sub_section">
      <SectionTitle>
2.2 Using limited linguistic resources to find coreference
</SectionTitle>
      <Paragraph position="0"> 1 The named entity task at MUC-6 used a similar classification task and the best system performance was 96% precision/97% recall.</Paragraph>
      <Paragraph position="1"> Fortunately, not all instances of pronominal anaphora require world knowledge for successful resolution. In lieu of full world knowledge, CogNIAC uses regularities of English usage in an attempt to mimic strategies used by humans when resolving pronouns. For example, the syntax of a sentence highly constrains a reflexive pronoun's antecedent. Also, if there is just one possible antecedent in the entire prior discourse, then that entity is nearly always the correct antecedent. CogNIAC consists of a set of such observations implemented in Perl.</Paragraph>
      <Paragraph position="2"> CogNIAC has been used with a range of linguistic resources, from scenarios where almost no linguistic processing of the text is done to ones where partial parse trees are provided. At the very least, there must be sufficient linguistic resources to recognize pronouns in the text and to identify the space of candidate antecedents. For the first experiment the text has been part-of-speech tagged and basal noun phrases (i.e. noun phrases that have no nested noun phrases) have been marked with '[]', as shown below:</Paragraph>
      <Paragraph position="4"> In addition, finite clauses were identified (by hand for experiment 1) and various regular expressions are used to identify subjects, objects and what verbs take as arguments for the purposes of coreference restrictions.</Paragraph>
      <Paragraph position="5"> With this level of linguistic annotation, nearly all the parts of CogNIAC can be used to resolve pronouns.</Paragraph>
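To illustrate how bracketed basal noun phrases and part-of-speech tags might be consumed with regular expressions, here is a minimal sketch. The exact markup used in the experiment is not reproduced in this section, so the word/TAG token format, the tag names PRP and PRP$, the sample sentence in the usage note and both function names are assumptions made purely for illustration.

```python
import re

# Matches one bracketed span with no nested brackets (a basal noun phrase).
BASE_NP = re.compile(r"\[\s*([^\[\]]+?)\s*\]")

# Assumed Penn-Treebank-style pronoun tags (hypothetical for this sketch).
PRONOUN_TAGS = {"PRP", "PRP$"}


def base_noun_phrases(tagged_text):
    """Return each bracketed span as a list of (word, tag) pairs.

    Assumes every token inside the brackets is written as word/TAG.
    """
    phrases = []
    for match in BASE_NP.finditer(tagged_text):
        tokens = [tok.rsplit("/", 1) for tok in match.group(1).split()]
        phrases.append([(word, tag) for word, tag in tokens])
    return phrases


def is_pronoun(phrase):
    """True when the phrase is a single token carrying a pronoun tag."""
    return len(phrase) == 1 and phrase[0][1] in PRONOUN_TAGS
```

For a hypothetical input such as "[ The/DT council/NN ] refused/VBD [ them/PRP ] [ a/DT permit/NN ]", base_noun_phrases yields three phrases, and is_pronoun separates the anaphor from the candidate antecedents.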
      <Paragraph position="6"> The core rules of CogNIAC are given below, together with their performance on training data (200 pronouns of narrative text). In addition, examples where the rules successfully apply have been provided for most of the rules, with the relevant anaphors and antecedents in boldface. The term 'possible antecedents' refers to the set of entities from the discourse that are compatible with an anaphor's gender, number and coreference restrictions (e.g. a non-reflexive pronoun cannot corefer with the other arguments of its verb or preposition).</Paragraph>
      <Paragraph position="7"> 1) Unique in Discourse: If there is a single possible antecedent i in the read-in portion of the entire discourse, then pick i as the antecedent: 8 correct, and 0 incorrect.</Paragraph>
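The Unique in Discourse rule stated above can be sketched as follows. This is a hedged illustration rather than the paper's Perl code: the function name is hypothetical, and it assumes the caller has already reduced the read-in discourse to entity identifiers that satisfy the gender, number and coreference restrictions.

```python
def unique_in_discourse(possible_antecedents):
    """Apply the Unique in Discourse rule.

    `possible_antecedents` is assumed to hold one entity identifier per
    compatible mention in the portion of the discourse read so far.
    Returns the single distinct entity, or None when the rule does not
    apply, in which case the pronoun is left for other rules or unresolved.
    """
    # Collapse repeated mentions of the same entity, keeping first-seen order.
    distinct = list(dict.fromkeys(possible_antecedents))
    if len(distinct) == 1:
        return distinct[0]
    return None
```

Returning None instead of guessing reflects the high-precision stance described in this section: a rule fires only when its condition is met.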
    </Section>
  </Section>
</Paper>