<?xml version="1.0" standalone="yes"?>
<Paper uid="I05-3016">
  <Title>Resolving Pronominal References in Chinese with the Hobbs Algorithm</Title>
  <Section position="3" start_page="116" end_page="117" type="metho">
    <SectionTitle>
2 The Corpus and Annotations
</SectionTitle>
    <Paragraph position="0"> The source texts for this study are taken from the first 100K of the CTB 5.0 release of the Penn Chinese Treebank (CTB). The CTB consists of Xinhua news articles that have been segmented, part-of-speech tagged, and bracketed with syntactic labels and functional tags (Xue et al., 2004). In the corpus, zero pronouns are denoted by the string &amp;quot;*pro*&amp;quot;; an example is given in Figure 1. In order to provide an answer key or &amp;quot;gold standard&amp;quot; against which to test automatic anaphora resolution methods, we are annotating the CTB to indicate pronominal coreference relations. All third-person pronouns (both the possessive and the nominative forms), reflexives, demonstratives, and *pro* are being annotated.</Paragraph>
    <Paragraph position="1"> Only those coreference relations that are between these anaphors and nominal entities are being co-indexed, however. That is, only NPs that denote the same entity as the entity referred to by the anaphor are being co-indexed. Since not every instance of one of these anaphors necessarily refers to a nominal entity, non-coreferring anaphors are being tagged with labels that categorize them roughly by type.</Paragraph>
    <Paragraph position="2"> The categories are: DD (discourse deictic) for anaphors that refer to propositions or events in the text; EXT (existential) for *pro* in existential contexts analogous to the English &amp;quot;there is/are&amp;quot;; INFR (inferrable) for an anaphor that refers to a specific nominal entity when that entity has no overt NP denoting it in the text; AMB (ambiguous) when the interpretation of an anaphor is ambiguous between two (or more) referents; and ARB (arbitrary) for anaphors that do not fall into the other categories.</Paragraph>
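The five category labels lend themselves to a small lookup table. The sketch below is purely illustrative (the paper does not describe its annotation tooling); the glosses are taken from the text above.

```python
# Illustrative lookup table for the five non-coreferring anaphor
# categories described above; a hypothetical helper, not the paper's
# annotation tooling.
CATEGORY_LABELS = {
    "DD":   "discourse deictic: refers to a proposition or event in the text",
    "EXT":  "existential: *pro* in contexts analogous to English 'there is/are'",
    "INFR": "inferrable: refers to an entity with no overt NP in the text",
    "AMB":  "ambiguous between two or more referents",
    "ARB":  "arbitrary: falls into none of the other categories",
}

def is_valid_category(tag):
    """True if tag is one of the five non-coreference categories."""
    return tag in CATEGORY_LABELS
```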
    <Paragraph position="3"> Complex NPs abound in the CTB and present a choice for the placement of the indices and category labels. The decision was made to put the index for a complex NP referent on the entire complex NP rather than on just the head of the phrase (that is, to annotate &amp;quot;high&amp;quot; in the NP tree). [Figure 1 gloss: At the same time, there has been a comparatively large increase in the entire country's monthly rent for public housing in cities and townships, with that in a portion of the regions increasing to account for about 10% of the income of dual income families.]</Paragraph>
    <Paragraph position="4"> Figure 1 has such a case. The annotation #2 is placed on the parent NP-SBJ level, rather than at the level of the head (NP (NN )(NN )) (monthly rent).</Paragraph>
    <Paragraph position="5"> The reasoning for this choice was that the full NP unambiguously distinguishes between different nominal entities whose NPs have identical head nouns. Head nouns of complex NPs can always be algorithmically obtained.</Paragraph>
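The claim that head nouns of complex NPs can be obtained algorithmically can be illustrated with a minimal head-finding sketch. Trees are nested lists `[label, child, ...]` with `(tag, word)` leaves; Chinese NPs are head-final, so the sketch scans daughters right to left. The rule set and the romanized example words are illustrative assumptions, not the routine or data used in the paper.

```python
# Minimal head finder for CTB-style NPs (illustrative simplification,
# not the paper's head-finding routine). Chinese NPs are head-final,
# so we take the rightmost nominal daughter, recursing into NPs.
NOMINAL_TAGS = {"NN", "NR", "NT"}

def find_head(np):
    """Return the (tag, word) head leaf of an NP subtree, or None."""
    children = np[1:]
    for child in reversed(children):          # head-final: right to left
        if isinstance(child, tuple):          # a leaf: (tag, word)
            if child[0] in NOMINAL_TAGS:
                return child
        elif child[0] == "NP":                # recurse into an NP daughter
            return find_head(child)
    return None

# A complex NP with a modifier NP and a head NP (romanized, invented):
complex_np = ["NP",
              ["NP", ("NR", "quanguo")],      # modifier: "entire country"
              ["NP", ("NN", "fangzu")]]       # head: "rent"
```

Under this convention, `find_head(complex_np)` returns the head leaf of the rightmost NP daughter, so two complex NPs with identical heads but different modifiers still collapse to the same head noun, which is exactly why the annotation indexes the full NP instead.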
  </Section>
  <Section position="4" start_page="117" end_page="118" type="metho">
    <SectionTitle>
3 The Hobbs Algorithm
</SectionTitle>
    <Paragraph position="0"> The &amp;quot;Hobbs Algorithm&amp;quot; was outlined in a paper by Jerry Hobbs in 1978 (Hobbs, 1978). The algorithm is shown in the Appendix. While the algorithm is naive in that the steps proceed merely according to the structure of the parse tree, there are two meta-level points to consider in the execution of the steps. First, the algorithm counts on number and gender agreement. Second, in his paper, Hobbs proposes applying &amp;quot;simple selectional constraints&amp;quot; to the antecedents that the algorithm proposes, and illustrates their use in the sentence he uses to explain the operation of the algorithm: &amp;quot;The castle in Camelot remained the residence of the king until 536 when he moved it to London.&amp;quot; When trying to resolve the pronoun &amp;quot;it&amp;quot; in this sentence, the algorithm would first propose &amp;quot;536&amp;quot; as the antecedent. But dates cannot move, so on selectional grounds it is ruled out. The algorithm continues and next proposes &amp;quot;the castle&amp;quot; as the antecedent. But castles cannot move any more than dates can, so selectional restrictions rule that choice out as well. Finally, &amp;quot;the residence&amp;quot; is proposed, and does not fail the selectional constraints (although one might find that these &amp;quot;simple&amp;quot; constraints require a fair amount of encoded world knowledge).</Paragraph>
    <Paragraph position="1"> In the paper, Hobbs reported the results of testing the algorithm on the pronouns &amp;quot;he&amp;quot;, &amp;quot;she&amp;quot;, &amp;quot;it&amp;quot; (excluding &amp;quot;it&amp;quot; in time or weather constructions, as well as pleonastic and discourse deictic &amp;quot;it&amp;quot;), and &amp;quot;they&amp;quot;, 300 instances in total (100 consecutive pronouns each from three different genres). He found that the algorithm alone worked in 88.3% of the cases, and that the algorithm plus selectional restrictions correctly resolved 91.7% of the cases. But of the 300 examples, only 132 actually had more than one &amp;quot;plausible&amp;quot; antecedent nearby. When he tested the algorithm on just those 132 cases, 96 were resolved by the &amp;quot;naive&amp;quot; algorithm alone, a success rate of 72.7%. When selectional restrictions were added, the algorithm correctly resolved 12 more, raising the rate to 81.8%.</Paragraph>
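The hard-case percentages follow directly from the counts given (96 and 12 of 132):

```python
# Sanity-check Hobbs's hard-case accuracies from the counts stated in
# the text: 96 of 132 resolved by the naive algorithm alone, 12 more
# once selectional restrictions are added.
naive_correct, selectional_extra, hard_cases = 96, 12, 132
assert round(naive_correct / hard_cases * 100, 1) == 72.7
assert round((naive_correct + selectional_extra) / hard_cases * 100, 1) == 81.8
```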
    <Paragraph position="2"> The Hobbs algorithm was implemented to execute on the CTB. The S label in the CTB is IP, so the two &amp;quot;markable&amp;quot; node types from the point of view of the algorithm are IP and NP. Two types of NPs were excluded, however: NP-TMP and NP-ADV.</Paragraph>
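The markable-node test just described can be sketched as a small predicate over CTB labels. The label parsing is an illustrative guess, not the paper's code; multi-part functional tags (e.g. NP-PN-SBJ) are handled naively here.

```python
# Sketch of the "markable node" test: the CTB port of the algorithm
# climbs and proposes IP (the CTB's S) and NP nodes, skipping NPs that
# carry the -TMP or -ADV functional tags. Illustrative only.
EXCLUDED_NP_TAGS = {"TMP", "ADV"}

def is_markable(label):
    """True if a node with this CTB label takes part in the Hobbs search."""
    base, _, func = label.partition("-")
    if base == "IP":
        return True
    return base == "NP" and func not in EXCLUDED_NP_TAGS
```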
    <Paragraph position="3"> No selectional constraints were applied in this experiment. In addition, no gender or number agreement features were used.</Paragraph>
    <Paragraph position="4"> While the written versions of Chinese third-person pronouns do have number and gender, and demonstratives have number, there is no morphology on verbs to match. Nor, without extrasyntactic lexical features, are there gender markings on nouns or proper names (the titles in this corpus as a rule do not include gender-specific honorifics).</Paragraph>
    <Paragraph position="5"> There is a plural suffix on some nouns denoting human groups, and one can sometimes glean number information from determiner phrases modifying head nouns, but no extra coding was done here to exploit either cue.</Paragraph>
    <Paragraph position="6"> Zero pronouns, of course, provide no clues about gender or number, nor do the third-person possessive and nominative pronoun forms themselves.</Paragraph>
    <Paragraph position="7"> Structurally, there are many sentences in the CTB that consist of just a sequence of parallel independent clauses, separated by commas or semicolons. These multi-clause sentences were treated as single sentences from the point of view of the algorithm.</Paragraph>
    <Paragraph position="8"> The implementation of the algorithm has a core of code that can run on either the Penn Treebank (Marcus et al., 1993) or the Chinese Treebank. The only differences between the two executables are in the tables for the part-of-speech tags and the syntactic phrase labels (e.g., PN vs. PRP for pronouns and IP vs. S for clauses),</Paragraph>
    <Paragraph position="9"> and in separate NP head-finding routines (not used in the current study).</Paragraph>
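The cross-treebank design described above amounts to one core routine parameterized by per-treebank label tables. The sketch below shows only the labels named in the text; the real tables are larger, and the dictionary layout is an assumption for exposition.

```python
# Illustrative per-treebank label tables consumed by one shared core
# routine (only the labels mentioned in the text; the implementation's
# full tag maps are not given in the paper).
PTB_LABELS = {"clause": "S",  "pronoun": "PRP"}
CTB_LABELS = {"clause": "IP", "pronoun": "PN"}

def is_clause(label, tables):
    """True if label is this treebank's clause (S or IP) label."""
    return label.split("-")[0] == tables["clause"]
```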
    <Paragraph position="10"> Despite the SVO similarity between Chinese and English, we were interested to see whether there might be differences in the success of the algorithm due to structural differences between the languages that would require adapting its steps to Chinese. The most obvious place to look was the placement of modifiers relative to the head noun in an NP. Although unplanned, the policy of annotating complex NPs at the parent level rather than at the head-noun level turned out to make this a moot point, because of the top-down nature of the tree traversal: since the algorithm proposes an NP that contains both the modifier and the head, differences between English and Chinese in head-modifier word order do not matter.</Paragraph>
    <Paragraph position="11"> Another place in which the head-modifier ordering might come into play is in Step 6 of the algorithm. This is still under investigation, since the step did not &amp;quot;fire&amp;quot; in the set of files used here, and only proposed an antecedent once when the algorithm was run on the whole CTB.</Paragraph>
  </Section>
  <Section position="5" start_page="118" end_page="118" type="metho">
    <SectionTitle>
4 The Data
</SectionTitle>
    <Paragraph position="0"> As mentioned, in addition to the third person pronouns that Hobbs tested, the algorithm here was run on reflexives, possessives, demonstrative pronouns, and the zero pronoun.</Paragraph>
    <Paragraph position="1"> A sample of 95 files, containing a total of 850 sentences (including headlines, but excluding bylines, and excluding the (End) &amp;quot;sentence&amp;quot; at the end of most articles) was used for this experiment.</Paragraph>
    <Paragraph position="2"> In all there were 479 anaphors in the 95 files.</Paragraph>
    <Paragraph position="3"> The distribution of the anaphors for these files is shown in Table 1.</Paragraph>
    <Paragraph position="4"> Of the anaphors in the gold standard, 331 (69.1%) were co-indexed with antecedents, while 117 (24.4%) did not corefer with entities denoted by NPs and were categorized. The remaining 6.5%, 31 anaphors (two overt and 29 *pro*), were marked ambiguous.</Paragraph>
    <Paragraph position="5"> Of the anaphors that were co-indexed, just over half (53.2%, 176 pronouns) were overt. In contrast, only 24.8% of the categorized pronouns were overt, and these were usually demonstratives labeled #DD.</Paragraph>
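The distribution reported in this section is internally consistent; all counts below are taken from the text.

```python
# Check the reported anaphor distribution: 479 total, of which 331
# were co-indexed, 117 categorized, and 31 marked ambiguous; 176 of
# the co-indexed anaphors were overt.
total, coindexed, categorized, ambiguous = 479, 331, 117, 31
assert coindexed + categorized + ambiguous == total
assert round(coindexed / total * 100, 1) == 69.1
assert round(categorized / total * 100, 1) == 24.4
assert round(ambiguous / total * 100, 1) == 6.5
overt_coindexed = 176
assert round(overt_coindexed / coindexed * 100, 1) == 53.2
```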
  </Section>
  <Section position="6" start_page="118" end_page="118" type="metho">
    <SectionTitle>
5 Results
</SectionTitle>
    <Paragraph position="0"> The performance of the Hobbs algorithm on these data varied considerably depending on the syntactic position of the anaphor, and less so on whether the anaphor was overt or not.</Paragraph>
    <Paragraph position="1"> Performance was analyzed separately for pronouns that appeared as matrix subjects (M), pronouns that appeared as subjects of parallel, independent clauses in multi-clause sentences (M2), and pronouns that were found in any kind of subordinate construction (S).</Paragraph>
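One way to reconstruct the three-way M / M2 / S split is from the clause labels dominating the anaphor, outermost first. The paper does not give its exact criteria, so this is a guess for exposition: a lone matrix IP gives M, a stack of coordinate IPs gives M2, and anything involving a CP (a subordinating clause label in the CTB) gives S.

```python
# Illustrative reconstruction of the M / M2 / S classification (the
# criteria here are assumed, not taken from the paper's code). Input:
# labels of the clause nodes dominating the anaphor, outermost first.
def position_class(clause_path):
    """Classify an anaphor's syntactic position as 'M', 'M2', or 'S'."""
    if clause_path == ["IP"]:
        return "M"                        # subject of the matrix clause
    if all(label == "IP" for label in clause_path):
        return "M2"                       # parallel independent clause
    return "S"                            # some subordinate construction
```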
    <Paragraph position="2"> The counts for all anaphors are listed in Table 2 and the counts for third-person pronouns only in Table 3. Accuracy results for third-person pronouns only are given in Table 4 and for all coindexed anaphors in Table 5.</Paragraph>
    <Paragraph position="3"> As shown in Table 4, the accuracy for matrix-level, third-person pronouns was 77.6%, while for all pronouns at the matrix level (Table 5) the algorithm achieved a respectable 76.3% accuracy, considering that not only zero pronouns but also reflexives, possessives, and demonstratives are included.</Paragraph>
    <Paragraph position="4"> This contrasts with 43.2% correct for third-person pronouns in subordinate constructions and 43.3% correct for all subordinate-level pronouns. The accuracy for the matrix level (M) and independent clause level (M2) combined was 75.7% for third-person pronouns and 71.6% for all pronouns. When results are not broken down by the syntactic position of the anaphor, the performance is less impressive: 63.2% accuracy for third-person pronouns alone and only 53.2% correct for all anaphors at all syntactic levels.</Paragraph>
    <Paragraph position="5"> The zero anaphors alone showed the same pattern, with 73.3% correct at the matrix level and 66.7% correct for the matrix (M) and independent clause (M2) levels combined (Table 6), but just 42.5% accuracy at the subordinate level.</Paragraph>
  </Section>
  <Section position="7" start_page="118" end_page="119" type="metho">
    <SectionTitle>
6 Discussion
</SectionTitle>
    <Paragraph position="0"> The difference in performance of the algorithm by syntactic level clearly suggests that a one-method-fits-all approach to anaphora resolution (at least in the case of a rule-based method) will not succeed, and that further analysis of anaphors at the non-matrix level is in order.</Paragraph>
    <Paragraph position="1"> (Of the 31 anaphors marked AMB, in only eight cases (25.8%) did the algorithm pick an antecedent that was one of the choices given by the annotators; all eight were *pro*.) These data are consistent with the observations made by Miltsakaki (2002). Taking a main clause and all its dependent clauses as a unit, she found that different mechanisms were needed to account for (1) topic continuity from unit to unit (inter-sentential) and (2) focusing preferences within a unit (intra-sentential). Topic continuity was best modeled structurally, but the semantics and pragmatics of verbs and connectives were prominent within a unit.</Paragraph>
    <Paragraph position="2"> Since inter-sentential anaphoric links relate to topic continuity, structural rules work best for resolution at the matrix level, while anaphors in subordinate clauses are subject to the semantic/pragmatic constraints of the predicates and connectives.</Paragraph>
    <Paragraph position="3"> In our results the anaphors that are subjects of matrix clauses tend to resolve inter-sententially (that is, Step 4 of the algorithm is the resolving condition), while the anaphors in subordinate constructions are more likely to have intra-sentential antecedents. That the strictly structural version of the Hobbs algorithm used here performed better for matrix-level anaphors and did not do well at all on anaphors in subordinate constructions agrees with Miltsakaki's findings.</Paragraph>
    <Paragraph position="4"> In our data the &amp;quot;unit&amp;quot; is not always a single main clause with its dependent clauses, however.</Paragraph>
    <Paragraph position="5"> In the M2 case, the unit is a sentence containing parallel main clauses, each of which may have its own dependent clauses. An examination of the errors made in these M2 cases might show that performance for these anaphors could be improved by treating the independent clauses as separate sentences.</Paragraph>
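The suggested improvement, treating the parallel clauses of an M2 sentence as separate sentences before running the search, can be sketched as a simple pre-splitting pass. This is a hypothetical illustration of the idea, not something tested in the paper.

```python
# Hypothetical pre-splitting pass for M2 sentences: if a root IP
# consists only of coordinate IP daughters (plus punctuation leaves),
# return those daughters as separate sentences for the Hobbs search.
def split_parallel(sentence):
    """Split a root IP of parallel IPs into separate sentence trees."""
    children = [c for c in sentence[1:] if isinstance(c, list)]
    if sentence[0] == "IP" and children and all(c[0] == "IP" for c in children):
        return children
    return [sentence]
```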
  </Section>
</Paper>