<?xml version="1.0" standalone="yes"?>
<Paper uid="C96-1021">
  <Title>Anaphora for Everyone: Pronominal Anaphora Resolution without a Parser</Title>
  <Section position="5" start_page="116" end_page="117" type="metho">
    <SectionTitle>
4 Evaluation
</SectionTitle>
    <Paragraph position="0"> Quantitative evaluation shows the anaphora resolution algorithm described here to run at a rate of 75% accuracy. The data set on which the evaluation was based consisted of 27 texts, taken from a random selection of genres, including press releases, product announcements, news stories, magazine articles, and other documents existing as World Wide Web pages. Within these texts, we counted 306 third person anaphoric pronouns; of these, 231 were correctly resolved to the discourse referent identified as the antecedent by the first author (see footnote 3). This rate of accuracy is clearly comparable to that of the Lappin/Leass algorithm, which (Lappin and Leass, 1994) report as 85%.</Paragraph>
    <Paragraph position="1"> Several observations about the results and the comparison with (Lappin and Leass, 1994) are in order.</Paragraph>
    <Paragraph position="2"> First, and most obviously, some deterioration in quality is to be expected, given the relatively impoverished linguistic base we start with.</Paragraph>
    <Paragraph position="3"> Second, it is important to note that this is not just a matter of simple comparison. The results in (Lappin and Leass, 1994) describe the output of the procedure applied to a single text genre: computer manuals. Arguably, this is an example of a particularly well behaved text; in any case, it is not clear how the figure would be normalized over a wide range of text types, some of them not completely 'clean', as is the case with our data.</Paragraph>
    <Paragraph position="4"> Third, close analysis of the most common types of error our algorithm currently makes reveals two specific configurations in the input which confuse the procedure and contribute to the error rate: gender mismatch (35% of errors) and certain long range contextual (stylistic) phenomena, best exemplified by text containing quoted passages in-line (14% of errors).</Paragraph>
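    <Paragraph> The figures reported in this section can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original evaluation: the function and variable names are hypothetical, and because only percentage shares of the errors are reported, the per-category counts it prints are rounded estimates rather than reported values.

```python
# Counts reported in the evaluation section.
TOTAL_PRONOUNS = 306   # third person anaphoric pronouns in the 27 texts
CORRECT = 231          # pronouns resolved to the annotated antecedent


def accuracy(correct: int, total: int) -> float:
    """Fraction of pronouns resolved correctly."""
    return correct / total


def estimated_count(share: float, errors: int) -> int:
    """Approximate number of errors behind a reported percentage share.

    This is an estimate only: the section gives shares, not raw counts.
    """
    return round(share * errors)


errors = TOTAL_PRONOUNS - CORRECT
acc = accuracy(CORRECT, TOTAL_PRONOUNS)
gender_errors = estimated_count(0.35, errors)  # gender mismatch, 35% of errors
quote_errors = estimated_count(0.14, errors)   # in-line quoted text, 14% of errors

print(f"accuracy: {acc:.1%} ({CORRECT}/{TOTAL_PRONOUNS}), errors: {errors}")
print(f"gender mismatch: about {gender_errors}; quoted text: about {quote_errors}")
```

The error total computed this way (75) agrees with the number of misinterpreted pronouns cited later in the section, and the accuracy rounds to the 75% quoted above.</Paragraph>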
    <Paragraph position="5"> Implementing a gender (dis-)agreement filter is not technically complex; as noted above, the current algorithm contains one. The persistence of gender mismatches in the output simply reflects the lack of a consistent gender slot in the LINGSOFT tagger output. Augmenting the algorithm with a lexical database which includes more detailed gender information will result in improved accuracy.</Paragraph>
    <Paragraph position="6"> Ensuring proper interpretation of anaphors both within and outside of quoted text requires, in effect, a method of evaluating quoted speech separately from its surrounding context. Although this is a complex problem, we feel that it is possible, given that our input data stream embodies a richer notion of position and context, as a result of an independent text segmentation procedure adapted from (Hearst, 1994) (and discussed above in section 2.2.2).</Paragraph>
    <Paragraph position="7"> What is worth noting is the small number of errors which can be directly attributed to the absence of configurational information. Of the 75 misinterpreted pronouns, only 2 involved a failure to establish configurationally determined disjoint reference (both of these involved Condition 3), and only a few additional errors could be unambiguously traced to a failure to correctly identify the syntactic context in which a discourse referent appeared (as determined by a misfire of the salience factors sensitive to syntactic context, HEAD-S and ARG-S).</Paragraph>
    <Paragraph position="8"> Overall, these considerations lead to two conclusions.</Paragraph>
    <Paragraph position="9"> 3 The set of 306 &quot;anaphoric&quot; pronouns excluded 30 occurrences of &quot;expletive&quot; it not identified by the expletive patterns (primarily occurrences in object position), as well as 6 occurrences of it which referred to a VP or propositional constituent. We are currently refining the existing expletive patterns for improved accuracy.</Paragraph>
    <Paragraph position="10"> First, with the incorporation of more explicit morphological and contextual information, it should be possible to increase the overall quality of our output, bringing it much closer in line with Lappin and Leass' results. Again, straight comparison would not be trivial, as e.g. quoted text passages are not a natural part of computer manuals, and are, on the other hand, an extremely common occurrence in the types of text we are dealing with.</Paragraph>
    <Paragraph position="11"> Second, and most importantly, the absence of explicit configurational information does not result in a substantial degradation in the accuracy of an anaphora resolution algorithm that is otherwise similar to that described in (Lappin and Leass, 1994).</Paragraph>
  </Section>
</Paper>