<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-2172">
  <Title>Reference Resolution beyond Coreference: a Conceptual Frame and its Application Andrei POPESCU-BELIS, Isabelle ROBBA and Gérard SABAH Language and Cognition Group, LIMSI-CNRS</Title>
  <Section position="4" start_page="1050" end_page="1051" type="evalu">
    <SectionTitle>
4 Results and comments
</SectionTitle>
    <Paragraph position="0"> The three heuristics H1, H2 and H3 have been tested on our system, while keeping all other numeric parameters constant. The results in Table 2 show that, on average, heuristic H3 gives the same results as H1 and is better than H2. As explained above, H2 is clearly too restrictive.</Paragraph>
    <Paragraph position="1"> Different tests have been performed to analyze the system's results. If MR activation is not used, the scores decrease dramatically, by ca. 50%. When the H4 heuristic (a variable average between H2 and H3) is used, the results are generally no better than those of H3 (except for VA). Compatibility with only one RE of the MR thus seems a good heuristic.</Paragraph>
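The paper does not give H2 and H3 formally; a minimal sketch, assuming that H2 requires a candidate RE to be compatible with all REs already attached to a mental representation (MR), while H3 requires compatibility with at least one of them (consistent with H2 being "too restrictive"). The `compatible` test below is purely hypothetical:

```python
def compatible(re1, re2):
    """Hypothetical compatibility test between two referring
    expressions (REs): here, simple agreement of the head noun."""
    return re1["head"] == re2["head"]

def h2(candidate, mr_res):
    """H2 (restrictive): the candidate must be compatible with
    ALL REs already stored in the MR."""
    return all(compatible(candidate, re) for re in mr_res)

def h3(candidate, mr_res):
    """H3 (permissive): compatibility with at least ONE RE of
    the MR suffices to allow attachment."""
    return any(compatible(candidate, re) for re in mr_res)
```

Under this reading, H3 accepts a candidate as soon as one stored RE matches, which is why it tolerates MRs whose REs are heterogeneous, while H2 rejects any MR containing even one incompatible RE.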
    <Paragraph position="2"> [Table 2: results for VA, LPG.eq and LPG] This is confirmed when the selection constraints are applied to a limited subset of MR.list-of-REs. The worst results are obtained when this subset fails to include the shortest non-pronominal REs of an MR, which shows that these shortest strings (one or several) constitute a sort of 'standard name' for the referent, one that suffices to resolve the other references. The good score of H1 also tends to confirm this view.</Paragraph>
    <Paragraph position="3"> An optimization algorithm based on gradient descent has been implemented to tune the activation parameters of the system. Not surprisingly, the local optimum sometimes has no cognitive relevance, since the search is guided by nothing other than the recall+precision score. A local optimum obtained on one text still leads to good (but not optimal) scores on the other texts. Trained on VA, optimization led to a cumulated improvement of 4.3% (precision + recall), and of +2.5% on LPG.eq, or, in another trial, +5.9%.</Paragraph>
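The tuning procedure can be sketched as a simple local search over the activation parameters; the system's actual parameters and scoring function are not given, so the parameter names and the toy score below are hypothetical stand-ins, and finite-step coordinate ascent replaces true gradient descent:

```python
def tune(params, score, step=0.1, iters=50):
    """Coordinate-ascent sketch: nudge each activation parameter
    up or down by `step` and keep any move that improves the
    recall+precision score; stop at a local optimum."""
    best = score(params)
    for _ in range(iters):
        improved = False
        for k in params:
            for delta in (+step, -step):
                trial = dict(params)
                trial[k] += delta
                s = score(trial)
                if s > best:
                    params, best = trial, s
                    improved = True
        if not improved:
            break  # local optimum for this step size
    return params, best
```

As the paper notes, a local optimum found this way may have no cognitive relevance: the search only sees the scalar score, not whether the resulting parameter values are plausible.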
    <Paragraph position="4"> [Figure 2: recall and precision for working-memory sizes between 2, left, and 60, right] Finally, the limited-size buffer storing the MRs, a cognitively inspired feature, was studied. Variations of the system's performance according to the size of this &quot;working memory&quot; show that it has an optimal size, around 20 MRs (Figure 2). A smaller memory increases recall errors, as important MRs are not remembered. A larger memory leads to more erroneous attachments (precision errors), because the number of MRs available for attachment exceeds the selectiveness of the selection rules.

Conclusion

A theoretical model for reference resolution has been presented, as well as an implementation based on the model, which uses only elementary knowledge, available for unrestricted texts. The model shows both greater conceptual accuracy and higher cognitive relevance. Further technical work will seek a better use of the syntactic information; semantic knowledge will at first be derived from a synonym dictionary, awaiting the development of a significant set of canonical conceptual graphs.</Paragraph>
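The limited-size working memory studied above can be sketched as a bounded buffer of MRs. The eviction policy below (dropping the least recently activated MR) is an assumption; the paper only states that the buffer has a limited size, with an optimum around 20 MRs:

```python
from collections import OrderedDict

class WorkingMemory:
    """Bounded store of mental representations (MRs).

    Only the MRs currently held are available for attachment;
    a small capacity loses important MRs (recall errors), a
    large one offers too many attachment candidates (precision
    errors)."""

    def __init__(self, capacity=20):
        self.capacity = capacity
        self._mrs = OrderedDict()

    def activate(self, mr_id, mr):
        """(Re)insert the MR at the most-recent end; evict the
        least recently activated MR when over capacity."""
        self._mrs.pop(mr_id, None)
        self._mrs[mr_id] = mr
        if len(self._mrs) > self.capacity:
            self._mrs.popitem(last=False)

    def candidates(self):
        """MR ids currently available for attachment, oldest first."""
        return list(self._mrs)
```

Re-activating an MR (e.g. when it is mentioned again) moves it to the most-recent end, so frequently referenced entities stay in memory while stale ones are evicted.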
    <Paragraph position="5"> Further conceptual work, besides the study of complex plurals, will concern the integration of time into mental representations, as well as point-of-view information.</Paragraph>
  </Section>
</Paper>