<?xml version="1.0" standalone="yes"?>
<Paper uid="C02-1075">
  <Title>A Novel Disambiguation Method For Unification-Based Grammars Using Probabilistic Context-Free Approximations</Title>
  <Section position="6" start_page="89" end_page="89" type="evalu">
    <SectionTitle>
5 Evaluation
</SectionTitle>
    <Paragraph position="0"> Evaluation task. To evaluate our models, we used the testing corpus mentioned in section 4. In a next step, the correct parse was indicated by a human disambiguator, according to the intended reading. The average ambiguity of this corpus is about 1.4 parses per sentence, for sentences with about 5.8 words on average.</Paragraph>
    <Paragraph position="1"> Our statistical disambiguation method was tested on an exact match task, where exact correspondence of the manually annotated correct parse and the most probable parse is checked. Performance on this evaluation task was assessed according to the following evaluation measure:  training iterations for probabilistic context-free approximations, starting with uniform and random probabilities for the grammar rules. Baseline is the disambiguation accuracy of the symbolic approximated UBG.</Paragraph>
    <Paragraph position="2"> where &amp;quot;correct&amp;quot; and &amp;quot;incorrect&amp;quot; specifies a success or failure on the evaluation tasks, resp.</Paragraph>
    <Paragraph position="3"> Evaluation results. First, we calculated a random baseline by randomly selecting a parse for each sentence of the test corpus. This baseline measures the disambiguation power of the pure symbolic parser and was around 72% precision.</Paragraph>
    <Paragraph position="4"> Optimal iteration numbers were decided by repeated evaluation of the models at every iteration step. Fig. 6 shows the precision of the models on the exact match task plotted against the number of iterations of the training algorithm. The baseline represents the disambiguation accuracy of the symbolic approximated UBG which is clearly outperformed by inside-outside estimation, starting with uniform or random probabilities for the rules of the CF approximation. A clear overtraining effect occurs for both cases (see iterations a0 a5 and a0 a1 a19 , resp.). A comparison of the models with our random baseline shows an increase in precision of about 16%. Although we tried hard to improve this gain by varying the starting parameters, we wish to report that we found no better starting parameters than uniform probabilities for the grammar rules.</Paragraph>
  </Section>
class="xml-element"></Paper>