<?xml version="1.0" standalone="yes"?>
<Paper uid="C02-2005">
  <Title>Scaled log likelihood ratios for the detection of abbreviations in text corpora</Title>
  <Section position="5" start_page="10" end_page="11" type="evalu">
    <SectionTitle>
4.2 Results of second experiment
</SectionTitle>
    <Paragraph position="0"> The results of the second experiment are reported in table (18) for the articles from the Wall Street Journal, and in table (19) for the articles from the Neue Zürcher Zeitung. The scaled log likelihood ratio approach generally outperforms the baseline approach. This is reflected in the F measure as well as in the error rate, which is reduced to a third. For one article (WSJ_1) the present approach actually performs below the baseline (cf. section 5).</Paragraph>
    <Paragraph position="2">  Manning/Schütze (1999:269) criticize the use of accuracy and error if the number of true negatives - C(&lt;A&gt; - &lt;A&gt;) in the present case - is large. Since the number of true negatives is small here, accuracy and error escape this criticism.</Paragraph>
    <Paragraph position="3">  C(&lt;X&gt; - &lt;Y&gt;) is the number of X which have been wrongly classified as Y. In (16), P stands for the precision, and R for the recall.</Paragraph>
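The metrics discussed above (precision, recall, F measure, error rate) all follow from the confusion counts C(&lt;X&gt; - &lt;Y&gt;). A minimal sketch of how they relate, assuming the standard definitions (the function name and the example counts below are illustrative, not taken from the paper's tables):

```python
def metrics(tp, fp, fn, tn):
    """Compute precision, recall, F measure, and error rate
    from confusion counts: tp = true positives, fp = false
    positives (non-abbreviations classified as abbreviations),
    fn = false negatives, tn = true negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F measure: harmonic mean of precision and recall
    f = 2 * precision * recall / (precision + recall)
    # error rate: share of all items that were misclassified
    error = (fp + fn) / (tp + fp + fn + tn)
    return precision, recall, f, error
```

For instance, with 9 true positives, 1 false positive, and no other errors, precision is 0.9, recall is 1.0, and the error rate is 0.1.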
    <Paragraph position="4">  In general, the articles from the NZZ contained fewer abbreviations, which is reflected in the comparatively high baseline scores. Still, the present approach is able to outperform the baseline approach. Particularly noteworthy are the articles NZZ_1, NZZ_4, and NZZ_8, where the error rate is reduced to 0. In general, the error rate has been reduced to a fifth.</Paragraph>
  </Section>
</Paper>