<?xml version="1.0" standalone="yes"?>
<Paper uid="E99-1035">
  <Title>A Cascaded Finite-State Parser for Syntactic Analysis of Swedish</Title>
  <Section position="5" start_page="246" end_page="247" type="evalu">
    <SectionTitle>
4 Evaluation
</SectionTitle>
    <Paragraph position="0"> The performance of the parser depends in part on the output of the tagger and the rest of the pre-processing software. We judge how &quot;correct&quot; the parser's output is pragmatically, by consulting modern Swedish syntax literature. We use the following metrics: precision (P), recall (R), F-value (F) and cross-bracketed rate. The F-value is defined as $F = \frac{(\beta^2 + 1)PR}{\beta^2 P + R}$, where $\beta$ is a parameter encoding the relative importance of R and P; here $\beta = 1$. Evaluation is performed automatically using the evalb evaluation software (Sekine &amp; Collins, 1997).</Paragraph>
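    <Paragraph> The bracket-based metrics can be illustrated with a minimal Python sketch. It is not the evalb implementation: it assumes each constituent is a (label, start, end) triple, compares the brackets of a single sentence as plain sets, and ignores cross-bracketing. The names f_measure and bracket_scores and the example brackets are hypothetical, chosen only for illustration.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F = (beta^2 + 1) * P * R / (beta^2 * P + R); beta = 1 weights P and R equally."""
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero, so F is defined as zero here.
        return 0.0
    return (beta ** 2 + 1) * precision * recall / denominator


def bracket_scores(gold, test, beta: float = 1.0):
    """Return (precision, recall, F) for two collections of (label, start, end) brackets.

    Simplifying assumption: duplicate brackets are collapsed because sets are used.
    """
    gold_set, test_set = set(gold), set(test)
    matched = len(gold_set.intersection(test_set))
    precision = matched / len(test_set) if test_set else 0.0
    recall = matched / len(gold_set) if gold_set else 0.0
    return precision, recall, f_measure(precision, recall, beta)


if __name__ == "__main__":
    # Hypothetical gold and parser brackets for one five-token sentence.
    gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
    test = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 4, 5)]
    print(bracket_scores(gold, test))
```

In this example three of the four parser brackets match the gold brackets, so precision, recall and the F-value are all 0.75.</Paragraph>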
    <Section position="1" start_page="246" end_page="247" type="sub_section">
      <SectionTitle>
4.1 'Gold Standard' and Error Analysis
</SectionTitle>
      <Paragraph position="0"> For the evaluation of Cass-SWE we use three types of text: (i) a sample taken from a manually annotated Swedish corpus of 100,000 words with grammatical information (SynTag, Järborg, 1990); (ii) newspaper material; and (iii) a test suite of uncommon constructions, compiled by consulting Swedish syntax literature. Texts (ii) and (iii) were annotated manually. In total, the test material comprises 1,500 tokens in 117 sentences.</Paragraph>
      <Paragraph position="1"> The evaluation results are given in Table 1, for both noun phrases (NPs) and full chunk parsing (All). The errors found can be divided into: (i) errors in the texts themselves, which we cannot control and which are difficult to discover if the texts are not proofread prior to processing; (ii) errors produced by the tagger; and (iii) grammatical errors produced by the parser, caused mainly by the lack of an appropriate pattern in the rules, and occurring almost exclusively in higher-order clauses due to structural ambiguity and coordination problems. None of the errors in (i) and (ii) have been manually corrected. This was a conscious choice, so that the evaluation of the parser is based on unrestricted data.</Paragraph>
      <!-- Proceedings of EACL '99 -->
    </Section>
  </Section>
</Paper>