<?xml version="1.0" standalone="yes"?>
<Paper uid="N03-3006">
  <Title>A low-complexity, broad-coverage probabilistic Dependency Parser for English</Title>
  <Section position="7" start_page="0" end_page="0" type="concl">
    <SectionTitle>
5 Preliminary Evaluation
</SectionTitle>
    <Paragraph position="0"> The probabilistic language models have been trained on sections 2 to 24, and the parser has been tested on section 0. For evaluation, the held-out training data and the first-ranked reading for each sentence of section 0 are compared (Lin, 1995). Parsing the 46527 words of section 0 takes 30 minutes on an 800 MHz Pentium 3 PC, including about 3 minutes for tagging and chunking. Current precision and recall values for subject, object and PP-attachment relations, and for the disambiguation between prepositions and complements, are given in Table 1.</Paragraph>
    <Paragraph position="1"> These results, slightly lower than the state of the art (Lin, 1998; Preiss, 2003), are lower-bound figures or a proof of concept rather than accurate figures. On the one hand, the performance of the parser suffers from mistaggings and mischunkings or from a limited grammar, the price paid for the speed increase. On the other hand, different grammatical assumptions, both between the Treebank and the chunker and between the Treebank and functional dependency, seriously affect the evaluation. For example, the chunker often recognizes units longer than base NPs, like [many of the people], or units shorter or longer than verbal groups, like [has] for a long time [been] or [likely to bring]. These are correct chunks, but they are currently counted as errors.</Paragraph>
    <Paragraph position="2"> In addition, it is very difficult to keep the tgrep extraction queries from overgenerating or from missing configurations. It turns out that the mapping is accurate enough for a statistical model but not for a reliable evaluation. Some possible configurations are missed by the current extraction queries. For example, extraposed PPs, such as the one starting this sentence, have remained unmapped until now. For the future, the use of a standardized DG test suite is envisaged (Carroll et al., 1999). The grammar explicitly excludes a number of grammatical phenomena which cannot currently be treated reliably. For example, since no PP-interaction model, such as PCFG rules for NP-attached PPs, exists yet, the current grammar does not allow an NP to take several PPs, which affects the analysis of relational nouns. The statistical models, the dependency extraction, the grammar, the tagger and chunker approach and the evaluation method will continue to be improved.</Paragraph>
  </Section>
</Paper>