<?xml version="1.0" standalone="yes"?>
<Paper uid="N06-2015">
  <Title>OntoNotes: The 90% Solution</Title>
  <Section position="3" start_page="0" end_page="57" type="intro">
    <SectionTitle>
2 Treebanking
</SectionTitle>
    <Paragraph position="0"> The Penn Treebank (Marcus et al., 1993) is annotated with information that makes predicate-argument structure easy to decode, including function tags and markers of &quot;empty&quot; categories that represent displaced constituents. To expedite later stages of annotation, we have developed a parsing system (Gabbard et al., 2006) that recovers both of these annotations, the first such system we know of. The first-stage parser matches the Collins (2003) parser, on which it is based, on the Parseval metric, while simultaneously achieving near state-of-the-art performance on recovering function tags (F-measure 89.0). The second stage, a seven-stage pipeline of maximum entropy learners and voted perceptrons, achieves state-of-the-art performance (F-measure 74.7) on the recovery of empty categories by combining a linguistically informed architecture and a rich feature set with the power of modern machine learning methods.</Paragraph>
  </Section>
</Paper>