File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/05/p05-1038_abstr.xml

Size: 1,040 bytes

Last Modified: 2025-10-06 13:44:26

<?xml version="1.0" standalone="yes"?>
<Paper uid="P05-1038">
  <Title>Lexicalization in Crosslinguistic Probabilistic Parsing: The Case of French</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> This paper presents the first probabilistic parsing results for French, using the recently released French Treebank. We start with an unlexicalized PCFG as a baseline model, which is enriched to the level of Collins' Model 2 by adding lexicalization and subcategorization. The lexicalized sister-head model and a bigram model are also tested, to deal with the flatness of the French Treebank. The bigram model achieves the best performance: 81% constituency F-score and 84% dependency accuracy. All lexicalized models outperform the unlexicalized baseline, consistent with probabilistic parsing results for English, but contrary to results for German, where lexicalization has only a limited effect on parsing performance.</Paragraph>
  </Section>
</Paper>