File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/05/p05-1038_abstr.xml
Size: 1,040 bytes
Last Modified: 2025-10-06 13:44:26
<?xml version="1.0" standalone="yes"?> <Paper uid="P05-1038"> <Title>Lexicalization in Crosslinguistic Probabilistic Parsing: The Case of French</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> This paper presents the first probabilistic parsing results for French, using the recently released French Treebank. We start with an unlexicalized PCFG as a base-line model, which is enriched to the level of Collins' Model 2 by adding lexicalization and subcategorization. The lexicalized sister-head model and a bigram model are also tested, to deal with the flatness of the French Treebank. The bigram model achieves the best performance: 81% constituency F-score and 84% dependency accuracy. All lexicalized models outperform the unlexicalized baseline, consistent with probabilistic parsing results for English, but contrary to results for German, where lexicalization has only a limited effect on parsing performance.</Paragraph> </Section> class="xml-element"></Paper>