
<?xml version="1.0" standalone="yes"?>
<Paper uid="P97-1003">
  <Title>Three Generative, Lexicalised Models for Statistical Parsing</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Generative models of syntax have been central in linguistics since they were introduced in (Chomsky 57). Each sentence-tree pair (S,T) in a language has an associated top-down derivation consisting of a sequence of rule applications of a grammar. These models can be extended to be statistical by defining probability distributions at points of non-determinism in the derivations, thereby assigning a probability 7)(S, T) to each (S, T) pair. Probabilistic context free grammar (Booth and Thompson 73) was an early example of a statistical grammar.</Paragraph>
    <Paragraph position="1"> A PCFG can be lexicalised by associating a head-word with each non-terminal in a parse tree; thus far, (Magerman 95; Jelinek et al. 94) and (Collins 96), which both make heavy use of lexical information, have reported the best statistical parsing performance on Wall Street Journal text. Neither of these models is generative, instead they both estimate 7)(T\] S) directly.</Paragraph>
    <Paragraph position="2"> This paper proposes three new parsing models.</Paragraph>
    <Paragraph position="3"> Model 1 is essentially a generative version of the model described in (Collins 96). In Model 2, we extend the parser to make the complement/adjunct distinction by adding probabilities over subcategorisation frames for head-words. In Model 3 we give a probabilistic treatment of wh-movement, which This research was supported by ARPA Grant N6600194-C6043.</Paragraph>
    <Paragraph position="4"> is derived from the analysis given in Generalized Phrase Structure Grammar (Gazdar et al. 95). The work makes two advances over previous models: First, Model 1 performs significantly better than (Collins 96), and Models 2 and 3 give further improvements -- our final results are 88.1/87.5% constituent precision/recall, an average improvement of 2.3% over (Collins 96). Second, the parsers in (Collins 96) and (Magerman 95; Jelinek et al.</Paragraph>
    <Paragraph position="5"> 94) produce trees without information about wh-movement or subcategorisation. Most NLP applications will need this information to extract predicate-argument structure from parse trees.</Paragraph>
    <Paragraph position="6"> In the remainder of this paper we describe the 3 models in section 2, discuss practical issues in section 3, give results in section 4, and give conclusions in section 5.</Paragraph>
  </Section>
class="xml-element"></Paper>