<?xml version="1.0" standalone="yes"?>
<Paper uid="W05-1528">
  <Title>k-NN for Local Probability Estimation in Generative Parsing Models</Title>
  <Section position="5" start_page="0" end_page="202" type="concl">
    <SectionTitle>
3 Experiments
</SectionTitle>
    <Paragraph position="0"> Our model is trained on sections 2 to 21 inclusive of the Penn WSJ treebank and tested on section 23.</Paragraph>
    <Paragraph position="1"> We used sections 0, 1, 22 and 24 for validation.</Paragraph>
    <Paragraph position="2"> We re-estimated the probability of each parse using our own baseline model, which is a replication of Collins Model 1. We tested k-NN estimation on the head-generation parameter class and on the parameter classes for generating modifying nonterminals, further decomposing the two modifying nonterminal parameter classes. Table 1 outlines the parameter classes estimated using k-NN in the final model settings, and shows the feature sets used for each parameter class as well as the constraint feature settings.</Paragraph>
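The local probability estimates described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the use of an L1 distance over numeric feature vectors, and the smoothing constant are all assumptions made for the example; it shows how the k nearest training events vote for an outcome, with each neighbour weighted by inverse distance (the weighting the authors report works best).

```python
import numpy as np
from collections import defaultdict

def knn_prob_estimate(query, events, k=2):
    """Estimate P(outcome given query) from the k nearest training
    events, weighting each neighbour by inverse distance.
    events: list of (feature_vector, outcome) pairs. Illustrative only."""
    feats = np.array([f for f, _ in events], dtype=float)
    # L1 distance between the query features and every stored event
    dists = np.abs(feats - np.array(query, dtype=float)).sum(axis=1)
    idx = np.argsort(dists)[:k]                  # k nearest neighbours
    weights = 1.0 / (dists[idx] + 1e-9)          # inverse-distance weighting
    totals = defaultdict(float)
    for i, w in zip(idx, weights):
        totals[events[i][1]] += w                # accumulate weight per outcome
    z = sum(totals.values())
    return {o: w / z for o, w in totals.items()} # normalise to a distribution
```

In the parser the "outcomes" would be, e.g., head child labels or modifier nonterminals, and the feature vectors would encode the conditioning context of Table 1 rather than raw numbers.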
    <Paragraph position="4"> the final model. CH is the head child label, Cp the parent constituent label, wp the head word, tp the head part-of-speech (POS) tag. Ci, wi and ti are the modifier's label, head word and head POS tag. tgp is the grand-parent POS tag, Cgp, Cggp, Cgggp are the labels of the grandparent, great-grandparent and great-great-grandparent nodes. dir is a flag which indicates whether the modifier being generated is to the left or the right of the head child. dist is the distance metric used in the Collins parser. coord, punc are the coordination and punctuation flags. NPB stands for base noun phrase.</Paragraph>
    <Paragraph position="5"> We extend the original feature sets by increasing the order of both horizontal and vertical markovization. From each constituent node in the vertical or horizontal history we chose features from among the constituent's nonterminal label, its head word and the head word's part-of-speech tag.</Paragraph>
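Increasing the order of vertical markovization amounts to conditioning on a longer chain of ancestor nodes. A minimal sketch, with an assumed STOP padding symbol (the function and symbol names are illustrative, not taken from the paper):

```python
def vertical_features(ancestors, order=3):
    """Take the nonterminal labels of up to `order` ancestors
    (parent, grandparent, great-grandparent, ...) as features,
    padding with a STOP symbol when the history is shorter."""
    labels = list(ancestors[:order])
    pad = ["STOP"] * (order - len(labels))
    return tuple(labels + pad)
```

Horizontal markovization is analogous, reading previously generated sibling modifiers instead of ancestors; head words and POS tags can be appended to each label in the same way.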
    <Paragraph position="6"> We found that, for all parameter classes, k = 10,000 or k = 20,000 worked best. The distance weighting function that worked best was the inverse distance function. Results reported are for sentences of at most 40 words, from section 23 of the Penn treebank. LP/LR = Labelled Precision/Recall. CO99 M1 and M2 are (Collins, 1999) Models 1 and 2 respectively. Bikel 1-best is (Bikel, 2004). k-NN is our final k-NN model.</Paragraph>
    <Paragraph position="7"> With our k-NN model we achieve LP/LR of 89.1%/89.4% on sentences of at most 40 words. These results show an 8% relative reduction in f-score error over our Model 1 baseline and a 4% relative reduction in f-score error over the Bikel parser.</Paragraph>
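The relative error reduction quoted above measures what fraction of the remaining f-score error the new model removes. A short worked sketch (the baseline f-score used in the test is a hypothetical number, not one reported in the paper):

```python
def f_score(lp, lr):
    """Harmonic mean of labelled precision and recall, in percent."""
    return 2.0 * lp * lr / (lp + lr)

def relative_error_reduction(f_base, f_new):
    """Fraction of the remaining f-score error (100 minus f_base)
    that is removed by moving from f_base to f_new."""
    return (f_new - f_base) / (100.0 - f_base)
```

For example, with LP/LR of 89.1%/89.4% the f-score is about 89.25%; against a baseline f-score of f_base, the relative error reduction is (89.25 minus f_base) divided by (100 minus f_base).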
    <Paragraph position="8"> We compared the results of our k-NN model against the Bikel 1-best parser results using a paired t-test, where the data points being compared were the scores of each parse in the two different sets of parses. The 95% confidence interval for the mean difference between the scores of the paired sets of parses is [0.029, 0.159], with P&lt; .005.</Paragraph>
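A paired t-test of per-sentence parse scores, with a confidence interval for the mean difference, can be sketched as below. This is an illustrative recipe, not the authors' evaluation code; the function name and the synthetic data in any usage are assumptions.

```python
import numpy as np
from scipy import stats

def paired_score_test(scores_a, scores_b, alpha=0.05):
    """Paired t-test on per-sentence parse scores, returning the p-value
    and a (1 minus alpha) confidence interval for the mean difference."""
    d = np.asarray(scores_a, dtype=float) - np.asarray(scores_b, dtype=float)
    n = d.size
    t_stat, p_value = stats.ttest_rel(scores_a, scores_b)
    # half-width of the CI from the t distribution with n-1 degrees of freedom
    half = stats.t.ppf(1.0 - alpha / 2.0, n - 1) * d.std(ddof=1) / np.sqrt(n)
    return p_value, (d.mean() - half, d.mean() + half)
```

A confidence interval for the mean score difference that excludes zero, as in the [0.029, 0.159] interval above, indicates a statistically significant improvement.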
    <Paragraph position="9"> Following (Collins 2000), the score of a parse takes into account the number of constituents in the gold-standard parse for the sentence. These results show that the methods presented in this paper can produce significant improvements in parser accuracy over the baseline parser.</Paragraph>
  </Section>
</Paper>