<?xml version="1.0" standalone="yes"?>
<Paper uid="W00-1325">
<Title>Statistical Filtering and Subcategorization Frame Acquisition</Title>
<Section position="6" start_page="203" end_page="204" type="concl">
<SectionTitle>4 Conclusion</SectionTitle>
<Paragraph position="0">This paper explored three possibilities for filtering out incorrect SCF entries produced by an SCF acquisition system. These were (i) a version of Brent's binomial hypothesis test (BHT), commonly used for this purpose, (ii) the binomial log-likelihood ratio (LLR) test, recommended for use with low-frequency data, and (iii) a simple method using a threshold on the MLEs of the SCFs output by the system. Surprisingly, the simple MLE thresholding method worked best. The BHT and LLR both produced an astounding number of false positives (FPs), particularly at low frequencies. Further work on handling low-frequency data in SCF acquisition is warranted. A non-parametric statistical test, such as Fisher's exact test, recommended by Pedersen (1996), might improve on the results obtained using parametric tests. However, our experiments suggest that it would be better to avoid hypothesis tests that make use of the unconditional distribution.</Paragraph>
<Paragraph position="1">One possibility is to put more effort into the estimation of pe, and to avoid using the unconditional distribution for this. In some recent experiments, we tried optimising the estimates of pe according to the performance of the system on the target SCF, using the method proposed by Briscoe, Carroll and Korhonen (1997). The estimates of pe were obtained from a training set separate from the held-out BNC data used for testing. The new estimates of pe gave an improvement of 10% precision and 6% recall over the BHT results reported here. Nevertheless, precision was still 14% lower than with MLE thresholding, and although recall was 4% higher, the overall performance was 3.9 worse than that of MLE according to the F measure. Lapata (1999) also reported that a simple relative frequency cut-off produced slightly better results than a Brent-style BHT.</Paragraph>
<Paragraph position="2">If MLE thresholding persistently achieves better results, it would be worth investigating ways of handling the low-frequency data, such as smoothing, for integration with this method. However, more sophisticated smoothing methods, which back off to an unconditional distribution, will also suffer from the lack of correlation between the conditional and unconditional SCF distributions. Any statistical test would work better at low frequencies than the MLE, which simply disregards all low-frequency SCFs. In our experiments, if we had used MLE thresholding only for the high-frequency data, and the BHT for the medium- and low-frequency data, then overall we would have obtained 54% precision and 67% recall. It certainly seems worth employing hypothesis tests which do not rely on the unconditional distribution for the low-frequency SCFs.</Paragraph>
</Section>
</Paper>
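The conclusion compares the three filters without restating their formulations, so the following minimal sketch (standard-library Python) shows one common way each of them can be written down. It is not the authors' implementation: the counts, the error probability pe, the significance level, and the MLE cut-off below are hypothetical placeholders, the BHT follows Brent's usual binomial-tail formulation, the LLR follows the standard two-binomial (Dunning-style) form, and all function names are invented for illustration.

# Illustrative sketch of the three SCF filtering strategies discussed above:
# a Brent-style binomial hypothesis test (BHT), a binomial log-likelihood
# ratio (LLR) test, and a simple threshold on the maximum likelihood
# estimate (MLE). All numeric values are hypothetical, not from the paper.

import math


def binomial_tail(m: int, n: int, p: float) -> float:
    """P(X >= m) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(m, n + 1))


def bht_accept(m: int, n: int, pe: float, alpha: float = 0.05) -> bool:
    """Brent-style filter: keep the SCF if seeing m or more cue occurrences
    out of n is unlikely to arise from the error probability pe alone."""
    return binomial_tail(m, n, pe) < alpha


def log_likelihood(k: int, n: int, p: float) -> float:
    """Binomial log-likelihood, guarding the p = 0 and p = 1 boundaries."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    return k * math.log(p) + (n - k) * math.log(1 - p)


def llr_accept(k1: int, n1: int, k2: int, n2: int, critical: float = 3.84) -> bool:
    """Binomial LLR test: compare the SCF rate for the target verb (k1/n1)
    with the rate for all other verbs (k2/n2); 3.84 is the chi-square
    critical value at 0.05 with one degree of freedom."""
    p1, p2, p = k1 / n1, k2 / n2, (k1 + k2) / (n1 + n2)
    llr = 2 * (log_likelihood(k1, n1, p1) + log_likelihood(k2, n2, p2)
               - log_likelihood(k1, n1, p) - log_likelihood(k2, n2, p))
    return llr > critical and p1 > p2


def mle_accept(m: int, n: int, threshold: float = 0.01) -> bool:
    """MLE thresholding: keep the SCF if its relative frequency with the
    verb reaches an empirically tuned cut-off."""
    return m / n >= threshold


def f_measure(precision: float, recall: float) -> float:
    """Balanced F measure, the metric used above to compare filters."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts: the cue for an SCF occurs 3 times in 200
    # occurrences of the target verb, and 500 times in 100,000 occurrences
    # of all other verbs; pe is an illustrative error probability.
    m, n = 3, 200
    k2, n2 = 500, 100_000
    pe = 0.005

    print("BHT keeps SCF:", bht_accept(m, n, pe))
    print("LLR keeps SCF:", llr_accept(m, n, k2, n2))
    print("MLE keeps SCF:", mle_accept(m, n))
    print("F at P=0.54, R=0.67:", round(f_measure(0.54, 0.67), 3))

On these made-up counts the MLE threshold keeps the frame while the two hypothesis tests do not; the example is only meant to show where the three decision rules can diverge at low frequencies, not to reproduce the paper's results.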