<?xml version="1.0" standalone="yes"?> <Paper uid="P94-1026"> <Title>GRAMMAR SPECIALIZATION THROUGH ENTROPY THRESHOLDS</Title> <Section position="8" start_page="193" end_page="193" type="evalu"> <SectionTitle> EXPERIMENTAL RESULTS </SectionTitle> <Paragraph position="0"> A module realizing this scheme has been implemented and applied to the very setup used for the previous experiments with the hand-coded tree-cutting criteria; see [Samuelsson 1994a]. Of the verified parse trees, 2100 constituted the training set, while 230 were used for the test set. The table below summarizes the results for some grammars of different coverage extracted using: 1. Hand-coded tree-cutting criteria.</Paragraph> <Paragraph position="1"> 2. Induced tree-cutting criteria where the node entropy was taken to be the phrase entropy of the RHS phrase of the dominating grammar rule.</Paragraph> <Paragraph position="2"> 3. Induced tree-cutting criteria where the node entropy was the sum of the phrase entropy of the RHS phrase of the dominating grammar rule and the weighted sum of the phrase entropies of the LHSs of the alternative choices of grammar rules to resolve on. In the latter two cases, experiments were carried out both with and without the restrictions on neighbouring cutnodes discussed in the previous section.</Paragraph> <Paragraph position="3"> With the mixed entropy scheme it seems important to include the restrictions on neighbouring cutnodes, while this does not seem to be the case with the RHS phrase entropy scheme. 
A potential explanation for the significantly higher average parsing times for all grammars extracted using the induced tree-cutting criteria is that these are in general recursive, while the hand-coded criteria do not allow recursion, and thus only produce grammars that generate finite languages.</Paragraph> <Paragraph position="4"> Although the hand-coded tree-cutting criteria are substantially better than the induced ones, we must remember that the former produce a grammar that, in the median case, allows processing 60 times faster than the original grammar and parser do. This means that even if the induced criteria produce grammars that are a factor of two or three slower than this, they are still approximately one and a half orders of magnitude faster than the original setup. Also, this is by no means a closed research issue, but merely a first attempt to realize the scheme, and there is no doubt in my mind that it can be improved upon substantially.</Paragraph> </Section> </Paper>