<?xml version="1.0" standalone="yes"?> <Paper uid="W00-1327"> <Title>Using Semantically Motivated Estimates to Help Subcategorization Acquisition</Title> <Section position="7" start_page="219" end_page="221" type="evalu"> <SectionTitle> 4.2 Results </SectionTitle> <Paragraph position="0"> Table 5 gives average results for the 60 test verbs using each method. The results indicate that both add-one smoothing and Katz backing-off improve on the baseline performance only slightly. Linear interpolation outperforms these methods, achieving better results on all measures. The improved KL indicates that the method improves the overall accuracy of SCF distributions. The results with Rc and system accuracy show that it helps to correct the ranking of SCFs. The fact that both precision and recall show clear improvement over the baseline results demonstrates that linear interpolation can be successfully combined with the filtering method employed. These results suggest that a smoothing method which affects both the highly ranked SCFs and SCFs of low frequency is profitable for this task.</Paragraph> <Paragraph position="1"> In this experiment, the semantically motivated back-off estimates helped to reduce the sparse data problem significantly. While a total of 151 correct SCFs were missing in the test data, only three were missing after smoothing with Katz backing-off or linear interpolation.</Paragraph> <Paragraph position="2"> For comparison, we re-ran these experiments using the general SCF distribution of all verbs as back-off estimates for smoothing (see footnote 6). The average results for the 60 test verbs given in Table 6 show that when using these estimates, we obtain worse results than with the baseline method. This demonstrates that while such estimates provide an easy solution to the sparse data problem, they can actually degrade the accuracy of verbal acquisition.</Paragraph> <Paragraph position="3"> Table 7 displays individual results for the different verb classes. 
It lists the results obtained with KL and Rc using the baseline method and linear interpolation with semantically motivated estimates. Examining the results obtained with linear interpolation allows us to consider the accuracy of the back-off estimates for each verb class. Out of ten verb classes, eight show improvement with linear interpolation, with both KL and Rc. However, two verb classes - aspectual verbs, and verbs of appearance, disappearance and occurrence - show worse results when linear interpolation is used. (Footnote 6: These estimates were obtained by extracting the number of verbs which are members of each SCF class in the ANLT dictionary. See Section 2 for details.)</Paragraph> <Paragraph position="4"> According to Levin (1993), these two verb classes need further classification before a full semantic account can be made. The problem with aspectual verbs is that the class contains verbs taking sentential complements.</Paragraph> <Paragraph position="5"> As Levin does not classify verbs on the basis of their sentential complement-taking properties, more classification work is required before we can obtain accurate SCF estimates for this type of verb.</Paragraph> <Paragraph position="6"> The problem with verbs of appearance is more specific to the verb class. Levin remarks that the definition of appearance verbs may be too loose. In addition, there are significant syntactic differences between the verbs belonging to the different sub-classes.</Paragraph> <Paragraph position="7"> This suggests that we should examine the degree of SCF correlation between verbs from different sub-classes before deciding on the final (sub-)class for which we obtain the estimates. As the results with the combined Levin classes show, estimates can also be successfully built using verbs from different Levin classes, provided that the classes are similar enough.</Paragraph> </Section> </Paper>