<?xml version="1.0" standalone="yes"?>
<Paper uid="W00-1327">
  <Title>Using Semantically Motivated Estimates to Help Subcategorization Acquisition</Title>
  <Section position="8" start_page="221" end_page="221" type="concl">
    <SectionTitle>
5 Conclusion
</SectionTitle>
    <Paragraph position="0"> In this paper, we have shown that the verb-form-specific SCF distributions of semantically similar verbs correlate well. On the basis of this observation, we have proposed using verb-class-specific back-off estimates in SCF acquisition. Employing the SCF acquisition framework of Briscoe and Carroll (1997), we have demonstrated that these estimates can be used to improve SCF acquisition significantly when combined with smoothing and a simple filtering method.</Paragraph>
    <Paragraph position="1"> We have not yet explored the possibility of using the semantically motivated estimates with statistical filtering. In principle, this should help to improve the performance of those statistical methods that make use of back-off estimates. If filtering based on relative frequencies still achieves better results, it would be worth investigating ways of handling the low-frequency data for integration with this method. As Korhonen, Gorrell and McCarthy (2000) discuss, any statistical filtering method would work better at low frequencies than the one applied, since the latter simply disregards all low-frequency SCFs.</Paragraph>
    <Paragraph position="2"> In addition to refining the filtering method, our future work will focus on integrating this approach with large-scale SCF acquisition. This will involve (i) defining the set of semantic verb classes across the lexicon, (ii) obtaining back-off estimates for each verb class, and (iii) implementing a method capable of automatically assigning verbs to semantic classes. The last of these can be done by linking the WordNet synonym sets with semantic classes, using a method similar to that employed by Dorr (1997). In the research reported here, verbs were assigned to semantic classes according to their most frequent sense. While this approach proved satisfactory, our future work will include investigating better ways of addressing the problem of polysemy.</Paragraph>
    <Paragraph position="3"> The manual effort needed to obtain the back-off estimates was quite high for this preliminary experiment. However, our recent investigation shows that the total number of semantic classes across the whole lexicon is unlikely to exceed 50. This is because many of the Levin classes have proved similar enough in terms of SCF distributions that they can be combined. Therefore, the additional effort required to carry out the proposed work seems justified, given the accuracy enhancement reported.</Paragraph>
  </Section>
</Paper>