<?xml version="1.0" standalone="yes"?>
<Paper uid="N03-1013">
  <Title>A Categorial Variation Database for English</Title>
  <Section position="7" start_page="0" end_page="0" type="concl">
    <SectionTitle>
6 Conclusions and Future Work
</SectionTitle>
    <Paragraph position="0"> We have presented our approach to constructing and evaluating a new large-scale database containing categorial variations of English words. In addition, we have described different applications for which it has proven useful. Our evaluation indicates that CatVar has coverage and accuracy of over 80% (F-score) and that the database improves the linkability of the Porter stemmer by about 30%. These findings are significant contributions to several different communities, including Information Retrieval and Machine Translation.</Paragraph>
    <Paragraph position="1"> Future work includes improving the word-cluster ratio and absorbing more of the single-word clusters into existing clusters or other single-word clusters. We are also considering enrichment of the clusters with types of derivational relations such as &quot;nominal-event&quot; or &quot;doer&quot; to complement part-of-speech labels. Other lexical semantic features such as telicity, sentience and change-of-state can also be induced from morphological cues (Light, 1996).</Paragraph>
    <Paragraph position="2"> Additionally, we are interested in measuring the applied contribution of using CatVar in natural-language applications such as Information Retrieval. Finally, we intend to incorporate CatVar into new applications such as parallel corpus word alignment.</Paragraph>
  </Section>
</Paper>