<?xml version="1.0" standalone="yes"?>
<Paper uid="I05-3004">
  <Title>Chinese Classifier Assignment Using SVMs</Title>
  <Section position="8" start_page="29" end_page="30" type="concl">
    <SectionTitle>
7 Conclusions and Future Work
</SectionTitle>
    <Paragraph position="0"> Our machine learning approach to classifier assignment in Chinese performs better than previously published rule-based approaches and handles larger data sets. The noun is clearly the most important feature (experiment 1). However, we still think ontological features may be useful in classifier assignment, for example for previously unseen nouns; our experimental results show a trend in this direction, although not a statistically significant one (experiments 2 and 4).</Paragraph>
    <Paragraph position="1"> We used the Chinese Treebank for these experiments because it is the only available corpus of parsed Chinese text. Now that we have isolated the relevant features for this task, we plan to conduct further experiments using larger corpora, such as the Chinese Gigaword (Graf and Chen, 2003).</Paragraph>
    <Paragraph position="2"> Our use of ontological features could be improved in several ways. First, the ontological features we get from HowNet do not fit our purpose well. For example, HowNet defines both '猫' (cat) and '牛' (cow) as 'livestock', yet the two nouns take different classifiers. To improve the performance of our approach, we need an ontology that correctly groups nouns into classes according to their semantic properties (e.g. type, shape, color, size).</Paragraph>
    <Paragraph position="3"> Second, as another knowledge-rich approach, we could use a complex ontology plus a Chinese classifier dictionary that describes the properties of the objects each classifier can modify. Comparing noun properties with classifier characteristics could improve classifier assignment, as long as the nouns are in the ontology. However, many idiomatic noun-classifier pairings cannot be captured by dictionaries. Therefore, a combination of rule-based and machine-learning approaches seems most promising.</Paragraph>
    <Paragraph position="4"> Third, we can divide Chinese classifiers into groups and focus on those that modify single objects. Certain Chinese classifiers can be used before all plural nouns. Some classifiers specify the container of the objects, for example, '一[yi] 篮子[lanzi] 苹果[pingguo]' (a basket of apples). The classifier changes when the container changes. These can be treated differently from sortal and anaphoric classifiers.</Paragraph>
  </Section>
</Paper>