<?xml version="1.0" standalone="yes"?>
<Paper uid="I05-3004">
  <Title>Chinese Classifier Assignment Using SVMs</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> In English, numbers directly modify count nouns, as in 'two apples' and 'five computers'. Numbers cannot directly modify mass nouns; instead, an embedded noun phrase must be formed, e.g. 'five slices of bread'.</Paragraph>
    <Paragraph position="1"> However, in Chinese all nouns need numeral classifiers to express quantity (proper nouns and bare noun phrases are the exception and do not need classifiers). When translating from English to Chinese, we may therefore need to choose Chinese classifiers to form noun phrases. We can see the difference between the two languages in the following two examples:</Paragraph>
    <Paragraph position="3"> five slices of bread (English) Noun-classifier combinations appear with high frequency in Chinese. There are more than 500 classifiers, although fewer than 200 of them are frequently used. Each classifier can only be used with certain classes of noun. Nouns in a class usually have similar properties. For example, nouns that can be used with the classifier '根' [gen] include '稻草' (straw), '筷子' (chopstick) and '管子' (pipe). All these objects are long and thin. However, sometimes nouns with similar properties are in different classes. For example, '牛' (cow), '马' (horse) and '羊' (lamb) are all livestock, but they associate with different classifiers. This means that classifier assignment is not totally rule-based but partly idiomatic.</Paragraph>
    <Paragraph position="4"> In this paper, we explore the relationship between classifiers and nouns. We extract a set of features and the corresponding noun-classifier attachments from a corpus and then train SVMs to assign classifiers to nouns. In Section 4 we describe our data set. In Section 5 we describe our experiments. In Section 6 we present our results.</Paragraph>
  </Section>
</Paper>