<?xml version="1.0" standalone="yes"?>
<Paper uid="P03-1060">
  <Title>A Syllable Based Word Recognition Model for Korean Noun Extraction</Title>
  <Section position="6" start_page="5" end_page="6" type="evalu">
    <SectionTitle>
5 Experiments
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="5" end_page="5" type="sub_section">
      <SectionTitle>
5.1 Experimental environment
</SectionTitle>
      <Paragraph position="0"> We used the ETRI POS tagged corpus of 288,269 Eojeols for testing and the 21st Century Sejong Project's POS tagged corpus (the Sejong corpus, for short) for training. The Sejong corpus consists of three corpora compiled from 1999 to 2001.</Paragraph>
      <Paragraph position="1"> The Sejong corpus of 1999 consists of 1.5 million Eojeols, and the other two corpora contain 2 million Eojeols each. The evaluation measures for the noun extraction task are recall, precision, and F-measure. They are computed per document and averaged over all test documents, because noun extractors are typically used in applications such as information retrieval (IR) and document categorization. We also consider the frequency of nouns: if frequency is not considered, a noun occurring twice or more in a document is treated the same as a noun occurring once. From an IR point of view, this reflects the fact that even if a noun is extracted only once as an index term, the document containing that term can still be retrieved.</Paragraph>
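      <Paragraph> The per-document evaluation just described can be sketched in Python as follows. This is an illustrative reconstruction, not code from the paper; the function names are ours, and with use_freq=False repeated nouns in a document collapse to a single occurrence:

```python
from collections import Counter

def prf(gold, pred, use_freq=True):
    """Per-document precision/recall/F-measure for extracted nouns.

    With use_freq=True, repeated occurrences count separately
    (multiset overlap); otherwise each noun type counts once.
    """
    g = Counter(gold) if use_freq else Counter(set(gold))
    p = Counter(pred) if use_freq else Counter(set(pred))
    # multiset intersection: matched occurrences of each noun
    correct = sum(min(c, p[w]) for w, c in g.items())
    prec = correct / sum(p.values()) if p else 0.0
    rec = correct / sum(g.values()) if g else 0.0
    f = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f

def macro_average(docs, use_freq=True):
    """Average P/R/F over all test documents, as described above."""
    scores = [prf(g, p, use_freq) for g, p in docs]
    n = len(scores)
    return tuple(sum(s[i] for s in scores) / n for i in range(3))
```

For example, a noun extracted once from a document where it occurs twice counts as one miss under use_freq=True but as fully recalled under use_freq=False.</Paragraph>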
      <Paragraph position="2"> The performance depends considerably on the following factors: the representation scheme for word boundary detection, the tagset, the amount of training data, and the difference between the training data and the test data.</Paragraph>
      <Paragraph position="3"> First, we compare the four representation schemes (BI, BIS, IE, IES) for word boundary detection explained in Section 4. We experiment with the following three tagsets in order to select the best one: Tagset 1 Simply use two tags (noun and non-noun). This is intended to examine syllable characteristics; that is, which syllables tend to belong to nouns and which do not.</Paragraph>
      <Paragraph position="4"> Tagset 2 Use the tagset of the training data without modification. The ETRI tagset used for training is relatively small compared with other tagsets. This tagset changes according to the POS tagged corpus used for training.</Paragraph>
      <Paragraph position="5"> Tagset 3 Use a simplified tagset for the purpose of noun extraction, obtained by merging postpositions, adverbs, and verbal suffixes each into a single tag. This tagset remains fixed even for a different training corpus.</Paragraph>
      <Paragraph position="6"> Tagset 2, used in Section 5.2, and Tagset 3 are shown in Table 2.</Paragraph>
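      <Paragraph> The four boundary-representation schemes can be sketched as a simple syllable tagger. This is an illustrative reconstruction of the schemes named in Section 4 (B = word-begin, I = inside, E = word-end, S = single-syllable word), not the paper's own code:

```python
def encode(words, scheme="BIS"):
    """Tag each syllable of each word with a boundary label.

    words: a list of words, each given as a list of syllables.
    BI : B for a word-initial syllable, I for any other syllable.
    BIS: like BI, but a single-syllable word gets S instead.
    IE : E for a word-final syllable, I for any other syllable.
    IES: like IE, but a single-syllable word gets S instead.
    """
    tags = []
    for w in words:
        for i, _ in enumerate(w):
            if scheme in ("BIS", "IES") and len(w) == 1:
                tags.append("S")
            elif scheme.startswith("BI"):
                tags.append("B" if i == 0 else "I")
            else:  # IE-family schemes
                tags.append("E" if i == len(w) - 1 else "I")
    return tags
```

Under this reading, the S-bearing schemes (BIS, IES) give single-syllable words their own label rather than folding them into B or E.</Paragraph>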
    </Section>
    <Section position="2" start_page="5" end_page="5" type="sub_section">
      <SectionTitle>
5.2 Experimental results with similar data
</SectionTitle>
      <Paragraph position="0"> We divided the test data into ten parts. The performance of the model is measured by averaging over the ten test sets in a 10-fold cross-validation experiment. Table 3 shows the experimental results for each representation scheme and tagset. In the first column, each number denotes the tagset used. Regarding frequency, considering frequency yields better precision and F-measure but worse recall. The representation schemes using single-syllable information (&quot;BIS&quot;, &quot;IES&quot;) are better than the other schemes (&quot;BI&quot;, &quot;IE&quot;). Contrary to our expectation, the results of Tagset 2 consistently outperform the other tagsets. The results of Tagset 1 are not as good as those of the other tagsets because of the lack of syntactic context; nevertheless, they reflect the usefulness of syllable-based processing. The changes in F-measure according to tagset and representation scheme, with frequency considered, are shown in Figure 2.</Paragraph>
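      <Paragraph> The averaging over ten test sets can be sketched as follows. This is a hypothetical helper, assuming an evaluate function that returns a scalar score (e.g. an F-measure) for one test set:

```python
def cross_validate(docs, evaluate, k=10):
    """Split docs into k parts, score each part, and average.

    docs: the test documents; evaluate: a function mapping one
    fold (a list of documents) to a scalar score. Mirrors the
    averaging over ten test sets described above.
    """
    folds = [docs[i::k] for i in range(k)]  # round-robin split
    scores = [evaluate(fold) for fold in folds if fold]
    return sum(scores) / len(scores)
```

The round-robin split is one simple way to form ten parts; any partition of the test documents into ten sets fits the same scheme.</Paragraph>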
    </Section>
    <Section position="3" start_page="5" end_page="5" type="sub_section">
      <SectionTitle>
5.3 Experimental results with different data
</SectionTitle>
      <Paragraph position="0"> To show the influence of the difference between the training data and the test data, we performed experiments using the Sejong corpus as training data and the entire ETRI corpus as test data. Table 4 shows the experimental results for all three training corpora. Although more training data are used in this experiment, the results in Table 3 are better. As with other POS tagging models, this indicates that our model is dependent on the text domain.</Paragraph>
      <Paragraph position="1"> Figure 3 shows the changes in F-measure according to the size of the training data. In this figure, &quot;99-2000&quot; means that the 1999 and 2000 corpora are used, and &quot;99-2001&quot; means that all corpora are used as training data. The more training data are used, the better the performance obtained; however, the improvement is small considering the increase in the amount of training data.</Paragraph>
      <Paragraph position="2"> Results reported by Lee et al. (2001) are presented in Table 5. Their experiments were performed under the same conditions as ours.</Paragraph>
      <Paragraph position="3"> NE2001, a system designed solely to extract nouns, improves the efficiency of a general morphological analyzer by using positive and negative information about noun occurrences. KOMA (Lee et al., 1999b) is a general-purpose morphological analyzer. HanTag (Kim et al., 1998) is a POS tagger that takes the output of KOMA as input. According to Table 5, HanTag is the best-performing tool for noun extraction in terms of precision and F-measure. Although the best configuration of our proposed model (BIS-2) performs worse than HanTag, it outperforms NE2001 and KOMA.</Paragraph>
    </Section>
    <Section position="4" start_page="5" end_page="6" type="sub_section">
      <SectionTitle>
5.4 Limitation
</SectionTitle>
      <Paragraph position="0"> As mentioned earlier, we assume that no morphological variation occurs in inflected words. However, some exceptions can occur in colloquial text. For example, the lexical-level forms of the two Eojeols &quot;ddai+neun&quot; and &quot;go-gai+leul&quot; are contracted into the surface-level forms &quot;ddain&quot; and &quot;go-gail&quot;, respectively. Our model alone cannot handle these cases. Such exceptions, however, are very rare.</Paragraph>
      <Paragraph position="1"> In these experiments, we do not perform any post-processing step to deal with such exceptions.</Paragraph>
    </Section>
  </Section>
</Paper>