<?xml version="1.0" standalone="yes"?>
<Paper uid="P03-1060">
<Title>A Syllable Based Word Recognition Model for Korean Noun Extraction</Title>
<Section position="7" start_page="6" end_page="6" type="concl">
<SectionTitle> 6 Conclusion </SectionTitle>
<Paragraph position="0"> We have presented a word recognition model for extracting nouns. (Footnote: about 0.145% of nouns in the test data belong to these cases.)</Paragraph>
<Paragraph position="1"> Whereas previous noun extraction methods require morphological analysis or POS tagging, our method uses only syllable information, without any additional morphological analyzer. This means that our method does not require any dictionary or linguistic knowledge. Therefore, without the manual labor needed to construct and maintain those resources, our method can extract nouns using only statistics that can be automatically extracted from a POS-tagged corpus.</Paragraph>
<Paragraph position="2"> Previous noun extraction methods take the morpheme as the processing unit, but we adopt a new notion of word as the processing unit, based on the fact that nouns belong to the class of uninflected morphemes in Korean. By virtue of this new definition of a word, we need not consider mismatches between the surface-level form and the lexical-level form when recognizing words.</Paragraph>
<Paragraph position="3"> We have performed various experiments over a wide range of variables that influence performance, such as the representation scheme for word boundary detection, the tag set, the amount of training data, and the difference between the training data and the test data. Without morphological analysis or POS tagging, the proposed method achieves performance comparable to that of previous methods. In the future, we plan to extend the context to improve performance.</Paragraph>
<Paragraph position="4"> Although the word recognition model is designed to extract nouns in this paper, the model is of interest in its own right and can be applied to other tasks such as language modeling and automatic word spacing.</Paragraph>
<Paragraph position="5"> Furthermore, our study makes some contributions to the area of POS tagging research.</Paragraph>
</Section>
</Paper>