<?xml version="1.0" standalone="yes"?>
<Paper uid="C96-1087">
  <Title>A Probabilistic Approach to Compound Noun Indexing in Korean Texts</Title>
  <Section position="3" start_page="514" end_page="514" type="relat">
    <SectionTitle>
2 Related Work
</SectionTitle>
    <Paragraph position="0"> Previous approaches to compound noun indexing are based either on full-scale morphological analysis (Kang 1995; Kim 1983; Lee 1995; Seo 1993) or on syllabic patterns (Fujii 1993; Lee 1996; Ogawa 1993). Morphological analysis returns the morphologically valid component words that constitute a given compound word. Since this method does not exclude invalid or meaningless words, it can degrade precision. Moreover, full morphological analysis is often too expensive and requires costly maintenance.</Paragraph>
    <Paragraph position="1"> Simpler methods segment compound nouns mechanically into unigram or bigram words, all of which are regarded as index terms (Lee 1996). Bigram indexes show better precision than unigram indexes, but can suffer from a large index size. In general, existing methods for compound noun analysis have focused mainly on recall, with little attention to precision. The work presented in this paper tries to improve recall without the deterioration of preci-</Paragraph>
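The mechanical segmentation described above can be illustrated with a minimal sketch. It assumes that "unigram" and "bigram" refer to overlapping runs of one or two syllables, and that each Korean syllable is a single Unicode character; the function name syllable_ngrams and the example compound are hypothetical and are not taken from the cited work, whose exact procedure may differ.

```python
def syllable_ngrams(compound, n):
    """Return all overlapping runs of n syllables in a compound noun.

    Assumption: every syllable is one Unicode character, which holds for
    precomposed Hangul syllable blocks.
    """
    if n not in (1, 2):
        raise ValueError("this sketch only covers unigrams and bigrams")
    # A compound shorter than n syllables yields an empty list.
    return [compound[i:i + n] for i in range(len(compound) - n + 1)]


# Hypothetical example: "information retrieval", composed of the two
# nouns "정보" (information) and "검색" (retrieval).
compound = "정보검색"
unigrams = syllable_ngrams(compound, 1)  # ['정', '보', '검', '색']
bigrams = syllable_ngrams(compound, 2)   # ['정보', '보검', '검색']
```

In this example the middle bigram spans the boundary between the two component nouns, so it is not a component of the compound, yet a purely mechanical method would still index it. Every overlapping n-gram becomes an index term, which is also why the index grows quickly.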
  </Section>
</Paper>