<?xml version="1.0" standalone="yes"?> <Paper uid="W99-0635"> <Title>Corpus-Based Approach for Nominal Compound Analysis for Korean Based on Linguistic and Statistical Information</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> Accurate nominal compound analysis is crucial for applications of natural language processing such as information retrieval and extraction, as well as for nominal compound interpretation. In the nominal compound analysis area, some corpus-based approaches have reported successful results by using statistical co-occurrences of nouns. However, a nominal compound often has a structure similar to that of a simple sentence, e.g. the complement-predicate structure, in addition to expressing a compound meaning through several combined nouns. Because of these grammatical characteristics of nominal compounds, a framework based only on statistical association between nouns often fails to analyze their structures accurately, especially in Korean. This paper presents a new model for Korean nominal compound analysis on the basis of linguistic and statistical knowledge. Syntactic relations often affect the structure of nominal compounds, so we analyzed a 40-million-word corpus in order to acquire syntactic and statistical knowledge. The structure of a nominal compound is then analyzed using the extracted linguistic lexical information. Experiments show that our method is effective for accurate analysis of Korean nominal compounds.</Paragraph> </Section> </Paper>