File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/98/p98-2125_intro.xml
Size: 4,842 bytes
Last Modified: 2025-10-06 14:06:33
<?xml version="1.0" standalone="yes"?> <Paper uid="P98-2125"> <Title>Identifying Syntactic Role of Antecedent in Korean Relative Clause Using Corpus and Thesaurus Information</Title> <Section position="3" start_page="0" end_page="757" type="intro"> <SectionTitle> 2 Problems and Related Work </SectionTitle> <Paragraph position="0"> In English, the syntactic role of an antecedent can be recognized from its position (trace) in the relative clause and the valency information of the verb. For example, the antecedent man is recognized as the subject of the relative clause in the sentence &quot;He is the man who lives next door&quot; and as the object in the sentence &quot;He is the man whom I met.&quot; Relative pronouns such as who, whom, that, whose, and which can also be used to identify the role of antecedents in relative clauses.</Paragraph> <Paragraph position="1"> However, identifying the syntactic role of antecedents in Korean relative clauses is not a trivial task. Korean is a head-final language, so the antecedent comes after the relative clause. The rest of this section describes three main characteristics of Korean relative clauses that make it difficult to determine the syntactic role of their antecedents. The first characteristic is that, unlike English, Korean lacks relative words corresponding to English relative pronouns. Instead, an adnominal verb ending follows the verb stem of a relative clause modifying an antecedent. The adnominal verb ending does not provide any information about the syntactic role of the antecedent. For example, the relative clause kang-eyse hulu- (flow in a river) in sentence (1) modifies the antecedent mwul- (water), while the adnominal verb ending nun provides no clue about the syntactic role of the antecedent mwul- (water). Figure 1 shows the syntactic dependency tree (SDT) of sentence (1). 
To apply the case frames of the verb hulu- (flow) for structural disambiguation, we need to decide the syntactic role of the antecedent mwul- (water) in the argument structure of that verb. The dependency parser (Lee, 1995) gives only the syntactic relation mod between them, although the antecedent should be regarded as the subject of the relative clause.</Paragraph> <Paragraph position="2"> (1) nanun kang-eyse hulu-nun mwul-lul poattta. (I saw water that flowed in a river.) The second characteristic is that the syntactic role of an antecedent cannot be determined by word order. This is because Korean, like Japanese, Russian, or Finnish, is a relatively free word-order language, and also because some arguments of a verb are frequently omitted.</Paragraph> <Paragraph position="3"> In sentence (2), for example, the verb of the relative clause nolay-lul pwulless-ten (where [I] sang a song [at the place]) has two omitted arguments, [I] and [place]. Thus, the antecedent kos (place) might be identified as either subject or adverbial in the relative clause.</Paragraph> <Paragraph position="4"> (2) nolay-lul pwulless-ten kos-ey na-nun kassta. (I went to the place where [I] sang a song [at the place].) The third characteristic of Korean relative clauses is that the case particle of an antecedent, which indicates its syntactic role in the relative clause, is omitted during relativization. In a relatively free word-order language, case particles are very important for determining syntactic roles.</Paragraph> <Paragraph position="5"> Because of this lack of syntactic clues, it is very difficult to construct general rules for identifying the syntactic role of antecedents. Thus, corpus-based methods have been preferred to rule-based ones for syntactic role determination in Korean relative clauses. 
Yang and Kim (1993) proposed a corpus-based method in which word co-occurrence and sub-categorization scores are extracted at the lexical level for each noun/verb pair. Park and Kim (1997) described a method for determining the semantic role of antecedents using verbal patterns and statistical information from a corpus. These word co-occurrence patterns are all at the lexical level, so a large number of word co-occurrence patterns and a large amount of statistical information must be constructed before the methods can be applied to a real large-scale problem. In practice, system performance depends mainly on the application domain, the number of word co-occurrence patterns extracted, and the size of the corpus.</Paragraph> <Paragraph position="6"> In the following sections, we describe an approach to acquiring statistical information from a corpus at the conceptual level rather than the lexical level, using the conceptual hierarchy of the Kadokawa thesaurus, titled New Synonym Dictionary (Ohno and Hamanishi, 1981), and a method of syntactic role determination using the extracted knowledge. The system architecture is shown in Figure 2.</Paragraph> </Section> </Paper>