<?xml version="1.0" standalone="yes"?> <Paper uid="W04-2211"> <Title>Speech/Language Technology Research, ETRI</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Related Works </SectionTitle> <Paragraph position="0"> When constructing a verb pattern dictionary, too much dependence on the linguistic intuition of lexicographers can lead to inconsistency and incompleteness in the pattern dictionary.</Paragraph> <Paragraph position="1"> Similar problems are encountered when working with a paper dictionary because it offers too few examples. Hong et al. (2002) introduced the concept of causative/passive linking to the Korean word dictionary. The active form 'mekta (to eat)' is linked to its causative and passive forms, 'mekita (to let eat)' and 'mekhita (to be eaten)', respectively. Linking information of this sort reminds lexicographers to construct verb patterns for causative/passive verbs when they write a verb pattern for active verbs. The semi-automatic generation of verb patterns using translation equivalency was tried in Hong et al. (2002). However, as only the voice information was used as a filter, the over-generation problem is serious.</Paragraph> <Paragraph position="2"> Fujita &amp; Bond (2002) and Bond &amp; Fujita (2003) introduced a method of constructing a new valency entry from existing entries for Japanese-English MT. Their method creates valency patterns for words in the word dictionary whose English translations can be found in the valency dictionary. The created valency patterns are paraphrased using a monolingual corpus, and human translators check the grammaticality of the paraphrases.</Paragraph> <Paragraph position="3"> Yang et al. (2002) used the passive/causative alternation relation for semi-automatic verb pattern generation. 
Similar work has been done for Japanese by Baldwin &amp; Tanaka (2000) and Baldwin &amp; Bond (2002).</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 Verb Pattern in TELLUS K-C </SectionTitle> <Paragraph position="0"> The term 'verb pattern' is understood as a kind of subcategorization frame of a predicate.</Paragraph> <Paragraph position="1"> However, a verb pattern in our approach is slightly different from a subcategorization frame in traditional linguistics. The main difference is that a verb pattern is always linked to a target language word (the predicate of the target language). Therefore, a verb pattern is employed not only in the analysis phase but also in the transfer phase, so that accurate analysis can lead directly to natural and correct generation. In theoretical linguistics, a subcategorization frame always contains the arguments of a predicate. An adjunct of a predicate or a modifier of an argument is usually not included in it. However, in some cases, these words must be taken into account for proper translation.</Paragraph> <Paragraph position="2"> In translation, adjuncts of a verb or modifiers of an argument can seriously affect the selection of target words. (1) exemplifies verb patterns of &quot;cata (to sleep)&quot;:</Paragraph> <Paragraph position="4"> [param(A)ka cata: The wind has died down] 1 The slot for nominal arguments is separated by the symbol &quot;!&quot; from case markers such as &quot;ka&quot;, &quot;lul&quot;, and &quot;eykey&quot;. The verb is also separated by this symbol into the root and the ending.</Paragraph> <Paragraph position="5"> cata 2 : A=HUMAN!ka ca!ta > A -5 ? :v [ai(A)ka cata: A baby is sleeping] cata 3 : A=WATCH! 
ka ca!ta > A 0 :v [sikye(A)ka cata: A watch has run down] cata 4 : A=PHENOMENA!ka ca!ta > A G M- :v [phokpwungwu(A)ka cata: The storm has abated] On the left-hand side of &quot;>&quot;, the Korean subcategorization frame is represented. Each argument position is filled with a variable (A, B, or C) equated with a semantic feature (WEATHER, HUMAN, WATCH, PHENOMENA). Currently we employ about 410 semantic features for nominal semantic classification. The Korean parts of verb patterns are employed for syntactic parsing.</Paragraph> <Paragraph position="6"> On the right-hand side of &quot;>&quot;, the Chinese translation is given with the marker &quot;:v&quot;. This part serves the transfer and generation of the Chinese sentence. An example sentence is attached to every pattern to make the pattern easier to understand.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 Pattern Construction based on </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> Chinese Translation </SectionTitle> <Paragraph position="0"> In this section, we elaborate on the method of semi-automatic construction of Korean-Chinese verb patterns. Our method is inspired by that of Fujita &amp; Bond (2002) and, like it, makes the most of existing resources.</Paragraph> <Paragraph position="1"> The existing resources in this case are verb patterns that have already been built manually.</Paragraph> <Paragraph position="2"> As every Korean verb pattern is provided with the corresponding Chinese translation, Korean verb patterns can be re-sorted by their Chinese translations. The basic assumption of this approach is that verbs with similar meanings tend to have similar case frames, as pointed out in Levin (1993). The Chinese translation can be employed as an indicator of the similarity of meaning among Korean verbs. 
If two verbs share a Chinese translation, they are likely to have similar meanings. The patterns that have translation equivalents are seed patterns for automatic pattern generation.</Paragraph> <Paragraph position="3"> Our semi-automatic verb pattern generation method consists of the following four steps: Step 1: Re-sort the existing Korean-Chinese verb patterns according to Chinese verbs [...] exists in the verb pattern dictionary, it is discarded.</Paragraph> <Paragraph position="4"> The three conditions to be met in the third step are filters that prevent the over-generation of patterns. The following examples show why the first condition, i.e., &quot;the voice of the verbs in question must agree&quot;, must be met.</Paragraph> <Paragraph position="6"> ttuta: A leaf is floating on the watera42 ttiwuta : A=HUMAN!ka B=PLACE!ey C=PLANT!lul ttiwu!ta > A a43 C a32 :v a37 B a40 [ai(A)ka mwulwi(B)ey namwutip(C)ul ttiwuta: A baby floated a leaf on the water]</Paragraph> <Paragraph position="8"> (A)un yak(B)ul hambwulo sayonghanta: Koreans are misusing the drug] As we re-sort the existing patterns according to the Chinese verbs, which are marked with &quot;:v&quot;, verbs of different voices may be gathered together. However, as the above examples show, voice (active vs. causative in (2), passive vs. active in (3)) affects the argument structure of verbs. We conclude that generating patterns without considering voice information can lead to the over-generation of patterns. The voice information of verbs can be obtained from the linking information between the verb pattern dictionary and the word dictionary. We will not look into the details of this linking relation in the TELLUS K-C system in this paper; cf. Hong et al. (2002). The second condition relates to the lexical patterns of Korean. Lexical patterns are used for collocational expressions. 
As the nature of collocation implies, a predicate that shows a strict co-occurrence relation with a certain nominal argument cannot be arbitrarily combined with other nouns.</Paragraph> <Paragraph position="9"> The third condition deals with the support verb construction of Chinese. The four verbs t ,, Efl > . 0 are among the major verbs in Chinese that form support verb constructions with predicative nouns. In a support verb construction, the argument structure of the sentence is determined not by the verb but by a predicative noun. Because of this, a shared Chinese translation cannot be taken as an indication that the Korean verbs have similar meanings, as the following examples show: wuntonghanta: He is exercising in the gymnasium] Although the Korean verbs &quot;ttallangkelita (to ring)&quot;, &quot;ssawuta (to fight)&quot;, and &quot;wuntonghata (to exercise)&quot; share the Chinese verb &quot;a58 &quot;, the argument structure of each Chinese translation is determined by the predicative nouns that are syntactically objects of the verbs.</Paragraph> </Section> </Section> </Paper>