<?xml version="1.0" standalone="yes"?> <Paper uid="W04-2211"> <Title>Speech/Language Technology Research, ETRI</Title> <Section position="6" start_page="0" end_page="0" type="evalu"> <SectionTitle> 5 Evaluation </SectionTitle> <Paragraph position="0"> The 114,581 verb patterns that we constructed over three years were used as seed patterns for the semi-automatic generation of patterns. After steps 1 and 2 of the generation process were finished, the sets of possibly synonymous verbs were constructed. To filter out wrong synonym sets, all of the sets were examined by two lexicographers, which took them one week. The wrong synonym sets were produced mainly because of the homonymy of Chinese verbs.</Paragraph> <Paragraph position="1"> From the original 114,581 patterns, we generated 235,975 patterns. We performed two evaluations with the generated patterns. The first evaluation measured how many of the generated patterns were correct.</Paragraph> <Paragraph position="2"> The second evaluation measured how much the pattern matching ratio improved as a result of the increased number of patterns.</Paragraph> <Paragraph position="3"> Evaluation 1 In the first evaluation, we randomly selected 3,086 patterns that were generated from 30 Chinese verbs. Expert Korean-Chinese lexicographers examined the generated patterns. Of the 3,086 patterns, 2,180 were correct.</Paragraph> <Paragraph position="4"> The accuracy of the semi-automatic generation was 70.65%. 
Although the evaluation set was relatively small, the accuracy is promising, considering that additional filtering factors remain to be taken into account.</Paragraph> <Paragraph position="5"> The majority of the erroneous patterns fall into the following two error types: (1) The verbs share similar meanings and selectional restrictions on their arguments.</Paragraph> <Paragraph position="6"> However, they select different case markers for the argument positions (the most prominent error).</Paragraph> <Paragraph position="7"> Ex) ~eykey masseta / ~wa taykyelhata (to face somebody) (2) The verbs share similar meanings, but their selectional restrictions differ.</Paragraph> <Paragraph position="8"> Evaluation 2 In the second evaluation, we measured how much the pattern matching ratio improves with the increased number of patterns compared with the original pattern DB. For the evaluation, 300 sentences were randomly extracted from various Korean newspapers. The test sentences covered politics, economics, science and sports. The 300 sentences contained 663 predicates. With the original verb pattern DB, i.e. with 114,581 patterns, the perfect pattern matching ratio was 59.21%, whereas the perfect matching ratio rose to 64.40% with the generated pattern</Paragraph> </Section> </Paper>