<?xml version="1.0" standalone="yes"?> <Paper uid="W03-0104"> <Title>GeoName: a system for back-transliterating pinyin place names</Title> <Section position="5" start_page="3" end_page="10" type="evalu"> <SectionTitle> 5 Evaluation of GeoName </SectionTitle> <Paragraph position="0"> To evaluate the performance of GeoName, we need to run it on a set of Chinese place names written in English Pinyin and compare its output with the known Chinese characters for each name. In essence, we need another bilingual list (bi-list) for testing, independent of the List-A that we used for training. Bilingual lists are difficult to obtain. Eventually we located a bilingual map (Map of Peoples' Republic of China 2001) on which both the Chinese and English names are printed. The test set consists of 162 non-capital city names randomly selected from the map, six from each of the twenty-seven provinces, excluding Taiwan (where some names follow the Wade-Giles convention). For each Pinyin name, we recorded the rank position of the correct Chinese name in the GeoName output if it fell within the top ten; otherwise the name was counted as a failure. We tested four settings of the tag values: 000 (frequency prediction only), 001 (frequency and web confirmation), 010 (frequency and monolingual list confirmation), and 111 (full function). Table 1 tabulates the number of correct names found at each rank position.</Paragraph> <Paragraph position="1"> The result with tag=000 (using frequency only for candidate suggestion) shows that the top-ranked candidate is correct for 78 of the 162 names (48%), and that the correct name appears within the top 10 for 133 names (82%). The runs with tag=001 (adding WWW confirmation) and tag=010 (adding monolingual List-A ∪ List-B confirmation) both improve on the tag=000 results, especially at rank 1, raising the rank-1 percentage to 59% and 54% respectively. Web confirmation is expensive in processing time, and its results may vary with the state of the Web.
Monolingual list confirmation is useful, especially when one has a list that is more region-specific.</Paragraph> </Section> </Paper>