<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-2108">
  <Title>Using Word Support Model to Improve Chinese Input System</Title>
  <Section position="6" start_page="846" end_page="846" type="concl">
    <SectionTitle>
4 Conclusions and Future Directions
</SectionTitle>
    <Paragraph position="0"> In this paper, we present a word support model (WSM) that improves the WP identifier (Tsai, 2005) and supports Chinese language processing on the STW conversion problem. All of the WP data can be generated fully automatically by applying AUTO-WP to a given corpus. We are encouraged that the WSM with WP knowledge achieves state-of-the-art tonal and toneless STW accuracies of 99% and 92%, respectively, for the identified poly-syllabic words. The WSM can be easily integrated into existing Chinese input systems as a post-processing step that identifies words. Our experimental results show that, when the WSM is applied as an adaptation step together with the MSIME (a trigram-like model) and the BiGram (an optimized bigram model), the average tonal and toneless STW improvements of the two Chinese input systems are 37% and 35%, respectively.</Paragraph>
    <Paragraph position="1"> Currently, our WSM with the mixed WP database, which combines the UDN2001 and AS WP databases, achieves identified character ratios of more than 98% for poly-syllabic words in tonal and toneless STW conversions on the UDN2001 and AS corpora. Although there is room for improvement, we believe that further improvement would not have a noticeable effect on the STW accuracy of poly-syllabic words.</Paragraph>
    <Paragraph position="2"> We will continue to improve our WSM to cover more characters of the UDN2001 and AS corpora by using word-pairs that contain at least one mono-syllabic word, such as &quot;Wo Men (we)-Shi (are)&quot;. In addition, we will extend it to other Chinese NLP research topics, especially word segmentation, main verb identification and Subject-Verb-Object (SVO) autoconstruction.</Paragraph>
  </Section>
</Paper>