<?xml version="1.0" standalone="yes"?> <Paper uid="W02-1117"> <Title>A Character-net Based Chinese Text Segmentation Method</Title> <Section position="2" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> The segmentation of Chinese texts is a key process in Chinese information processing.</Paragraph> <Paragraph position="1"> The main difficulties in segmentation are the handling of ambiguous character strings and of unknown Chinese words. To obtain a correct result, the first step is to identify all possible candidate Chinese words in a text. In this paper, a data structure, the Chinese-character-net, is put forward; based on this character-net, a new algorithm is then presented that obtains all possible candidate Chinese words in a text. The paper reports the experimental results. Finally, the characteristics of the algorithm are analysed.</Paragraph> <Paragraph position="2"> Keywords: segmentation, connection, character-net, ambiguity, unknown words.</Paragraph> </Section> </Paper>