<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-1725">
  <Title>A Unicode based Adaptive Segmentor</Title>
  <Section position="7" start_page="3" end_page="3" type="concl">
    <SectionTitle>
5 Conclusion
</SectionTitle>
    <Paragraph position="0"> This paper proposes the design and algorithms of a general-purpose, Unicode-based segmentor. It can process text in which Simplified and Traditional Chinese appear together. Sophisticated pre-processing and other auxiliary modules help segment text more accurately. Thanks to its modular design, user interactions and new modules can be added easily. A built-in new word extractor is also implemented to extract new words from running text. This saves much training time, so the segmentor can be adapted quickly to new environments.</Paragraph>
  </Section>
</Paper>