<?xml version="1.0" standalone="yes"?>
<Paper uid="I05-3033">
  <Title>Towards a Hybrid Model for Chinese Word Segmentation</Title>
  <Section position="5" start_page="190" end_page="191" type="concl">
    <SectionTitle>
4 Conclusions
</SectionTitle>
    <Paragraph position="0"> We described a hybrid Chinese word segmenter that combines a transformation-based learning algorithm for character-based tagging with linguistic heuristics for transforming tagged character sequences into word-segmented sentences.</Paragraph>
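The second stage described above, turning a tagged character sequence into words, can be illustrated with a minimal Python sketch. The four-tag position scheme (B, M, E, S), the function name merge_tags, and the example sentence are assumptions made for illustration only; this section does not specify the segmenter's actual tagset, and its linguistic heuristics are not reproduced here.

```python
def merge_tags(chars, tags):
    """Merge position-tagged characters into words.

    Assumed (hypothetical) tagset, not taken from the paper:
    B = word-initial, M = word-medial, E = word-final,
    S = single-character word.
    """
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")
    words = []
    current = ""
    for ch, tag in zip(chars, tags):
        if tag in ("B", "S") and current:
            # A new word starts here, so close any unfinished word first.
            words.append(current)
            current = ""
        current += ch
        if tag in ("E", "S"):
            # This character ends a word.
            words.append(current)
            current = ""
    if current:
        # Flush a trailing word whose final tag was not E or S.
        words.append(current)
    return words


print(merge_tags(list("我喜欢北京"), ["S", "B", "E", "B", "E"]))
```

Closing an unfinished word whenever a new B or S tag appears is one simple way to tolerate inconsistent tagger output; a heuristic merging stage such as the one the paper describes would presumably make that decision with more linguistic knowledge.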
    <Paragraph position="1"> As the segmenter is in its first stage of development and is far from mature, the bakeoff provided an especially valuable opportunity for evaluating its performance. The results suggest that: 1. Despite the lack of a separate mechanism for unknown word recognition, the segmenter performed relatively well on OOV words. This confirms our hypothesis that character-based tagging has good potential for improving Chinese unknown word identification.</Paragraph>
    <Paragraph position="2"> 2. Using linguistic heuristics at the merging stage can help improve segmentation results.</Paragraph>
    <Paragraph position="3"> 3. There is much room for improvement in both the tagging algorithm and the merging algorithm. This work is under way.</Paragraph>
  </Section>
</Paper>