<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-0131">
  <Title>Sydney, July 2006. (c) 2006 Association for Computational Linguistics POC-NLW Template for Chinese Word Segmentation</Title>
  <Section position="2" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> In this paper, a language tagging template named POC-NLW (position of a character within an n-length word) is presented. Based on this template, a two-stage statistical model for Chinese word segmentation is constructed. In this method, basic word segmentation is performed with an n-gram language model, and a Hidden Markov tagger based on the POC-NLW template is used to identify out-of-vocabulary (OOV) words. The system participated in the MSRA_Close and UPUC_Close word segmentation tracks at SIGHAN Bakeoff 2006. The results returned by the bakeoff are reported here.</Paragraph>
  </Section>
</Paper>