File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/03/w03-1721_intro.xml
Size: 766 bytes
Last Modified: 2025-10-06 14:02:08
<?xml version="1.0" standalone="yes"?> <Paper uid="W03-1721"> <Title>Chinese Word Segmentation Using Minimal Linguistic Knowledge</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> At the first Chinese word segmentation bakeoff, we participated in the closed track using the Academia Sinica corpus (AS for short) and the Beijing University corpus (PK for short). We will refer to the segmented texts in the training corpus as the training data, and to both the unsegmented testing texts and the segmented texts (the reference texts) as the testing data. For details on the word segmentation bakeoff, see (Sproat and Emerson, 2003).</Paragraph> </Section> </Paper>