<?xml version="1.0" standalone="yes"?>
<Paper uid="W00-1217">
  <Title>How Should a Large Corpus Be Built?-A Comparative Study of Closure in Annotated Newspaper Corpora from Two Chinese Sources, Towards Building A Larger Representative Corpus Merged from Representative Sublanguage Collections</Title>
  <Section position="3" start_page="116" end_page="116" type="intro">
    <SectionTitle>
2 Overview
</SectionTitle>
    <Paragraph position="0"> This work applies the methodology of McEnery and Wilson to examine closure rates in a comparative study of all available tagged Chinese newspaper corpora. First, I define lexical and syntactic closure for this study in section 3.</Paragraph>
    <Paragraph position="1"> Then, section 4 begins this study with an examination of the newspaper texts of the Academia Sinica Balanced Corpus (ASBC). Section 5 extends this study to the newspaper texts of the UPenn Chinese Treebank (CTB). Section 6 presents my findings, and section 7 discusses some implications for future corpus building.</Paragraph>
  </Section>
</Paper>