<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-1711">
  <Title>A Chinese Efficient Analyser Integrating Word Segmentation, Part-Of-Speech Tagging, Partial Parsing and Full Parsing</Title>
  <Section position="2" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> This paper introduces an efficient analyser for the Chinese language that integrates word segmentation, part-of-speech tagging, partial parsing and full parsing. The analyser is based on a Hidden Markov Model (HMM) and an HMM-based tagger; that is, all the components share the same HMM-based tagging engine. One advantage of using a single engine is that it greatly reduces the code size and eases maintenance. Another advantage is that the code is easy to optimise for speed, which plays a critically important role in many applications. Finally, the performance of every component benefits when the algorithms in the single engine are optimised or replaced with better ones. Experiments show that all the components achieve state-of-the-art performance with high efficiency for the Chinese language.</Paragraph>
    <Paragraph position="1"> The layout of this paper is as follows. Section 2 describes the Chinese efficient analyser. Section 3 presents the HMM and the HMM-based tagger.</Paragraph>
    <Paragraph position="2"> Sections 4 and 5 describe the applications of the HMM-based tagger in integrated word segmentation and part-of-speech tagging, partial parsing, and full parsing respectively. Section 6 gives the experimental results. Finally, Section 7 draws some conclusions and outlines possible extensions for future work.</Paragraph>
  </Section>
</Paper>