<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-1806">
  <Title>PCFG Parsing for Restricted Classical Chinese Texts</Title>
  <Section position="5" start_page="10" end_page="10" type="concl">
    <SectionTitle>
Accuracies
</SectionTitle>
    <Paragraph position="0"> Figure 3 shows the distribution of sentences and the parsing accuracies for different sentence lengths. In terms of distribution, 4-word, 5-word, and 6-word sentences constitute the majority of the corpus, while 1-word and 2-word sentences are very few. In terms of accuracy, the parser is more effective on shorter sentences than on longer ones.</Paragraph>
    <Paragraph position="1"> For 1-word and 2-word sentences, the parse results contain no errors.</Paragraph>
    <Paragraph position="2"> Conclusion. Computer processing of Classical Chinese has only just begun. While Classical Chinese is generally considered too difficult to process, our previous work on part-of-speech tagging has been largely successful because there is almost no need to segment Classical Chinese words. We continue to use the same tagset and corpus in this work. We first apply the forward-backward algorithm to obtain the context-dependent probabilities. We then present the PCFG model, in which we restrict the rules to binary and unary rules, which greatly simplifies the implementation of the parser. Based on this model, we developed a CFG rule set for Classical Chinese and studied some of its special features. Classical Chinese processing is generally considered too difficult and has thus been neglected, but our work shows that with good modelling and proper techniques we can still obtain encouraging results. Although Classical Chinese is currently a dead language, our work still has applications in areas such as Classical-Modern Chinese translation.</Paragraph>
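The following is a minimal sketch of why restricting a PCFG to binary and unary rules keeps the parser simple: a Viterbi CKY chart only ever combines two adjacent spans or rewrites a single symbol. It is not the authors' implementation. The toy grammar, the tag names (n, v), the rule probabilities, and the function names close_unary and viterbi_cky are all assumptions made for illustration. The per-word tag probabilities stand in for the context-dependent probabilities that the paper obtains from the forward-backward algorithm.

```python
import math

# Hypothetical toy grammar (NOT the paper's rule set); probabilities are made up.
BINARY_RULES = {            # (parent, left, right): probability
    ("S", "NP", "VP"): 1.0,
    ("VP", "v", "NP"): 0.7,
}
UNARY_RULES = {             # (parent, child): probability
    ("NP", "n"): 1.0,
    ("VP", "v"): 0.3,
}


def close_unary(cell):
    """Add unary-rule parents to a chart cell until no score improves."""
    changed = True
    while changed:
        changed = False
        for (parent, child), prob in UNARY_RULES.items():
            if child in cell:
                score = cell[child][0] + math.log(prob)
                if parent not in cell or score > cell[parent][0]:
                    cell[parent] = (score, (parent, cell[child][1]))
                    changed = True


def viterbi_cky(tag_probs, root="S"):
    """Return (log_prob, tree) for the best parse, or None if there is none.

    tag_probs holds one dict per word, mapping a POS tag to its probability
    (assumed here to come from a separate tagging step).
    """
    n = len(tag_probs)
    chart = {}
    # Width-1 spans: seed each cell with the word's tags, then apply unary rules.
    for i, probs in enumerate(tag_probs):
        cell = {tag: (math.log(p), tag) for tag, p in probs.items() if p > 0}
        close_unary(cell)
        chart[i, i + 1] = cell
    # Wider spans: try every split point and every binary rule.
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            cell = {}
            for mid in range(start + 1, end):
                left_cell, right_cell = chart[start, mid], chart[mid, end]
                for (parent, left, right), prob in BINARY_RULES.items():
                    if left in left_cell and right in right_cell:
                        score = (math.log(prob)
                                 + left_cell[left][0] + right_cell[right][0])
                        if parent not in cell or score > cell[parent][0]:
                            tree = (parent, left_cell[left][1], right_cell[right][1])
                            cell[parent] = (score, tree)
            close_unary(cell)
            chart[start, end] = cell
    return chart[0, n].get(root) if n > 0 else None
```

Calling viterbi_cky on a three-word input tagged n, v, n (each with probability 1.0) returns the best-scoring tree rooted in S under this toy grammar, and returns None when no parse exists. Scores are kept as log-probabilities to avoid underflow on longer sentences.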
    <Paragraph position="3"> In future work, we expect to incorporate a trigram model into the forward-backward algorithm, which should increase the tagging accuracy. Most importantly, the state-of-the-art PCFG model is still two-level; we expect to devise a three-level model, analogous to the step from a bigram to a trigram model.</Paragraph>
  </Section>
</Paper>