<?xml version="1.0" standalone="yes"?>
<Paper uid="C94-2117">
  <Title>AN EMPIRICAL STUDY ON THE GENERATION OF ZERO ANAPHORS IN CHINESE</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
AN EMPIRICAL STUDY ON THE GENERATION OF ZERO ANAPHORS IN CHINESE
</SectionTitle>
    <Paragraph position="0"/>
  </Section>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> In this paper, we describe how rules for generating Chinese zero anaphors were developed through a sequence of experiments, each refining the previous one. In the experiments, we compared the zero anaphors occurring in a real text with those generated by algorithms employing the rules, assuming the same semantic and discourse structures as the text. The rules take into account locality, syntactic constraints, discourse structure and the salience of objects. The results show that 93% of the zero anaphors in the text can be correctly generated by an algorithm using a rule that involves all of these factors.</Paragraph>
  </Section>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Anaphoric expressions in Chinese can be classified as zero, pronominal and nominal forms, as exemplified in (1) by ∅_i, ta_i (he) and nage ren_j (that person), respectively. (1) a. Zhangsan_i jinghuang de paokai, [Zhangsan frightened NOM to run-away] 'Zhangsan was frightened and ran away.'</Paragraph>
    <Paragraph position="1"> b. ∅_i zhuangdau yige dahan_j, [(he) bump-to a big-man] '(He) ran into a big man.'</Paragraph>
    <Paragraph position="2"> c. ta_i kanqing le na ren_j de zhangxiang, [he see-clear ASP that man GEN appearance] 'He watched clearly that man's appearance.'</Paragraph>
    <Paragraph position="3"> d. ∅_i renchu na ren_j shi shei. [(he) recognize that man is who] '(He) recognized who that man is.'</Paragraph>
    <Paragraph position="4"> In [Li and Thompson 79], Li and Thompson show that zero anaphors in Chinese can occur in any grammatical slot, with an antecedent that may itself occur in any grammatical slot, regardless of the distance between them. Although no clear rule accounts for zero anaphora, Li and Thompson point out that it commonly occurs in a &quot;topic chain,&quot; where a referent is referred to in the first clause and several more clauses follow that talk about the same referent but omit it. In [Chen 87], Chen proposed the notion of &quot;continuity&quot; of a referent in discourse to give a more specific account of zero anaphora.</Paragraph>
    <Paragraph position="5"> * Also with the Department of Information Engineering at Tatung Institute of Technology, Taipei, Taiwan. Email address is chingyeh@aisb.ed.ac.uk.</Paragraph>
    <Paragraph position="6"> † Email address is chrism@aisb.ed.ac.uk.</Paragraph>
    <Paragraph position="7"> In this paper, we aim to decide when to generate zero anaphors from an internal semantic structure. Although previous linguistic work states no clear rules, we can nevertheless summarize a very simple rule, Rule 1 below, for the generation of zero anaphors.</Paragraph>
    <Paragraph position="8"> Rule 1: If an entity, e, in the current utterance was referred to in the immediately preceding utterance, then a zero anaphor is used for e; otherwise a non-zero anaphor is used.</Paragraph>
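    <Paragraph> Rule 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name rule1_forms, the representation of each utterance as a list of entity identifiers, and the labels "zero" and "non-zero" are all assumptions made for the example.

```python
from typing import Dict, List, Set


def rule1_forms(utterances: List[List[str]]) -> List[Dict[str, str]]:
    """Apply Rule 1 to a sequence of utterances.

    Each utterance is a list of hypothetical entity identifiers.
    Returns, per utterance, a mapping from entity to "zero" or "non-zero".
    """
    decisions: List[Dict[str, str]] = []
    previous: Set[str] = set()  # entities referred to in the immediately preceding utterance
    for entities in utterances:
        decisions.append(
            {e: ("zero" if e in previous else "non-zero") for e in entities}
        )
        previous = set(entities)  # only the immediately preceding utterance counts
    return decisions


# Toy encoding of example (1): clauses a to d, entity names are made up.
example = [
    ["zhangsan"],
    ["zhangsan", "big_man"],
    ["zhangsan", "big_man"],
    ["zhangsan", "big_man"],
]
for clause, forms in zip("abcd", rule1_forms(example)):
    print(clause, forms)
```

    On this toy encoding of example (1), the sketch assigns zero forms to both referents in clauses c and d, whereas the text itself uses ta and na ren in clause c.</Paragraph>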
    <Paragraph position="9"> We performed an experiment comparing the zero anaphors generated by an algorithm employing this rule with those occurring in real text to see how well the rule works. The initial result showed that zero anaphors were over-generated to a large extent in the text produced by employing Rule 1. Consequently, we considered other well-known factors, namely syntactic constraints [Li and Thompson 81], discourse structure [Grosz and Sidner 86] and the salience of objects in utterances [Sidner 83], to get better results.</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="732" type="metho">
    <SectionTitle>
2 Experiment
</SectionTitle>
    <Paragraph position="0"> A number of articles written by different authors were selected as the linguistic sources against which the text produced by the generation algorithms can be compared. For the moment, the selected articles are restricted to the exposition type, namely, ones that explain an idea or discuss a problem. Two sets of data were selected: one consists of a number of scientific questions and answers for children, and the other is a brief introduction to modern Chinese grammar.</Paragraph>
    <Paragraph position="1"> Basically the experiment was executed in three steps.</Paragraph>
    <Paragraph position="2"> First, zero anaphors within the selected articles were identified. Second, for each paragraph in the selected articles, we examined each utterance sequentially and recorded the occurrences of zero anaphors that would be obtained by applying the algorithm using a rule such as Rule 1. Third, we noted down the differences between the results of steps 1 and 2.</Paragraph>
    <Paragraph position="3"> In step 3, we categorized the results of the comparison as correct, false and missing types. If a reference created by the algorithm is the same as the one in the real text, then it belongs to the correct type.</Paragraph>
    <Paragraph position="4"> If a zero anaphor is created by the algorithm while the corresponding position in the real text is a non-zero anaphor, then it belongs to the false type. Conversely, if a zero anaphor is found in some position in the real text while a non-zero anaphor is created by the algorithm, then it belongs to the missing type. The task of step 3 is to count the number of cases in each type.</Paragraph>
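    <Paragraph> The counting in step 3 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the procedure actually used in the experiment: the function name compare_forms is hypothetical, each anaphor position is assumed to be labelled only "zero" or "non-zero", and the two input lists are assumed to be aligned position by position. The sample data are invented.

```python
from collections import Counter
from typing import List


def compare_forms(generated: List[str], actual: List[str]) -> Counter:
    """Count correct, false and missing cases over aligned anaphor positions.

    Both lists hold "zero" or "non-zero" labels (an assumption of this sketch).
    """
    if len(generated) != len(actual):
        raise ValueError("generated and actual must be aligned position by position")
    counts = Counter({"correct": 0, "false": 0, "missing": 0})
    for gen, act in zip(generated, actual):
        if gen == act:
            counts["correct"] += 1   # same form as in the real text
        elif gen == "zero":
            counts["false"] += 1     # algorithm: zero, real text: non-zero
        else:
            counts["missing"] += 1   # real text: zero, algorithm: non-zero
    return counts


# Invented labels for four anaphor positions.
generated = ["zero", "zero", "non-zero", "zero"]
actual = ["zero", "non-zero", "zero", "zero"]
print(compare_forms(generated, actual))
```

    With these invented labels the sketch reports two correct cases, one false case and one missing case.</Paragraph>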
  </Section>
</Paper>