<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-2148">
  <Title>A Stochastic Language Model using Dependency and Its Improvement by Word Clustering</Title>
  <Section position="6" start_page="901" end_page="902" type="evalu">
    <SectionTitle>
5 Evaluation
</SectionTitle>
    <Paragraph position="0"> We constructed the POS-based dependency model and the class-based dependency model to evaluate their predictive power. In addition, we implemented parsers based on them which calculate the best syntactic tree from a given sequence of bun~etsu to observe their accuracy. In this section, we present the experimental results and discuss them.</Paragraph>
    <Section position="1" start_page="901" end_page="901" type="sub_section">
      <SectionTitle>
5.1 Conditions on the Experiments
</SectionTitle>
      <Paragraph position="0"> As a syntactically annotated corpus we used EDR corpus (Jap, 1993). The corpus was divided into ten parts and the models estimated from nine of them were tested on the rest in terms of cross entropy (see Table 1). The number of characters in the Japanese writing system is set to 6,879. Two parameters which have not been determined yet in the explanation of the models (dmaz and v,naz) axe both set to 1. Although the best value for each of them can also be estimated using the average cross entropy, they are fixed through the experiments.</Paragraph>
    </Section>
    <Section position="2" start_page="901" end_page="902" type="sub_section">
      <SectionTitle>
5.2 Evaluation of Predictive Power
</SectionTitle>
      <Paragraph position="0"> For the purpose of evaluating the predictive power of the models, we calculated their cross entropy on the test corpus. In this process the annotated tree is used as the structure of the sentences in the test corpus. Therefore the probability of each sentence in the test corpus is not the summation over all its possible derivations. In order to compare the POS-based dependency model and the class-based dependency model, we constructed these models from the same learning corpus and calculated their cross entropy on the same test corpus. They are both interpolated with the SCFG with uniform distribution.</Paragraph>
      <Paragraph position="1"> The processes for their construction are as follows:  * POS-based dependency model 1. estimate the interpolation coefficients in Formula (4) by the deleted interpolation method 2. count the frequency of each rewriting rule on the whole learning corpus * class-based dependency model 1. estimate the interpolation coefficients in Formula (4) by the deleted interpolation method 2. calculate an optimal word-class relation by the method proposed in Section 3.</Paragraph>
      <Paragraph position="2"> 3. count the frequency of each rewriting rule  on the whole learning corpus The word-based 2-gram model for bunsetsu generation and the character-based 2-gram model as an unknown word model (Mori and Yamaji, 1997) are common to the POS-based model and class-based model. Their contribution to the cross entropy is constant on the condition that the dependency models contain the prediction of the last word of the content word sequence and that of the function word sequence.</Paragraph>
      <Paragraph position="3"> Table 2 shows the cross entropy of each model on the test corpus. The cross entropy of the class-based dependency model is lower than that of the POS-based dependency model. This result attests experimentally that the class-based model estimated by our clustering method is more predictive than the POS-based model and that our word clustering  language model cross entropy accuracy POS-based model 5.3536 68.77% class-based model 4.9944 81.96% select always 53.10% the next bunsetsu method is efficient at improvement of a dependency model.</Paragraph>
      <Paragraph position="4"> We also calculated the cross entropy of the class-based model which we estimated with a word 2-gram model as the model M in the Formula (5). The number of terminals and non-terminals is 1,148,916 and the cross entropy is 6.3358, which is much higher than that of the POS-base model. This result indicates that the best word-class relation for the dependency model is quite different from the best word-class relation for the n-gram model. Comparing the number of the terminals and non-terminals, the best word-class relation for n-gram model is exceedingly specialized for a dependency model. We can conclude that word-class relation depends on the language model.</Paragraph>
    </Section>
    <Section position="3" start_page="902" end_page="902" type="sub_section">
      <SectionTitle>
5.3 Evaluation of Syntactic Analysis
</SectionTitle>
      <Paragraph position="0"> SVe implemented a parser based on the dependency models. Since our models, equipped with a word-based 2-graan model for bunsetsu generation and the character-based 2-gram as an unknown word model, can return the probability for amy input, we can build a parser, based on our model, receiving a character sequence as input. Its evaluation is not easy, however, because errors may occur in bunsetsu generation or in POS estimation of unknown words. For this reason, in the following description, we assume a bunsetsu sequence as the input.</Paragraph>
      <Paragraph position="1"> The criterion we adopted is the accuracy of dependency relation, but the last bunsetsu, which has no bunsetsu to depend on, and the second-to-last bunsetsu, which depends always on the last bunsetsu, are excluded from consideration.</Paragraph>
      <Paragraph position="2"> Table 3 shows cross entropy and parsing accuracy of the POS-based dependency model and the class-based dependency model. This result tells us our word clustering method increases parsing accuracy considerably. This is quite natural in the light of the decrease of cross entropy.</Paragraph>
      <Paragraph position="3"> The relation between the learning corpus size and cross entropy or parsing accuracy is shown in Figure 3. The lower bound of cross entropy is the entropy of Japanese, which is estimated to be 4.3033 bit (Mori and Yamaji, 1997). Taking this fact into consideration, the cross entropy of both of the models has stronger tendency to decrease. As for ac- null curacy, there also is a tendency to get more accurate as the learning corpus size increases, but it is a strong tendency for the class-based model than for the POS-based model. It follows that the class-based model profits more greatly from an increase of the learning corpus size.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>