<?xml version="1.0" standalone="yes"?>
<Paper uid="W98-1227">
  <Title>A Method of Incorporating Bigram Constraints into an LR Table and Its Effectiveness in Natural Language Processing</Title>
  <Section position="6" start_page="14" end_page="14" type="evalu">
    <SectionTitle>
5 Evaluation of Perplexity
</SectionTitle>
    <Paragraph position="0"> Perplexity is a measure of the constraint imposed by the language model. Test-set perplexity (Jelinek, 1990) is commonly used to measure the perplexity of a language model from a test-set. Test-set perplexity for a language model L is simply the geometric mean of probabilities defined by: where</Paragraph>
    <Paragraph position="2"> Here N is the number of terminal symbols in the test set, M is the number of test sentences and P(S,) is the probability of generating i-th test sentence Si.</Paragraph>
    <Paragraph position="3"> In the case of the bigram model, P~i(Si) is:</Paragraph>
    <Paragraph position="5"> And in the case of the trigram model, P~(Si) is:</Paragraph>
    <Paragraph position="7"> Table 5 shows the test-set perplexity of pretermirials for each language model. Here the preterminal bigram models were trained on a corpus with 20663 sentences, containing 230927 preterminals. The test-set consists of 1320 sentences, which contain 13311 preterminals. The CFG used is a phrase context-free grammar used in speech recognition tasks, and the number of rules and preterminals is 777 and 407, respectively.</Paragraph>
    <Paragraph position="8"> As is evident from Table 5, the use of a bigram LR table decreases the test-set perplexity from 6.50 to 5.99. Note that in this experiment, we used the LALR table generation algorithm 2 to construct the bigram LR table. Despite the disadvantages of 2In the case of LALR tables, the sum of the probabihties of all the possible parsing trees generated by a given CFG may be less than 1 (Inui et al., 1997).</Paragraph>
    <Paragraph position="9"> mai and Tanaka 231 A Method of Incorporating Bigram Constraints LALR tables, the bigram LR table has better performance than the simple bigram language model, showing the effectiveness of a bigram LR table.</Paragraph>
    <Paragraph position="10"> On the other hand, the perplexity of the trigram language model is smaller than that of the bigram LR table. However, with regard to data sparseness, the bigram LR table is better than the trigram language model because bigram constraints are more easily acquired from a given corpus than trigram constraints.</Paragraph>
    <Paragraph position="11"> Although the experiment described above is concerned with natural language processing, our method is also applicable to speech recognition.</Paragraph>
  </Section>
class="xml-element"></Paper>