<?xml version="1.0" standalone="yes"?>
<Paper uid="H92-1093">
  <Title>Weight Estimation for N-Best Rescoring*</Title>
  <Section position="4" start_page="0" end_page="455" type="evalu">
    <SectionTitle>
3. EXPERIMENTS
</SectionTitle>
    <Paragraph position="0"> Experiments were conducted to gain a better understanding of the weight space. In our implementation of the N-Best rescoring paradigm [1], the N-Best list (N = 20) is generated by the BBN BYBLOS system [3]. This list is rescored by the BU system, which is based on the stochastic segment model (SSM) [4, 5], a statistical model for the sequence of observations that comprise a phoneme segment. The SSMs are based on independent-frame assumptions, are gender-dependent, and are context-dependent, with context tying based on automatic clustering. Results are reported on the speaker-independent Resource Management corpus using the Word-Pair grammar. The weights were trained on the Feb 89 test set and then used to combine scores for the Oct 89 test set. The training of weights may be either gender-dependent or gender-independent.</Paragraph>
    <Paragraph position="1"> Figure 1 and Figure 2 show contour plots for the word error distribution as a function of normalized HMM and SSM scores on the two test sets, keeping the phoneme and word insertion penalties fixed at typical values. The contours have been drawn for the ten lowest word errors, with intensity being inversely proportional to error. The HMM and SSM scores were normalized by the average of the respective scores for the correct sentences to better illustrate their relative weight in the combined score.</Paragraph>
    <Paragraph position="2"> Figure 1 represents the case for gender-dependent optimization, where the two test sets appear vastly different. The effect of gender-independent optimization is shown in Figure 2. Though the Oct 89 figure has fewer local optima, it must be noted that the best region for one test set still does not match that of the other. Normalizing the acoustic scores shows that the HMM is weighted higher than the SSM, but the weights are of the same order of magnitude. The word vs. phoneme count contours (not shown) suggest that typical values of the word penalty are about 3-5 times that of the phoneme penalty.</Paragraph>
    <Paragraph position="3"> Our current word error rates on the Feb 89 test set are 4.2% for the SSM and 2.8% for the combined system (HMM-SSM), using weights estimated on this test set.</Paragraph>
    <Paragraph position="4"> Using the same weights and testing on the Oct 89 test set, the results are 4.8% for the SSM and 3.3% for the combined system. Combining the SSM with the BBN HMM yields a 13% reduction in error rate over the performance of the HMM alone, which was 3.8%.</Paragraph>
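    <Paragraph position="5"> The score combination used in this section can be illustrated with a minimal sketch. It assumes that the combined score is a weighted linear sum of the HMM score, the SSM score, a word count term and a phoneme count term, and that the highest-scoring hypothesis in the N-Best list is selected. All class names, field names, function names and numeric values below are hypothetical and are not taken from the BBN or BU systems; the last function only reproduces the relative error reduction quoted above (3.8% to 3.3%).

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One entry of an N-Best list (field names are illustrative)."""
    words: list[str]
    hmm_score: float   # assumed: log score from the HMM recognizer
    ssm_score: float   # assumed: log score from the SSM rescoring pass
    n_phonemes: int    # phoneme count of the hypothesis


@dataclass
class Weights:
    """Combination weights (assumed linear form, hypothetical names)."""
    hmm: float
    ssm: float
    word_penalty: float
    phoneme_penalty: float


def combined_score(h: Hypothesis, w: Weights) -> float:
    """Weighted linear sum of the two acoustic scores and the two counts."""
    return (
        w.hmm * h.hmm_score
        + w.ssm * h.ssm_score
        + w.word_penalty * len(h.words)
        + w.phoneme_penalty * h.n_phonemes
    )


def rescore(nbest: list[Hypothesis], w: Weights) -> Hypothesis:
    """Return the hypothesis with the highest combined score."""
    return max(nbest, key=lambda h: combined_score(h, w))


def relative_error_reduction(baseline: float, new: float) -> float:
    """Fractional reduction in error rate relative to the baseline."""
    return (baseline - new) / baseline
```

With error rates of 3.8 and 3.3, relative_error_reduction returns roughly 0.13, which matches the 13% figure reported for the Oct 89 test set. In such a sketch, changing the relative size of the HMM and SSM weights can change which hypothesis is ranked first, which is the sensitivity that the contour plots are meant to expose.</Paragraph>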
  </Section>
</Paper>