<?xml version="1.0" standalone="yes"?>
<Paper uid="H93-1009">
  <Title>A Bilingual VOYAGER System</Title>
  <Section position="6" start_page="0" end_page="0" type="evalu">
    <SectionTitle>
EVALUATION
</SectionTitle>
    <Paragraph position="0"> For the Japanese VOYAGER system, we defined a vocabulary of 495 words consisting of the words in the training set and words obtained by translating 2000 sentences from the English VOYAGER training corpus. This vocabulary covered 99% of the words in the test set (96% of the unique words). The category bigram was also trained on the training data and had perplexities of 25.9 and 27.5 on the training and test sets, respectively. First-choice word and sentence error rates on the test set were 14.9% and 53.3%, respectively.</Paragraph>
    <Paragraph position="1"> The parser covers 82% of the training data and 65% of the test data. An inspection of the answers generated by the system from text input showed that 60% of the responses for the test set were correct. Performance dropped by 8 percentage points, to 52%, when the input was spoken rather than typed (N = 10 for the N-best interface). Note that the system's understanding rate actually exceeds its sentence recognition accuracy (46.7%, given the 53.3% sentence error rate) by about 5 percentage points, which suggests that a full transcription is not always necessary for understanding. Finally, this performance is similar to that initially reported for our English system when using context-independent phone models with a word-pair grammar of similar perplexity (22) [1].</Paragraph>
  </Section>
</Paper>