File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/05/h05-1026_abstr.xml

Size: 1,472 bytes

Last Modified: 2025-10-06 13:44:12

<?xml version="1.0" standalone="yes"?>
<Paper uid="H05-1026">
  <Title>Training Neural Network Language Models On Very Large Corpora</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> During the last years there has been growing interest in using neural networks for language modeling. In contrast to the well known back-off n-gram language models, the neural network approach attempts to overcome the data sparseness problem by performing the estimation in a continuous space. This type of language model was mostly used for tasks for which only a very limited amount of in-domain training data is available.</Paragraph>
    <Paragraph position="1"> In this paper we present new algorithms to train a neural network language model on very large text corpora. This makes possible the use of the approach in domains where several hundreds of millions words of texts are available. The neural network language model is evaluated in a state-of-the-art real-time continuous speech recognizer for French Broadcast News. Word error reductions of 0.5% absolute are reported using only a very limited amount of additional processing time.</Paragraph>
  </Section>
class="xml-element"></Paper>