File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/91/j91-3004_abstr.xml
Size: 2,571 bytes
Last Modified: 2025-10-06 13:47:16
<?xml version="1.0" standalone="yes"?>
<Paper uid="J91-3004">
<Title>Computation of the Probability of Initial Substring Generation by Stochastic Context-Free Grammars</Title>
<Section position="2" start_page="0" end_page="0" type="abstr">
<SectionTitle> 1. Introduction </SectionTitle>
<Paragraph position="0"> The purpose of this article is to develop an algorithm for computing the probability that a stochastic context-free grammar (SCFG) (that is, a grammar whose production rules have attached to them a probability of being used) generates an arbitrary initial substring of terminals. Thus, we treat the same problem recently considered by Wright and Wrigley (1989) from the point of view of LR grammars.</Paragraph>
<Paragraph position="1"> Probabilistic methods have been shown most effective in automatic speech recognition. Recognition (actually transcription) of natural unrestricted speech requires a "language model" that attaches probabilities to the production of all possible strings of words (Bahl et al. 1983). Consequently, if we believe that word generation can be modeled by context-free grammars, and if we want to base speech recognition (or handwriting recognition, optical character recognition, etc.) on such models, then it will become necessary to embed them into a probabilistic framework.</Paragraph>
<Paragraph position="2"> In speech recognition we are presented with words one at a time, in sequence, and so we would like to calculate the probability P(s =>* w_1 w_2 ... w_k ...) that an arbitrary string w_1 w_2 ... w_k is the initial substring of a sentence generated by the given SCFG. In fact, in speech recognition (Bahl et al. 1983) we are presented with a hypothesized past text (the history) w_1 w_2 ... w_k and are interested in computing, for any arbitrary word v, the conditional probability P(w_{k+1} = v | w_1 w_2 ... w_k) that the next word uttered will be v given the hypothesized past w_1 w_2 ... w_k. Assuming that successive sentences s are independent of each other (a rather dubious assumption justifiable only by a lack of adequate understanding of how one sentence influences another), we may as well take the view that w_1 is the first word of the current sentence and that w_k is not the last. Then</Paragraph>
<Paragraph position="3"> P(w_{k+1} = v | w_1 w_2 ... w_k) = P(s =>* w_1 ... w_k v ...) / P(s =>* w_1 ... w_k ...).</Paragraph>
<Paragraph position="4"> Hence our interest in the calculation of P(s =>* w_1 w_2 ... w_k ...).</Paragraph>
</Section>
</Paper>
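The relation above reduces next-word prediction to two prefix-probability computations. The following is a minimal sketch of that idea, not the article's algorithm: it approximates P(S =>* w_1 ... w_k ...) for a toy SCFG by brute-force enumeration of complete derivations, then forms the conditional next-word probability of the footnote as a ratio of prefix probabilities. The grammar, the symbol names, and the helper functions `sentences` and `prefix_prob` are illustrative assumptions only.

```python
# Toy SCFG (illustrative, not from the paper): each left-hand side maps to
# (right-hand side, probability) pairs whose probabilities sum to 1.
# Lowercase symbols are terminals.
GRAMMAR = {
    "S":  [(("NP", "VP"), 1.0)],
    "NP": [(("det", "n"), 0.7), (("n",), 0.3)],
    "VP": [(("v", "NP"), 0.6), (("v",), 0.4)],
}

def sentences(symbols=("S",), prob=1.0, max_len=6):
    """Yield (sentence, probability) for every complete leftmost derivation
    whose sentential form never exceeds max_len symbols (brute force)."""
    for i, sym in enumerate(symbols):
        if sym in GRAMMAR:                      # leftmost nonterminal found
            for rhs, p in GRAMMAR[sym]:
                expanded = symbols[:i] + rhs + symbols[i + 1:]
                if len(expanded) <= max_len:    # crude truncation of the search
                    yield from sentences(expanded, prob * p, max_len)
            return
    yield symbols, prob                         # all terminals: a full sentence

def prefix_prob(prefix, max_len=6):
    """Approximate P(S =>* prefix ...): total probability of generated sentences
    that begin with `prefix`. Truncation makes this a lower bound in general;
    the article's algorithm computes the exact value without enumeration."""
    prefix = tuple(prefix)
    return sum(p for sent, p in sentences(max_len=max_len)
               if sent[:len(prefix)] == prefix)

if __name__ == "__main__":
    history = ["det", "n", "v"]                 # hypothesized past text w_1 ... w_k
    for v in ["det", "n"]:
        num = prefix_prob(history + [v])        # P(S =>* w_1 ... w_k v ...)
        den = prefix_prob(history)              # P(S =>* w_1 ... w_k ...)
        print(v, num / den if den else 0.0)
```

For this toy grammar the script prints 0.42 for "det" and 0.18 for "n"; the probabilities do not sum to 1 because the sentence may also end after "v". The point of the article is to compute such prefix probabilities exactly and efficiently rather than by the enumeration used here.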