<?xml version="1.0" standalone="yes"?>
<Paper uid="P91-1038">
  <Title>A Preference-first Language Processor Integrating the Unification Grammar and Markov Language Model for Speech Recognition Applications</Title>
  <Section position="2" start_page="0" end_page="293" type="abstr">
    <SectionTitle>
1. Introduction
</SectionTitle>
    <Paragraph position="0"> In many speech recognition applications, a word lattice is a partially ordered set of possible word hypotheses obtained from an acoustic signal processor. The purpose of a language processor is then, for an input word lattice, to find the most promising word sequence or sentence hypothesis as the output (Hayes, 1986; Tomita, 1986; O'Shaughnessy, 1989). Conventionally either grammatical or statistcal approaches were used in such language processors. However, the high degree of ambiguity and large number of noisy word hypotheses in the word lattices usually make the search space huge and correct identification of the output sentence hypothesis difficult, and the capabilities of a language processor based on either grammatical or statistical approaches alone were very often limited. Because the features of these two approaches are basically complementary, Derouault and Merialdo (Derouault, 1986) first proposed a unified model to combine them. But in this model these two approaches were applied primarily separately, selecting the output sentence hypothesis based on the product of two probabilities independently obtained from these two approaches.</Paragraph>
    <Paragraph position="1">  In this paper a new language processor based on a recently proposed augmented chart parsing algorithm (Chien, 1990a) is presented, in which the grammatical approach of unification grammar (Sheiber, 1986) and the statistical approach of Markov language model (Jelinek, 1976) are properly integrated in a preference-first word lattice parsing algorithm. The augmented chart (Chien, 1990b) was extended from the conventional chart. It can represent a very complicated word lattice, so that the difficult word lattice parsing problem can be reduced to essentially a well-known chart parsing problem.</Paragraph>
    <Paragraph position="2"> Unification grammars, compared with other grarnmal~cal approaches, are more declarative and can better integrate syntactic and semantic information to eliminate illegal combinations; while Markov language models are in general both effective and simple. The new language processor proposed in this paper actually integrates the unification grammar and the Markov language model by a new preference-f'u-st parsing algorithm with various preference-first parsing strategies defined by different constituent construction principles and decision rules, such that the constituent selection and search directions in the parsing process can be more appropriately determined by Markovian probabilities, thus rejecting most noisy word hypotheses and significantly reducing the search space. Therefore the global structural synthesis capabilities of the unification grammar and the local relation estimation capabilities of the Markov language model are properly integrated. This makes the present language processor not sensitive at all to the increased number of noisy word hypotheses in a very large vocabulary environment. An experimental system for Mandarin speech recognition has been implemented (Lee, 1990) and tested, in which a very high correct rate of recognition (93.8%) was obtained at a very high processing speed (about 5 sec per sentence on an IBM PC/AT). This indicates significant improvements as compared to previously proposed models. The details of this new language processor will be presented in the following sections.</Paragraph>
  </Section>
class="xml-element"></Paper>