<?xml version="1.0" standalone="yes"?>
<Paper uid="C88-1035">
  <Title>Lexical Functional Grammar in Speech</Title>
  <Section position="2" start_page="0" end_page="173" type="intro">
    <SectionTitle>
1. Introduction
</SectionTitle>
    <Paragraph position="0"> The most important problem in all speech recognition systems is the inherent uncertainty associated with the acoustic-phonetic decoding process at the basis of such a system. One approach taken in many existing system to overcome these difficulties is to integrate higher level knowledge sources that have a certain a-priori knowledge about specific problem areas. Following this line of thought, the system architecture adopted in the IKAROS-project assumes different levels of knowledge (representations) e.g. acoustic parameters, phonemes, words, constituent structures etc. The interaction between these knowledge sources is controlled by a central blackboard control module (like in HEARSAY II).</Paragraph>
    <Paragraph position="1"> This whole system is embedded in an object-oriented environment and communication between the modules is realized by message passing.</Paragraph>
    <Paragraph position="2"> Within IKAROS particular attention is given to the problem of using the same knowledge representations both for data-driven bottom-up hypothesizing and expectation-driven top-down prediction and to the problem of providing a general framework of uncertainty management.</Paragraph>
    <Paragraph position="3"> According to this rationale, the main purpose of the syntax component is to constrain the number of word sequences to be dealt with in the recognition process and to predict or insert poorly recognized words. Grammaticaless in itself is of no importance to us. Quite to the contrary, in a real-live application a certain degree of error tolerance is a  In the syntax component of IKAROS we work within the formal framework of a probabilistic Lexical Functional Grammar. Certain modifications to the formalism as expounded in /Bresnan1982/ have been made to suit our purposes. We use as an implementation an event-driven chart-parser that  is capable of all the necessary parsing strategies i.e. top-down, bottom-up and left-to-right and right-to-left parsing.</Paragraph>
    <Paragraph position="4"> 2. Probabilistic context.free Grammars 2.1. The event-driven parser  The interaction between the blackboard manager and the syntax component is roughly as follows: the blackboard manager sends a message to the syntax component indicating that a particular word has been recognized (or rather &amp;quot;hypothesized&amp;quot;) at a certain position in the input stream (or in charto parser terminology with starting and ending verte~ ~ number) together with a certain numerical confidence score. The syntax component accumulates information about these (in arbitrary order) incoming word hypotheses and in turn posts hypotheses about predicted and recognized words or constituents on the blackboard. The job of the syntax component now is to decide between several conflicting (or competing) constituent structures stored in the chart i.e. to choose the best grammatical structure.</Paragraph>
    <Paragraph position="5"> 2.2. The formalism We assume a probabilistic context-free grammar G=&lt;VN, VT, R,S&gt;: VN denotes the nonterminal vocabulary Nonterminals are denoted by A, B, C ....</Paragraph>
    <Paragraph position="6"> strings of these by X, Y, Z...</Paragraph>
    <Paragraph position="7"> lexical categories by P, Q ....</Paragraph>
    <Paragraph position="8"> VT denotes the terminal vocabulary terminals (words) denoted by a, b, c .....</Paragraph>
    <Paragraph position="9"> strings of both types of symbols are denoted by w, x, y, z .</Paragraph>
    <Paragraph position="10"> R denotes the set of rules {R1, R2 ..... Ri} with each rule having the format</Paragraph>
    <Paragraph position="12"> where qi indicates the a-priori  Z p( xi a Q Yi &lt;-Ti- S ) probability for the application of this i rule ,,c.~ ,~. ~., n,, ~= ............... S denotes the initial symbol Lexical rtdes have the format</Paragraph>
    <Paragraph position="14"> In a probabilistic grammar, there is no clearcut dichotomy between grammatical and ungrammatical sentences. Rather, we can devise our langt~age model in such a way that more frequent phrases receive a higher probability than less frequ,mt ones. Even different word orders will have different probabilities.</Paragraph>
    <Paragraph position="15"> Now we are able to compute the a-priori p r o b a b i 1 i t y of a (partial) derivation T starting with the symbol S in the following recursive</Paragraph>
    <Paragraph position="17"> p(xYz &lt;-T- S)= p(xAy&lt;-S)*q , if there is a rule &lt; A -&gt; Y, q&gt; in R In our implementation, these a-priori probabilities are weighted with the scores delivered for individual words by the acoustic-phonetic componem to yield accumulated grammaticalacoustic scores for whole phrases. Quite the opposite problem arises in the analysis context when we ask for the (relative) probability of a given string y being derived by a particular derivation Tk (when there may be several different derivation histories Ti for the same string).</Paragraph>
    <Paragraph position="18"> We may comPute the a-posteriori derivation probability of a string y by using Bayes&amp;quot; Theorem</Paragraph>
    <Paragraph position="20"> As a specialization, this formula is of particular interest if we want to predict e.g. words or categories following or preceding a already recognized word etc. (This is useful for &amp;quot;island parsing&amp;quot; when only the most promising parses should be continued.) Consequently, the a-posteriori probability that the lexical category Q immediately follows the word &amp;quot;a&amp;quot; can be calculated as p(S &lt;- xaQy ): p( wj a Pj zj &lt;-Tj- S ) J All derivations appearing on the right side are minimal derivations for the substring &amp;quot;aQ&amp;quot; or &amp;quot;aPj&amp;quot; and the Pj's range ow~r all lexical categories in G (In the formula, of course, we assume p(waPz &lt;-- S) = 0 if the substring &amp;quot;alP&amp;quot; isn't derivable in G). This formula reflects the common probabilistic assumption that the derivation probability of a substring is the sum of all distinct alternative derivation probabilities of this string (if there is more than one possibility).</Paragraph>
    <Paragraph position="21"> 2.3. Example Grammar G1 The following toy grammar is designed to demonstrate the formalism. That it generates many unwanted sentences need not concern us here.</Paragraph>
    <Paragraph position="22"> Our grammar has the following rules  Let us assume the word &amp;quot;board&amp;quot; has been recognized somewhere in the input stream (but not at its end). We obtain the following a-priori probabilities for minimal derivations involving &amp;quot;board&amp;quot; with a subsequent lexical category</Paragraph>
    <Paragraph position="24"> Actually, there are no more minimal derivations of the desired type. We may now calculate the a-posteriori probability of V following the word</Paragraph>
    <Paragraph position="26"> The a-posteriori probability of the other (&amp;quot;conflicting&amp;quot;) possibility i.e. that a Q follows the word ',board&amp;quot; is p(# x board Q y # &lt;- S)= 1 - 0.32 = 0.68 In our implementation these a-posteriori probabilities can easily be computed from the derivation probabilities attached to the active edges in the chart parser.</Paragraph>
  </Section>
class="xml-element"></Paper>
Download Original XML