File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/91/h91-1011_concl.xml

Size: 3,880 bytes

Last Modified: 2025-10-06 13:56:38

<?xml version="1.0" standalone="yes"?>
<Paper uid="H91-1011">
  <Title>Modelling Context Dependency in Acoustic-Phonetic and Lexical Representations</Title>
  <Section position="6" start_page="74" end_page="75" type="concl">
    <SectionTitle>
DISCUSSION &amp; FUTURE PLANS
</SectionTitle>
    <Paragraph position="0"> While the experiments presented here only address local contextual effects, it is important to note that the mechanism that we have developed can account for both local contextual effects and more global contextual effects. Furthermore, the general approach we have taken not only allows us to account for contextual effects on the phonetic models, but also to alter the structure of the pronunciation networks to account for contextual effects. Admittedly, we have only experimented with context-dependent models in these recognition experiments. Even within the limited scope of the current experiments, however, we have achieved substantial performance improvements over our baseline system. In related work, we have experimented with altering the structure of pronunciation networks, resulting in substantial performance increases on the task of recognizing a small set of isolated words over the telephone network. We hope that when we extend the present experiments by altering the structure of the pronunciation networks and by considering more contextual effects, we will find further performance increases on the Resource Management task as well.</Paragraph>
    <Paragraph position="1"> In the present work we have kept the form of the input representation fixed. Since this particular transformation of the original acoustic dimensions was intended to allow us to model context-independent labels with rather simple diagonal Gaussian models, it may not be an appropriate input representation for the more flexible models discussed here.</Paragraph>
    <Paragraph position="2"> In particular, since we have so far found that we can achieve the best performance by using the context-normalized input dimensions (which assumes that the normalization can be carried out for each input dimension independently), we would now like to have input dimensions where context affects the dimensions independently. It is unlikely that the set of dimensions resulting from our current principal components analysis is the best input for this type of normalization. We are now beginning to experiment with applying the normalization to the original input dimensions, which should be more directly affected by contextual effects.</Paragraph>
    <Paragraph position="3"> We would also like to explore the use of distinctive features as the input representation since there is some evidence that this might be a better representation for accounting for contextual effects [12]. For example, in the environment of a nasal, we could expect the nasality feature of a vowel to be affected in a particular way whereas other features of the vowel would be affected by other contextual effects.</Paragraph>
    <Paragraph position="4"> Finally, if we account for context by making specific models for particular contexts (e.g., triphones or the context-dependent tree discussed above), we are constrained to some degree by the amount of training data we would have available to train each of these more specific models. This has led us in the past to use fairly simple and easily trained parametric distributions for these models.</Paragraph>
    <Paragraph position="5"> Accounting for context by normalizing the input dimensions reduces the need to split up the training data, and therefore should lead to more flexible and robust models for the labels in the lexicon. We have thus far presented results using mixture Gaussian models, but are now experimenting with other types of models and discriminators including multi-layer perceptrons and radial basis functions.</Paragraph>
  </Section>
</Paper>