<?xml version="1.0" standalone="yes"?> <Paper uid="W97-1009"> <Title>Evolution of a Rapidly Learned Representation for Speech</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Precocious abilities in newborn infants are frequently taken as evidence for pre-specification of the representations that support those abilities. The pre-specification of these representations is innately determined, presumably in the genome of the individual. One such ability is that of newborn infants to be universal listeners, able to discriminate speech contrasts of all languages. This is all the more remarkable since the low-pass filtered speech sounds that foetuses hear in utero vary widely between different languages.</Paragraph> <Paragraph position="1"> Eimas et al. (1971) showed that 1- to 4-month-old infants displayed categorical perception of the syllables /ba/ and /pa/. That is to say, infants carve up the phonetic space into a set of categories with sharp boundaries. Variants of a phoneme, such as /b/, are not discriminable, even though they differ acoustically by the same amount as /p/ and /b/ (although see Kuhl, 1993). More recent research has shown that the categories are universal, so that English-learning infants can discriminate non-native contrasts in Czech (Trehub, 1973), Hindi (Werker, Gilbert, Humphrey, &amp; Tees, 1981), Nthlakampx (Werker &amp; Tees, 1984a), Spanish (Aslin, Pisoni, Hennessy, &amp; Percy, 1981), and Zulu (Best, McRoberts, &amp; Sithole, 1988). This suggests that infants develop an initial representation of speech that is universal and largely insensitive to the particular language to which they are exposed. The ability to discriminate some non-native speech contrasts declines after the age of 10-12 months (Werker &amp; Tees, 1984a).</Paragraph> <Paragraph position="2"> Such rapid learning can be defined in terms of a taxonomy developed in the field of animal behaviour. 
Mayr (1974) suggested that programs of development form a continuum of flexibility in their response to environmental stimulation. He distinguished between &quot;open&quot; and &quot;closed&quot; programs of development. &quot;Closed&quot; programs of development rely on environmental input to a relatively small degree, producing highly stereotyped behaviour.</Paragraph> <Paragraph position="3"> Precedents of rapidly learned &quot;closed&quot; development abound and are also termed &quot;innately guided&quot; learning, e.g. imprinting in geese and ducks and song acquisition in birds (Marler, 1991). &quot;Open&quot; programs, on the other hand, are responsive to a much broader range of stimulation and can produce a broader range of responses. The presence of one type of developmental program does not preclude the existence of the other, however. Just because a duckling has imprinted on its mother does not mean that it is unable to learn to recognise new objects later in life. [Ramin Charles Nakisa and Kim Plunkett (1997) Evolution of a Rapidly Learned Representation for Speech. In T.M. Ellison (ed.) CoNLL97: Computational Natural Language Learning, ACL pp 70-79. © 1997 Association for Computational Linguistics]</Paragraph> <Paragraph position="4"> Similarly, the rapid learning of speech sounds by infants does not preclude later tuning of the speech representation. In fact, we would argue that it aids such development by ensuring that later language-specific fine-tuning of the representation does not encounter local minima, which would be catastrophic for linguistic development.</Paragraph> <Paragraph position="5"> To quote from a recent review (Jusczyk, 1992): Jusczyk and Bertoncini (1988) proposed that the development of speech perception be viewed as an innately guided learning process wherein the infant is primed in certain ways to seek out some type of signals as opposed to others. 
The innate prewiring underlying the infant's speech perception abilities allows for development to occur in one of several directions. The nature of the input helps to select the direction that development will take. Thus, learning the sound properties of the native language takes place rapidly because the system is innately structured to be sensitive to correlations of certain distributional properties and not others.</Paragraph> <Paragraph position="6"> To make explicit what is meant by &quot;innately guided learning&quot; and &quot;innate prewiring&quot;, we have developed a connectionist model of innately guided learning. The approach taken has been to encode an artificial neural network (ANN) in a genome which stores its architecture and learning rules. The genomic space of possible ANNs is searched for networks that are well suited to the task of rapidly learning to detect contrastive features of human speech sounds using unsupervised learning. Importantly, networks start life with a completely randomized set of connections and therefore have no representational knowledge about speech at the level of individual connections. The network must therefore use its architecture and learning rules in combination with auditory input to rapidly converge on a representation.</Paragraph> <Paragraph position="7"> The model attempts to explain how innate constraints on a neural network could allow infants to be sensitive to a wide range of features so soon after birth, and to develop the same initial features whatever their target language. It also exhibits other features typically associated with human speech perception, namely categorical perception and patterns of phoneme confusability similar to those of humans.</Paragraph> <Paragraph position="8"> The model does not account directly for the much slower, roughly year-long process by which some featural distinctions are lost. 
It is possible that features are never lost and that units which represent information that is redundant in the target language are ignored by higher level processing, as suggested by Werker and Tees (1984b).</Paragraph> </Section> </Paper>