<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-1103">
  <Title>Self Organisation in Vowel Systems through Imitation</Title>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 The System
</SectionTitle>
    <Paragraph position="0"> The agents in the simulation are equipped with a speech synthesiser, a speech perception system and a list of phonemes. It should be stressed that the agents are not restricted to any particular natural language. The speech synthesiser is capable of generating all simple vowels. It takes as input the three major vowel features: tongue position, tongue height and lip rounding. Its output consists of the first four formant frequencies of the vowel that would be generated by the specified articulator positions. The production model is based on an interpolation of artificially generated formant patterns of 18 different vowels taken from (Vallée, 1994, pages 162-164). A certain amount of noise is added to the formant frequencies: they are shifted up or down by a random percentage. The speech perception system is based on a model developed by (Boë et al., 1995), who based their system on a substantial number of observations of human speech perception. In this model, low-frequency formants are considered to be more salient than high-frequency formants, and if two formants are close together, they are perceived approximately as one formant with an intermediate frequency. These characteristics ensure that the agents perceive formant patterns as similar if humans would also perceive them as similar. Both the speech synthesiser and the speech perception system are described in more detail in (de Boer, 1997). The agents start with an empty phoneme list: they know no phonemes at all. They learn their phonemes through interactions with each other.</Paragraph>
    <Paragraph position="1"> The shape of the resulting vowel system will be determined to a small extent by chance and for the most part by self-organisation under acoustic and articulatory constraints.</Paragraph>
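    <Paragraph> As an illustration of the noise that the synthesiser adds to the formant frequencies, the following Python sketch shifts each formant up or down by a random percentage. The function name add_formant_noise, the uniform distribution and the example formant values are assumptions made for this sketch; the description above states only that the shift is a random percentage.

```python
import random


def add_formant_noise(formants, max_noise=0.10, rng=random):
    """Shift each formant up or down by a random percentage.

    max_noise is the largest relative shift (0.10 means up to 10 percent).
    Drawing the shift from a uniform distribution is an assumption.
    """
    return [f * (1.0 + rng.uniform(-max_noise, max_noise)) for f in formants]


if __name__ == "__main__":
    # Illustrative formant values in Hz, not taken from the paper.
    clean = [300.0, 2200.0, 2900.0, 3500.0]
    print(add_formant_noise(clean, max_noise=0.10, rng=random.Random(0)))
```

    With max_noise=0.10, every returned formant lies within 10% of its input value.</Paragraph>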
    <Paragraph position="2"> The interactions between the agents are called imitation games. For each imitation game, two agents are chosen randomly from the population. One agent will initiate the game and is called the initiator, the other one is called the imitator. The initiator randomly chooses a phoneme from its phoneme list, or creates a new phoneme randomly if its phoneme list is empty. It then generates the corresponding sound (the formant pattern). The imitator listens to this sound, and analyses it in terms of its own phonemes. It tries to find among its own phonemes the phoneme whose formant pattern most closely resembles the sound it just heard. If its phoneme list is empty, it generates a new phoneme. The imitator then generates the sound that corresponds to its best matching phoneme. The initiator listens to this sound and also analyses it in terms of its own phonemes.</Paragraph>
    <Paragraph position="3"> It then checks whether the phoneme that most closely matches the sound it just heard is the same as the phoneme it originally said. If they are the same, the imitation game is successful.</Paragraph>
    <Paragraph position="4"> If they are not the same, the game is unsuccessful. Depending on the outcome of the language game, the imitator undertakes a number of actions. If the language game was successful, it shifts the phoneme it said in such a way that it will sound more like the sound it just heard.</Paragraph>
    <Paragraph position="5"> This is done by making slight changes to the phoneme and by checking whether these increase the resemblance. The change that most increases the resemblance is kept. This procedure is called hill climbing in artificial intelligence, and it is comparable to making sounds to oneself in order to learn how to pronounce a given sound.</Paragraph>
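    <Paragraph> The hill-climbing step described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in the experiments: a phoneme is taken to be a tuple of three articulatory parameters in the range 0 to 1, synthesise is a hypothetical stand-in for the speech synthesiser, Euclidean distance stands in for the perceptual distance, and the step size of 0.05 is arbitrary.

```python
import itertools
import math


def distance(a, b):
    """Euclidean distance; a stand-in for the perceptual distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hill_climb_step(phoneme, heard, synthesise, step=0.05):
    """Try slight changes to the phoneme and keep the one that helps most.

    phoneme:    tuple (position, height, rounding), each assumed in [0, 1].
    heard:      the formant pattern the agent wants to approach.
    synthesise: hypothetical function mapping a phoneme to a formant pattern.
    Returns the original phoneme if no change increases the resemblance.
    """
    best = phoneme
    best_dist = distance(synthesise(phoneme), heard)
    for deltas in itertools.product((-step, 0.0, step), repeat=len(phoneme)):
        # Clamp every parameter to the assumed [0, 1] range.
        candidate = tuple(
            min(1.0, max(0.0, p + d)) for p, d in zip(phoneme, deltas)
        )
        dist = distance(synthesise(candidate), heard)
        if best_dist > dist:
            best, best_dist = candidate, dist
    return best
```

    Repeating this step until the phoneme no longer changes yields a local optimum, which is the usual stopping point for hill climbing.</Paragraph>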
    <Paragraph position="6"> If the imitation game was unsuccessful, the agent can either create a new phoneme or shift the old phoneme, depending on whether the phoneme it used for imitating the sound had previously been successful or not. The success of a phoneme is calculated by keeping track of the number of times a phoneme was used in an imitation game (both by initiator and by imitator) and the number of times the imitation game in which the phoneme was used was successful.</Paragraph>
    <Paragraph position="7"> The ratio between these numbers is used as a measure of success of the phoneme.</Paragraph>
    <Paragraph position="8"> If the phoneme has been unsuccessful, it is shifted to resemble more closely the sound that was heard. If it has been successful, however, it is assumed that the failure of the imitation game was caused by the fact that two phonemes are confused. The initiator has two phonemes that are matched by only one phoneme in the imitator. Hence the imitator creates a new phoneme that closely resembles the sound that was heard.</Paragraph>
    <Paragraph position="9"> This usually resolves the confusion.</Paragraph>
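    <Paragraph> One round of the imitation game, including the imitator's reaction, can be sketched in Python as follows. The sketch simplifies heavily and every name in it is hypothetical: a phoneme is stored directly as a point in acoustic space rather than as articulatory parameters, Euclidean distance replaces the perceptual model, no noise is added, shifting is done by linear interpolation instead of hill climbing, a new phoneme simply copies the heard sound, and the threshold of 0.5 for a "successful" phoneme is an arbitrary choice.

```python
import math
import random


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def new_phoneme(sound):
    return {"sound": list(sound), "uses": 0, "successes": 0}


def closest(phonemes, sound):
    return min(phonemes, key=lambda p: distance(p["sound"], sound))


def success_ratio(p):
    return p["successes"] / p["uses"] if p["uses"] else 0.0


def shift_towards(p, sound, rate=0.1):
    # Assumption: a small linear move stands in for hill climbing.
    p["sound"] = [s + rate * (t - s) for s, t in zip(p["sound"], sound)]


def imitation_game(initiator, imitator, rng=random, threshold=0.5):
    """Play one game between two phoneme lists; return True on success."""
    if not initiator:
        # Illustrative formant ranges in Hz, not taken from the paper.
        initiator.append(
            new_phoneme([rng.uniform(200, 800), rng.uniform(600, 2500)])
        )
    said = rng.choice(initiator)
    if not imitator:
        imitator.append(new_phoneme(said["sound"]))
    reply = closest(imitator, said["sound"])
    # Success: the initiator maps the reply back onto the phoneme it said.
    success = closest(initiator, reply["sound"]) is said
    previously_successful = success_ratio(reply) >= threshold
    for p in (said, reply):
        p["uses"] += 1
        p["successes"] += int(success)
    if success or not previously_successful:
        shift_towards(reply, said["sound"])
    else:
        # A good phoneme failed: assume two phonemes were confused.
        imitator.append(new_phoneme(said["sound"]))
    return success
```

    Each agent is represented here only by its phoneme list; calling imitation_game repeatedly on randomly chosen pairs of lists gives the overall simulation loop.</Paragraph>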
    <Paragraph position="10"> Two more processes are taking place in the agents. First of all, an agent's phonemes that resemble each other too closely are merged.</Paragraph>
    <Paragraph position="11"> Two phonemes are merged by keeping the more successful one and discarding the less successful one. The success score of the remaining phoneme is calculated by adding the use and success counts of the original phonemes. Secondly, phonemes whose success/use ratio is too low are discarded. This causes bad phonemes to disappear eventually from the phoneme repertoire of the agents.</Paragraph>
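    <Paragraph> The two housekeeping processes, merging and discarding, can be sketched in Python as follows. The dictionary layout, the function names, the Euclidean distance and the min_uses grace period are assumptions made for this sketch; the description above does not give the thresholds that were actually used.

```python
import math


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def success_ratio(p):
    return p["successes"] / p["uses"] if p["uses"] else 0.0


def merge_close_phonemes(phonemes, min_distance):
    """Merge phonemes closer than min_distance, keeping the more successful.

    The kept phoneme absorbs the use and success counts of the one that is
    thrown away. Note that the kept dictionaries are updated in place.
    """
    merged = []
    for p in sorted(phonemes, key=success_ratio, reverse=True):
        for kept in merged:
            if min_distance > distance(kept["sound"], p["sound"]):
                kept["uses"] += p["uses"]
                kept["successes"] += p["successes"]
                break
        else:
            # No sufficiently close phoneme has been kept so far.
            merged.append(p)
    return merged


def discard_unsuccessful(phonemes, min_ratio, min_uses=5):
    """Drop phonemes whose success ratio is below min_ratio.

    min_uses is an assumption: it lets a new phoneme take part in a few
    games before it can be discarded.
    """
    return [
        p for p in phonemes
        if min_uses > p["uses"] or success_ratio(p) >= min_ratio
    ]
```

    Merging before discarding means that a phoneme inherits the counts of its near-duplicate before its own ratio is judged; the opposite order would be equally consistent with the description.</Paragraph>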
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 The Experiments
</SectionTitle>
    <Paragraph position="0"> A large number of experiments have been done with the system described above. Experiments have been performed with varying numbers of agents and under various conditions of noise. The system consistently produced populations of agents that were able to imitate each other successfully with the vowel systems that emerged. These vowel systems showed remarkable similarities with vowel systems found in human languages.</Paragraph>
    <Paragraph position="1"> A typical example of the vowel systems of a population of 20 agents, with a maximum of 10% noise on the formant frequencies, is given in figure 1. This figure is an acoustic representation of all the phonemes of all the agents in the population. In this figure a number of clear clusters can be discerned. Almost all the phonemes of the agents tend to appear in one of the seven clusters. In addition, almost all agents have a phoneme in the six largest clusters. Only in the small cluster in the lower left corner, representing [u], do few agents have a phoneme. This is probably because this phoneme has recently been created, and not all agents have been able to make an imitation yet. [Figure caption fragment, apparently for figure 1: "systems of a population of twenty agents after 2000 imitation games."]</Paragraph>
    <Paragraph position="2"> The vowel systems that emerge from the imitation games are not static. They are constantly changing as new phonemes are formed and old phonemes shift through the available acoustic space. This process is illustrated in figure 3, the result of a different simulation with the same starting conditions (twenty agents and 10% noise) but with slightly different random influences. In this figure we see two vowel systems that are snapshots of one population of agents, taken 1000 language games apart. We see that clusters move through the acoustic space and that clusters tend to compact. However, a certain distance appears to be kept between the clusters. Also, the clusters seem to remain spread over a certain area; they do not reduce to points completely.</Paragraph>
    <Paragraph position="3"> [Figure caption fragment, apparently for figure 2: "twenty agents that communicate with 20% noise. The three-vowel system has been stable for over 1000 language games."]</Paragraph>
    <Paragraph position="4"> Under various conditions of noise, systems with different numbers of clusters emerge. If the amount of noise is increased, systems with fewer clusters are generated (an example is given in figure 2). However, the success of the imitation games stays approximately the same.</Paragraph>
    <Paragraph position="5"> Also the number of agents does not seem to matter much. Experiments with five to forty agents have all resulted in stable systems. Furthermore, the systems seem to be resistant to population change. If old agents are removed at random, and new empty agents are added at random, the vowel systems remain stable.</Paragraph>
    <Paragraph position="6"> The empty agents will rapidly learn the existing phonemes by imitating more experienced agents. If the inflow of new agents becomes too large, however, instability arises.</Paragraph>
  </Section>
</Paper>