<?xml version="1.0" standalone="yes"?>
<Paper uid="P97-1055">
  <Title>Paradigmatic Cascades: a Linguistically Sound Model of Pronunciation by Analogy</Title>
  <Section position="5" start_page="431" end_page="432" type="metho">
    <SectionTitle>
3 Experimental Results
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="431" end_page="431" type="sub_section">
      <SectionTitle>
3.1 Experimental Design
</SectionTitle>
      <Paragraph position="0"> We have evaluated this algorithm on two different pronunciation tasks. The first experiment consists in inferring the pronunciation of the 70 pseudo-words originally used in Glushko's experiments, which have been used as a test-bed for various other pronunciation algorithms and allow for a fair head-to-head comparison between the paradigmatic cascades model and other analogy-based procedures. For this experiment, we have used the entire nettalk (Sejnowski and Rosenberg, 1987) database (about 20 000 words) as the learning set.</Paragraph>
      <Paragraph position="1"> The second series of experiments is intended to provide a more realistic evaluation of our model in the task of pronouncing unknown words. We have used the following experimental design: 10 pairs of disjoint (learning set, test set) are randomly selected from the nettalk database and evaluated. In each experiment, the test set contains about a tenth of the available data. A transcription is judged to be correct when it matches exactly the pronunciation listed in the database at the segmental level. The number of correct phonemes in a transcription is computed on the basis of the string-to-string edit distance with the target pronunciation. For each experiment, we measure the percentage of phonemes and words that are correctly predicted (referred to as correctness), and two additional figures, which are usually not significant in the context of the evaluation of transcription systems. Recall that our algorithm, unlike many other pronunciation algorithms, is likely to remain silent. In order to take this aspect into account, we measure in each experiment the number of words that cannot be pronounced at all (the silence), and the percentage of phonemes and words that are correctly transcribed amongst those words that have been pronounced at all (the precision). The average values for these measures are reported hereafter.</Paragraph>
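The metrics above (correctness, silence, precision) can be sketched as follows. This is an illustrative reimplementation, not the paper's own code; the helper names are ours, and the per-phoneme count derived from the edit distance is one common approximation.

```python
# Sketch of the evaluation metrics: a word is correct only on an exact
# match; per-phoneme scores are derived from the string edit distance.

def edit_distance(a, b):
    """Standard Levenshtein distance between two phoneme strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def evaluate(predictions, targets):
    """predictions[i] is a predicted pronunciation, or None if silent."""
    n = len(targets)
    pronounced = [(p, t) for p, t in zip(predictions, targets) if p is not None]
    word_ok = sum(p == t for p, t in pronounced)
    # Approximate per-phoneme correctness: target phonemes minus edits.
    phon_ok = sum(max(len(t) - edit_distance(p, t), 0) for p, t in pronounced)
    phon_total = sum(len(t) for t in targets)
    return {
        "word_correctness": word_ok / n,                 # over all test words
        "silence": 1 - len(pronounced) / n,              # unpronounced fraction
        "word_precision": word_ok / len(pronounced) if pronounced else 0.0,
        "phoneme_correctness": phon_ok / phon_total,
    }
```

Precision differs from correctness only in its denominator: it discards the silent words, which is why a silent-prone analogical model can show a large gap between the two.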
    </Section>
    <Section position="2" start_page="431" end_page="432" type="sub_section">
      <SectionTitle>
3.2 Pseudo-words
</SectionTitle>
      <Paragraph position="0"> All but one of the pseudo-words in Glushko's test set could be pronounced by the paradigmatic cascades algorithm, and amongst the 69 pronunciations suggested by our program, only 9 were incorrect (that is, were not proposed by human subjects in Glushko's experiments), yielding an overall correctness of 85.7% and a precision of 87.3%.</Paragraph>
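As a sanity check on the per-word arithmetic (a sketch, not the paper's code): 70 pseudo-words, 1 left silent, 9 of the 69 suggested pronunciations rejected. The reported 87.3% precision is presumably rounded or counted at a slightly finer grain; the straightforward per-word ratio comes out near 87.0%.

```python
# Per-word figures for the Glushko pseudo-word experiment.
total, pronounced, incorrect = 70, 69, 9
correct = pronounced - incorrect       # 60 accepted pronunciations
correctness = correct / total          # fraction over all 70 items
precision = correct / pronounced       # fraction over pronounced items only
```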
      <Paragraph position="1"> An important property of our algorithm is that it makes it possible to precisely identify, for each pseudo-word, the lexical entries that have been analogized, i.e.</Paragraph>
      <Paragraph position="2"> whose pronunciations were used in the inferential process. Looking at these analogs, it appears that three of our errors are grounded in very sensible analogies, and provide us with pronunciations that seem at least plausible, even if they were not suggested in Glushko's experiments. These were pild and bild, analogized with wild, and pomb, analogized with tomb.</Paragraph>
      <Paragraph position="3"> These results compare favorably with the performances reported for other pronunciation-by-analogy algorithms ((Damper and Eastmond, 1996) reports very similar correctness figures), especially if one remembers that our results have been obtained without resorting to any kind of pre-alignment between the graphemic and phonemic strings in the lexicons.</Paragraph>
    </Section>
    <Section position="3" start_page="432" end_page="432" type="sub_section">
      <SectionTitle>
3.3 Lexical Entries
</SectionTitle>
      <Paragraph position="0"> This second series of experiments is intended to provide us with more realistic evaluations of the paradigmatic cascades model. Glushko's pseudo-words have been built by substituting the initial consonant of existing monosyllabic words, and therefore constitute an over-simplistic test-bed. The nettalk dataset contains plurisyllabic words, complex derivatives, loan words, etc., and allows us to test the ability of our model to learn complex morpho-phonological phenomena, notably vocalic alternations and other kinds of phonologically conditioned root allomorphy, that are very difficult to learn.</Paragraph>
      <Paragraph position="1"> With this new test set, the overall performance of our algorithm averages at about 54.5% of entirely correct words, corresponding to a 76% per-phoneme correctness. If we keep the words that could not be pronounced at all (about 15% of the test set) apart from the evaluation, the per-word and per-phoneme precision improve considerably, reaching respectively 65% and 93%. Again, these precision results compare relatively well with the results achieved on the same corpus using other self-learning algorithms for grapheme-to-phoneme transcription (e.g. (van den Bosch and Daelemans, 1993; Yvon, 1996a)), which, unlike ours, benefit from the knowledge of the alignment between graphemic and phonemic strings. Table 3 summarizes the performance (in terms of per-word correctness, silence, and precision) of various other pronunciation systems, namely PRONOUNCE (Dedina and Nusbaum, 1991), DEC (Torkkola, 1993), and SMPA (Yvon, 1996a). All these models have been tested using exactly the same evaluation procedure and data (see (Yvon, 1996b), which also contains an evaluation performed with a French database suggesting that this learning strategy effectively applies to other languages).
[Table 3: per-word correctness (corr.), precision (prec.), and silence for each system]
Table 3 pinpoints the main weakness of our model, that is, its significant silence rate. A careful examination of the words that cannot be pronounced reveals that they are either loan words, which are very isolated in an English lexicon, and for which no analog can be found; or complex morphological derivatives for which the search procedure is stopped before the existing analog(s) can be reached. Typical examples are: synergistically, timpani, hangdog, oasis, pemmican, to list just a few. This suggests that the words which were not pronounced are not randomly distributed.
Instead, they mostly belong to a linguistically homogeneous group, the group of foreign words, which, for lack of better evidence, are better left silent, or processed by another pronunciation procedure (for example a rule-based system (Coker, Church, and Liberman, 1990)), than incorrectly analogized.</Paragraph>
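The relation between the figures quoted above can be made explicit: precision is just correctness restricted to the words actually pronounced. The values below are taken from the text (the 15% silence rate is approximate), so this is an illustrative back-of-the-envelope check rather than the paper's computation.

```python
# Precision as correctness over the non-silent words.
word_correctness = 0.545   # entirely correct words, over the whole test set
silence = 0.15             # fraction of test words left unpronounced (approx.)
word_precision = word_correctness / (1.0 - silence)   # ~0.64, cf. the ~65% reported
```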
      <Paragraph position="2"> Some complementary results finally need to be mentioned here, in relation to the size of lexical neighbourhoods. In fact, one of our main goals was to define in a sensible way the concept of a lexical neighbourhood: it is therefore important to check that our model manages to keep this neighbourhood relatively small. Indeed, while this neighbourhood can be quite large (typically 50 analogs) for short words, the number of analogs used in a pronunciation averages at about 9.5, which shows that our definition of a lexical neighbourhood is sufficiently restrictive.</Paragraph>
    </Section>
  </Section>
</Paper>