<?xml version="1.0" standalone="yes"?>
<Paper uid="J97-4001">
  <Title>Algorithms for Grapheme-Phoneme Translation for English and French: Applications for Database Searches and Speech Synthesis</Title>
  <Section position="2" start_page="496" end_page="500" type="metho">
    <SectionTitle>
2. Dictionary versus Letter-to-Sound Rules
</SectionTitle>
    <Paragraph position="0"> Any procedure to convert text into phonemes would necessarily make use of a lexical database or dictionary to provide for lookup of words prior to letter-to-sound conversion. Such a database typically consists of words that exhibit unusual stress patterns (for languages such as English), and of unassimilated or partially assimilated loan-words including place names and personal names that do not fit into the canonical phonological or phonotactic form of the language.</Paragraph>
    <Paragraph position="1"> Memory is becoming less expensive, and we now have the capability to store in memory a large number of words (along with their phonetic equivalent, grammatical class, and meaning). Why not then store all words (or certainly all of the words that would be commonly encountered in text) in memory? First, if we include derived forms and technical jargon, there are well over three-quarters of a million words in the English or French language. It would be an extremely difficult task to create such a list. More importantly, new words come into the language every day and from these are generated many derived forms. Lastly, when we factor in items that may not even be found in a dictionary, such as proper nouns (first names, surnames, place names, names of corporations, etc.), the necessity of a rule-governed approach quickly becomes apparent. For example, there are roughly 1.5 million different surnames in the US alone (Spiegel 1985; Spiegel and Machi 1990; Vitale 1991); moreover, one-third of these surnames are unique in that they are singletons. In fact, at this stage in the technology, it is still the rule set and not the dictionary that is dominant, although this is beginning to change, primarily due to the need for more complex lexical entries containing information on syntax, semantics, and even pragmatics for more natural prosody in text-to-speech tasks.</Paragraph>
    <Paragraph position="2"> It is difficult and time consuming to place all derived forms in the dictionary, including singular and plural forms and all verb affixes, especially for a language like French where a verb can expand, depending on the conjugation, into about fifty strings consisting of the root plus suffixes. Code could, of course, be added in the dictionary modules providing information on how to form the plurals or conjugations.</Paragraph>
    <Paragraph position="3"> The lookup procedure could then strip some of the affixes to retrieve the root in the dictionary.</Paragraph>
    <Paragraph position="4"> There do exist letter-to-sound systems based on very large dictionaries (for French, see Laporte \[1988\]) but they require a great deal of memory, especially if the lexical entries contain graphemic, phonetic, syntactic, and semantic information. The main advantage is that this dictionary can then be used to drive a sentence tagger and parser necessary for improving intonation and naturalness for speech synthesis. This universal electronic dictionary could also be used for speech recognition and machine translation. Today, most speech synthesizers do not include such a large dictionary, which, in any case, must be complemented by a set of rules just in case the word or the proper name is not in the dictionary.</Paragraph>
    <Paragraph position="5">  3. Grapheme-to-Phoneme Conversion Problems for Both English and French  In this section, we describe the problems encountered when converting from graphemes to phonemes for English and French. Some problems are common to both languages; others are specific to one language or the other.</Paragraph>
    <Paragraph position="6">  Computational Linguistics Volume 23, Number 4 each of which retains its pronunciation. Usually, in French, s between two vowels is pronounced \[z\], otherwise \[s\]. The s in tournesol, entresol, télésiège, vraisemblable, contresens, antisocial must be considered the beginning of a morpheme and, although it occurs between two vowels, is pronounced \[s\]. This morpheme decomposition is difficult and is sometimes based on a large dictionary of morphs; some implementations have had as many as 12,000 morphs for English (Allen et al., 1979). For English and French, the number of words having this problem is relatively small, and they can be dealt with by a dictionary or by rules. In the English implementation, for example, many such morphemes can be incorporated directly into the letter-to-sound rule set itself. For certain other languages, such as German, where word compounding is quite common, morpheme decomposition algorithms tend to be much more complex.</Paragraph>
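    <Paragraph>The French rule for s described above can be sketched as follows. This is only an illustration, not the authors' implementation: the vowel set, the prefix list PREFIX_MORPHS, and the function name pronounce_s are assumptions introduced here, and a real system would need a far larger morph inventory.

<![CDATA[
```python
# Assumed vowel letters; a real system would work from a fuller grapheme table.
VOWELS = set("aeiouyéèêàâîôû")

# Hypothetical list of prefix morphs after which a morpheme boundary is assumed.
PREFIX_MORPHS = ("tourne", "entre", "télé", "vrai", "contre", "anti")


def pronounce_s(word, i):
    """Return 'z' or 's' for the letter s at index i of word."""
    assert word[i] == "s"
    # An s that begins a morpheme keeps its [s], even between two vowels.
    if any(word.startswith(p) and len(p) == i for p in PREFIX_MORPHS):
        return "s"
    between_vowels = (
        0 < i < len(word) - 1
        and word[i - 1] in VOWELS
        and word[i + 1] in VOWELS
    )
    return "z" if between_vowels else "s"
```
]]>

With these assumptions, the s of rose and maison comes out as z, while the s of tournesol and antisocial stays s because it follows a listed prefix morph.</Paragraph>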
    <Section position="1" start_page="497" end_page="497" type="sub_section">
      <SectionTitle>
3.4 Homographs
</SectionTitle>
      <Paragraph position="0"> Homographs are pairs of words that are orthographically identical but phonetically different. In English, this difference is often simply a difference in stress depending on the grammatical category of the word: permit (\['pɝmɪt\] noun vs. \[pɚ'mɪt\] verb), baton (\['bætɑn\] noun vs. \[bə'tɑn\] verb), arithmetic (\[ə'rɪθmətɪk\] noun vs. \[ærɪθ'mɛtɪk\] adjective), and so on. However, it can also be a difference of one or more segments: deliberate (\[dɪ'lɪbərɪt\] adjective vs. \[dɪ'lɪbəreɪt\] verb) and use (\[juːs\] noun vs. \[juːz\] verb) each differ in only one segment. Further, it is not always possible to resolve the ambiguity from part of speech: in I read books, the pronunciation of read (\[riːd\] or \[rɛd\]) is ambiguous. A less frequently examined category, but one that is crucial to more natural speech synthesis, is what we will refer to as functor homographs. These are more subtle variations found in pairs such as can, which could be a verb (\[kæn\]) or a modal auxiliary (\[kn̩\] or \[kɨn\] ~ \[kæn\]); just, which could be an adjective (\[dʒʌst\]) or an adverb (\[dʒɨst\] ~ \[dʒʌst\]), etc., where there is partial overlapping in careful speech.</Paragraph>
      <Paragraph position="1"> See Yarowsky (1994) on homograph disambiguation.</Paragraph>
      <Paragraph position="2"> In French, the situation is similar. The same spelling can produce different phonemic forms: fils (\[fis\] 'son' vs. \[fil\] 'thread'); président (\[prezidɑ̃\] 'president' vs. \[prezid\] 'they preside'), etc. The pronunciation typically depends on the grammatical category of the word: fier ('proud' or 'to trust'), est ('is' or 'East'), couvent ('convent' or 'they brood'), notions ('we were noting' or 'the notions'), as ('an ace' or 'you have') are all ambiguous in terms of their pronunciation. The word six can be pronounced \[sis\] (j'en veux six), \[siz\] (six enfants), or \[si\] (six filles). First-order context can sometimes solve the problem (nous notions vs. des notions; un as vs. tu as), but, generally, a parsing of the entire sentence is required. The ambiguity is often between a conjugated verb and another grammatical category. The entire sentence can be ambiguous, as in "les fils sont jolis", where fils is pronounced differently depending on the meaning (sons or threads).</Paragraph>
    </Section>
    <Section position="2" start_page="497" end_page="500" type="sub_section">
      <SectionTitle>
3.5 Stress
</SectionTitle>
      <Paragraph position="0"> For English, due to the interaction of stress and vowel reduction, knowing the stressed syllable is often crucial in determining the correct phoneme sequence (Halle and Keyser 1971). For instance, a word like aggravation has three tokens of the vowel grapheme a, but all are phonetically different. The vowel nucleus of the first syllable is \[æ\]; the stressed syllable va is manifested by \[eɪ\]; and the vowel nucleus of the unstressed syllable gra (in this case) undergoes automatic vowel reduction and is realized as \[ə\]. The stress pattern for English is difficult to predict and has to be learned.</Paragraph>
      <Paragraph position="1"> Nevertheless, some basic rules exist. We have seen the verb/noun homographs in the previous section. In words of two syllables, the verb has stress on the second syllable, the noun on the first.</Paragraph>
      <Paragraph position="2">  such as le 'the' and de 'of'. If the last syllable ends in a consonant cluster followed by e, and the next word begins with a consonant, a short \[ə\] is heard, as in les chèvres de \[le ʃɛvrədə\] 'the goats of'; otherwise, more than two consonants would be in the same consonant cluster, which presents articulatory difficulty in French and violates the constraints on syllable structure. Elision can be done in the first syllable of a word, but is considered familiar (vs. normal) style: petit \[pti\] 'small', recommencer \[rkɔmɑ̃se\] 'to begin again'.</Paragraph>
      <Paragraph position="3"> In the middle of a word, elision is done for words such as tellement \[tɛlmɑ̃\] 'so much' but not for justement \[ʒystəmɑ̃\] 'precisely', which is additional support for the three-consonant cluster (CCC) constraint (footnote 6). This elision sometimes does not occur, as in poetry reading, for example.</Paragraph>
      <Paragraph position="4"> The rule does not provide for words like batelier \[batəlje\] 'boatman', or bachelier \[baʃəlje\] 'bachelor', where elision is not done. The semivowel \[j\] can be considered a consonant, and the three-consonant cluster constraint applies.</Paragraph>
      <Paragraph position="5"> Sometimes, a \[ə\] phoneme is added between two words. For instance, in the newspaper name Ouest-France \[wɛst#frɑ̃s\], an epenthetic \[ə\] vowel is often inserted: \[wɛstə#frɑ̃s\]. This happens between two words, in the context CC#C, and is again the result of the difficulty of pronouncing more than two consecutive consonants.</Paragraph>
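      <Paragraph>The CC#C context for the epenthetic vowel can be expressed as a small check on the phonemes at a word boundary. The sketch below is an assumption-laden illustration: the ASCII phoneme labels (with "@" standing for the schwa), the consonant set, and the function name join_words are invented here, and the insertion is made unconditional although the text says only that it is often made.

<![CDATA[
```python
# Hypothetical ASCII phoneme labels; "@" stands for the schwa.
CONSONANTS = set("ptkbdgfvszmnlrjw") | {"S", "Z"}


def join_words(left, right):
    """Join two phoneme lists, inserting a schwa in the context CC#C."""
    needs_schwa = (
        len(left) >= 2
        and left[-1] in CONSONANTS
        and left[-2] in CONSONANTS
        and bool(right)
        and right[0] in CONSONANTS
    )
    return left + (["@"] if needs_schwa else []) + right
```
]]>

For Ouest-France, joining the lists for ouest and France yields the schwa between t and f; a first word ending in a vowel is joined without any insertion.</Paragraph>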
    </Section>
    <Section position="3" start_page="500" end_page="500" type="sub_section">
      <SectionTitle>
3.8 Segmental Phonology and Speech Rate
</SectionTitle>
      <Paragraph position="0"> These rules are generic rules and sometimes may not apply in unusual cases, such as in very slow speech where each word is pronounced or in poetry (which often has its own set of rules different from normal speech). Thus far, in the area of speech synthesis, at least, not much has been done to modify segmental phonology according to speech rate.</Paragraph>
      <Paragraph position="1"> In English, when the speech rate exceeds a certain threshold, in natural speech, pauses disappear and segmental durations become shortened. In the future, in text-to-speech systems, some segments and even syllables will disappear entirely and certain functors will be greatly attenuated. See Dirksen and Coleman (1994) for more on speech rate.</Paragraph>
      <Paragraph position="2"> In French, in words containing a semivowel followed by a vowel, if the speech rate is slow enough (or sometimes in poetic contexts), a semivowel could be produced as a vowel: lui 'him' (\[lɥi\] vs. \[lyi\]), nuage 'cloud' (\[nɥaʒ\] vs. \[nyaʒ\]), lier 'to bind' (\[lje\] vs. \[lie\]). A common phrase such as parce que 'because', which is typically two syllables in normal speech (\[parskə\]), becomes three syllables in very slow or emphatic speech (\[parsəkə\]). In fast speech, the phrase je te le dirai \[ʒətələdirɛ\] 'I will tell you' is pronounced je t'le dirai \[ʒətlədirɛ\] or j'tel dirai \[ʒtəldirɛ\], eliding one or two \[ə\].</Paragraph>
    </Section>
    <Section position="4" start_page="500" end_page="500" type="sub_section">
      <SectionTitle>
3.9 Proper Names
</SectionTitle>
      <Paragraph position="0"> For proper names, the correspondence between written names and their pronunciation is even more difficult to specify due to their disparate origins. In English (whether British or American), there are many different ethnic groups represented in a telephone book or database of names. In a typical American telephone book, for example, there are names that originate from hundreds of languages. In France, when a person is asked to provide a proper name, he or she is also often asked to spell it. For cities like Caen (\[kɑ̃\]), Rennes (\[rɛn\]), Reims (\[rɛ̃s\]), etc., the pronunciation differs substantially from the spelling. In proper names like Lesage, Després, Bourgneuf, Montrouge, Lesventes, it is important to recognize the morphemes Le, Des, Bourg, Mont in order to transcribe them correctly. In</Paragraph>
    </Section>
  </Section>
  <Section position="3" start_page="500" end_page="502" type="metho">
    <SectionTitle>
6 In French, la règle des 3 consonnes.
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="502" end_page="502" type="sub_section">
      <SectionTitle>
Divay and Vitale Grapheme-Phoneme Translation
</SectionTitle>
      <Paragraph position="0"> both anglophone and francophone countries, these patterns of immigration have been sufficient to make this a serious problem for any automatic phoneticization algorithm.</Paragraph>
      <Paragraph position="1"> The rules for proper names can generally be derived from the rules for words.</Paragraph>
      <Paragraph position="2"> Nevertheless, a large superset of rules has to be added to obtain very high accuracy since the phonotactics change from language to language. Moreover, to compound the problem, the pronunciation of proper names outside of the foreign speech community is often different from their original pronunciation. For example, in the US, the final e of Italian names (pronounced \[e\] in Italian) is typically pronounced \[iː\] or is not pronounced at all. The proper name Falcone is pronounced in anglophone countries as either \[fælkɑniː\] or even \[fælkɑn\], and Bach as either \[bax\] or \[bak\]. In French, we observe a similar situation, where the name Smith is pronounced \[smis\] and Thatcher \[satʃœr\], as French does not have a \[θ\] phoneme.</Paragraph>
      <Paragraph position="3"> There have been successful attempts to automatically detect the ethnic group of a proper name for use in anglophone countries like the United States, and to apply a different set of rules depending on that group (Church 1985, Vitale 1991). Trigram frequencies are computed from a large set of proper names whose ethnic group is known, and used to classify a new proper name in terms of some language, language group, or language family (the linguistic etymology of the name). Depending on that classification, different subsets of language-specific rules can be activated.</Paragraph>
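      <Paragraph>The trigram-based classification just described can be illustrated with a toy example. The code below is a minimal sketch, not the method of Church (1985) or Vitale (1991): the padding symbol, the add-one smoothing, the function names, and the eight training names are all assumptions chosen for illustration, and a usable classifier would need a large labeled name list.

<![CDATA[
```python
from collections import Counter
import math


def trigrams(name):
    """Letter trigrams of a name, padded with '#' to mark the word edges."""
    padded = "##" + name.lower() + "#"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def train(names_by_group):
    """Count trigram frequencies separately for each group of names."""
    return {group: Counter(t for n in names for t in trigrams(n))
            for group, names in names_by_group.items()}


def classify(name, models):
    """Pick the group whose trigram counts give the name the highest log-likelihood."""
    def score(counts):
        total = sum(counts.values())
        vocab = len(counts) + 1
        # Add-one smoothing so unseen trigrams do not zero out the score.
        return sum(math.log((counts[t] + 1) / (total + vocab))
                   for t in trigrams(name))
    return max(models, key=lambda group: score(models[group]))


# Toy training data (illustrative only).
MODELS = train({
    "italian": ["falcone", "rossini", "bellini", "marconi"],
    "german": ["bach", "schmidt", "schubert", "reinhardt"],
})
```
]]>

With this toy data, classify("puccini", MODELS) selects the Italian group and classify("schumann", MODELS) the German one; the chosen group would then activate the matching subset of language-specific rules.</Paragraph>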
    </Section>
  </Section>
  <Section position="4" start_page="502" end_page="502" type="metho">
    <SectionTitle>
4. Expert Systems
</SectionTitle>
    <Paragraph position="0"> Expert systems are used to facilitate the transfer of the knowledge of a specific domain from an expert to a computer. They traditionally distinguish between the system, which is as independent as possible from the application, and the expert rules, which are application dependent. The system requires a computer specialist; the rules require an expert in the domain to be processed, in this case, a linguist. Everybody is an "expert" in reading his or her own language, and the average educated individual does not hesitate in front of a word like monsieur or second in French, or hiccough or Edinburgh in English, even though the pronunciation may be quite different from the spelling. In any case, we apply, albeit unconsciously, rules to read text aloud.</Paragraph>
    <Paragraph position="1"> Considering the complexity of the problems presented above, it was quickly understood that letter-to-sound rules had to be treated like an expert system with a rule set developed by an expert (a linguist) and an interpreter to interpret the rules. This is a pragmatic approach based on failures of systems that use hard-coded rules that the linguist would be forced to program or the programmer would be forced to articulate.</Paragraph>
  </Section>
  <Section position="5" start_page="502" end_page="508" type="metho">
    <SectionTitle>
5. The English Rule Set
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="502" end_page="504" type="sub_section">
      <SectionTitle>
5.1 The Rule Formalism for English
</SectionTitle>
      <Paragraph position="0"> Essentially, a letter-to-sound rule can be viewed as similar to a phonological rule in classical phonology except that it converts a grapheme string to a phoneme string.</Paragraph>
      <Paragraph position="1"> These rules may be context-sensitive or context-free. A lexical entry in a dictionary (without syntactic and semantic information) is, in essence, a context-free letter-to-sound rule.</Paragraph>
      <Paragraph position="2"> An efficient rule set had to be developed. This rule set had to be: rigorous (have a minimum of ordering constraints, such that new rules could be added at random with a minimum of liability); complete, with a large number of rules covering large sequences, including morphs both free and bound; and optimally parsed in order to make use of morphological information relevant to allophonic variation as well as to stress.</Paragraph>
      <Paragraph position="3"> Using these criteria as a working basis, we developed a set of highly accurate letter-to-sound rules.</Paragraph>
      <Paragraph position="4"> In English, the scan is done right to left to strip the suffixes of a word in sequence, as shown in Example 4 below. The input is a string of graphemes, the output a string of phonemes (and occasionally the allophones themselves). There is only one scan. The rules themselves are stated in terms a linguist would be familiar with, such as the following: X → \[y\] / W _ Z;</Paragraph>
      <Paragraph position="6"> where X, W, and Z are grapheme sequences and \[y\] is a phoneme (or phone) sequence.</Paragraph>
      <Paragraph position="7"> A two-tiered architecture (compiler and interpreter) has been designed to easily define and modify the rule set in our implementation of grapheme-to-phoneme rules.</Paragraph>
      <Paragraph position="8"> The rule compiler transforms the external form of the rules into an internal form that can be easily used by the rule interpreter. The grapheme pattern is encoded as a simple text string. The left and right context patterns are encoded as strings of operators and parameters for a pattern-matching procedure, and the replacement phoneme string is encoded using the system's internal phoneme codes (see footnote 7). The grapheme pattern and the left context pattern are reversed by the rule compiler (that is, stored in right-to-left order) so that they are stored in the direction in which they are actually used.</Paragraph>
      <Paragraph position="9"> The rule compiler does not perform any sophisticated checking of the rules; it does not check that the rule set is complete, nor does it check that long rules are always presented before short rules.</Paragraph>
      <Paragraph position="10"> The rule interpreter begins processing a word by setting its current position to the rightmost grapheme. It then searches linearly through the rules, in the order they were written, until it finds a rule that matches at that current position. A rule matches if the grapheme string matches, the left context pattern matches (if present), and the right context string matches (if present). The grapheme string is matched using a simple right-to-left text compare, and the context strings are matched by a recursive procedure that interprets the pattern string built by the rule compiler. The phonemes for the rule are then placed in the output, the current position is advanced over the matched graphemes, and the process is repeated until a rule consumes the leftmost grapheme.</Paragraph>
      <Paragraph position="11"> Since the rule set contains an unconstrained rule for each grapheme, the matcher will always find a rule, and will always make progress. Matched graphemes are not deleted; the word is left intact, since "consumed" graphemes could be part of the right-hand context of some future rule. The phoneme string generated by the letter-to-sound rule interpreter is represented as a doubly linked list. This representation was chosen because subsequent processing (syllable marking, stress analysis, and final allophone adjustment) needs to be able to scan the phoneme string in both directions, and needs to be able to add and delete phonemes at arbitrary places.</Paragraph>
      <Paragraph position="12"> It would, of course, be possible to use more elaborate string-matching techniques to increase the speed of rule selection, but this was not done in our system because letter-to-sound processing never uses a significant fraction of the total processing time. Footnote 7: The right-to-left match has already been described. It should be pointed out that the "text" in "text string" was not ASCII, but an encoded alphabet in which some grapheme pairs, like qu, gu and certain others, were encoded as single letters, because doing so made it unnecessary to have a large number of blocking rules in the rules for the grapheme u.</Paragraph>
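      <Paragraph>The interpreter loop described in this section (start at the rightmost grapheme, search the rules linearly in written order, emit phonemes, move left) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: contexts are reduced to sets of single adjacent letters instead of compiled pattern strings, a plain Python list stands in for the doubly linked list, and the rule table and phoneme labels are invented placeholders.

<![CDATA[
```python
# Each rule: (grapheme string, left context, right context, phonemes).
# A context is None (unconstrained) or a set of letters allowed next to the match.
RULES = [
    ("c", None, {"a", "o"}, ["k"]),   # c -> [k] before a or o
    ("c", None, None, ["s"]),         # c -> [s] elsewhere (unconstrained default)
    ("a", None, None, ["ae"]),
    ("b", None, None, ["b"]),
    ("o", None, None, ["aa"]),
    ("t", None, None, ["t"]),
]


def matches(word, end, rule):
    """True if the rule's grapheme string ends at position end and its contexts hold."""
    graph, left, right, _ = rule
    start = end - len(graph)
    if start < 0 or word[start:end] != graph:
        return False
    if left is not None and (start == 0 or word[start - 1] not in left):
        return False
    if right is not None and (end == len(word) or word[end] not in right):
        return False
    return True


def letter_to_sound(word):
    phonemes = []
    end = len(word)                     # current position: just past the rightmost grapheme
    while end > 0:
        for rule in RULES:              # linear search, in the order written
            if matches(word, end, rule):
                phonemes[:0] = rule[3]  # prepend, since the scan moves right to left
                end -= len(rule[0])     # the word itself is left intact
                break
        else:
            raise ValueError("no rule for " + word[end - 1])
    return phonemes
```
]]>

Because the word is never modified, a grapheme that has already been consumed can still serve as right-hand context for a rule applied further to the left, as the text requires.</Paragraph>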
    </Section>
    <Section position="2" start_page="504" end_page="505" type="sub_section">
      <SectionTitle>
5.2 Examples of Rules for English
</SectionTitle>
      <Paragraph position="0"> Example 1 The following is an example of a set of two letter-to-sound rules for the letter c in English. The first is context-sensitive and the second context-free: c → \[k\] / _ {a,o}; c → \[s\];</Paragraph>
      <Paragraph position="2"> This set reads as follows: The grapheme c is realized phonemically as \[k\] if occurring immediately before the grapheme a or o, as in cab, cake, decal; it is realized as \[s\] elsewhere: cease, cigar. 9 Example 2 Such rules, of course, handle only those forms that constitute the set of assimilated or partially assimilated loanwords. In the case of the English rules above, words such as call, cell, cilia, cool would be handled, as well as cure, cute (assuming that palatalization issues are handled by another rule). They do not account for words such as cello \['tʃɛloʊ\] or concerto \[kən'tʃɛrtoʊ\], because these are unassimilated borrowings that still show the original Italian palatalization rule: c → \[tʃ\] / _ {i,e}; When we have a rule that handles n words, where n is between 1 and some small number, say fewer than 7, we generally put these forms in a dictionary instead of spending computation on a rule for such a small number of words. Similarly, even if a rule to convert e to \[ɑ\] (to handle words such as entree \['ɑntreɪ\], entente, or entourage) could be written, it would be much easier and more efficient to put the words it affects in a dictionary, because there are so few of them.</Paragraph>
      <Paragraph position="3"> Example 3 ation → \[1\]\[eɪ\] = \[0\] \[ʃ\] \[ə\] \[n\] / _ +; indicates that the string ation at the end of the word (morpheme boundary) is replaced by the phoneme string, which includes a syllable boundary = and a mark \[0\] of unstressed syllable for \[ʃən\], as, for instance, in aggravation.</Paragraph>
      <Paragraph position="4"> Example 4 This example shows the decomposition of words into their constituent morphs in such a way as to "undo" the mutations caused by suffixes. In some cases, the input string is modified to add a morpheme boundary, or to replace the suffix. With the word finishing, a context-sensitive rule in ing would, for instance, produce the phonemes for ing plus a mark \[0\] indicating that the syllable is unstressed, add a morpheme boundary mark (+) in the input string, which is then finish+ing, and continue the conversion from right to left starting on h of finish.</Paragraph>
      <Paragraph position="6"> With the word riding, a context-sensitive rule in ing would produce the phonemes for ing plus a mark indicating that the syllable is unstressed, replace the suffix ing by e+ in the input string, which is then ride+, and continue the conversion from right to left starting on e.</Paragraph>
      <Paragraph position="7"> ing&gt;e+ → ... / _ +; With the word relationship, the following rule decomposes the word into relation + ship: ship&gt;+ → ... / _ +; The word scandalousness is decomposed into scandal + ous + ness by the following rules:</Paragraph>
      <Paragraph position="9"> This suffix stripping is the main reason for a right-to-left scan for English (Allen 1976).</Paragraph>
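      <Paragraph>The right-to-left decomposition of Example 4 can be imitated with a short suffix-stripping loop. Note that this sketch swaps in a different mechanism from the one in the text: the paper uses context-sensitive rules to decide between adding a boundary and restoring a deleted e, whereas the code below consults a small root list. The root list, the suffix list, and the function name decompose are hypothetical.

<![CDATA[
```python
ROOTS = {"finish", "ride", "relation", "scandal"}   # hypothetical root list
SUFFIXES = ["ness", "ship", "ous", "ing"]           # hypothetical suffix list


def decompose(word):
    """Strip suffixes right to left; return the root followed by its suffixes."""
    morphs = []
    while word not in ROOTS:
        for suffix in SUFFIXES:
            stem = word[: len(word) - len(suffix)]
            if word.endswith(suffix) and stem:
                morphs.insert(0, suffix)
                # Undo e-deletion (riding -> ride + ing) when only stem + "e" is a known root.
                if stem not in ROOTS and stem + "e" in ROOTS:
                    word = stem + "e"
                else:
                    word = stem
                break
        else:
            break   # no suffix applies: treat what is left as the root
    return [word] + morphs
```
]]>

On the words used in the text this gives finish + ing, ride + ing, relation + ship, and scandal + ous + ness. Words outside the toy root list can be split wrongly, which is one reason the actual system relies on rule contexts.</Paragraph>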
      <Paragraph position="10"> Example 5 o → \[ɑ | oʊ\] / micr _; means that o is translated as \[ɑ\] if the syllable is stressed (micrometer), and as \[oʊ\] otherwise (microgram). (See Section 5.7 for stress assignment.)</Paragraph>
    </Section>
    <Section position="3" start_page="505" end_page="506" type="sub_section">
      <SectionTitle>
5.3 Normalization for English
</SectionTitle>
      <Paragraph position="0"> Text normalization, i.e., replacing numbers, abbreviations, and acronyms by their full text equivalents, is done in a preprocessing section. In English, the choice between expansion to the full graphemic equivalent or expansion to a full phonetic equivalent was made in favor of the latter.</Paragraph>
      <Paragraph position="1"> English contains a separate preprocessing section for numbers (24 as twenty-four), acronyms (IBM, FBI), or abbreviations (Pr. for Professor, $ for dollar(s)). Some of these examples can become quite complex: $50 is retranscribed (or phoneticized) as fifty dollars; $50.60 as fifty dollars and sixty cents; $50 million as fifty million dollars; $50.2 million as fifty point two million dollars; and so on.</Paragraph>
      <Paragraph position="2">  Some characters may or may not be pronounced depending on the application (punctuation spelling, for instance). Number agreement must also be handled: 1 kg is a singular one kilogram, but 5 kg is plural five kilograms. Similarly, Dr. may be doctor or drive and St. may be street or saint, depending upon the context. We disambiguate and expand all such abbreviations in a separate module that bypasses letter-to-sound. There are switches that can be set, for example to turn all punctuation off, to turn it all on, or to select normal pronunciation, where very few punctuation marks need to be pronounced. Any of these approaches works. The advantage of a separate text preprocessing module is that it does not clutter up the letter-to-sound rules. It can be optional, removed, or replaced as necessary depending on the application.</Paragraph>
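      <Paragraph>The dollar examples given in this section can be reproduced with a small expansion routine. The sketch below is illustrative only and makes several assumptions: it expands to ordinary spelling rather than to the phonetic form the English system actually produces, it handles whole amounts from 0 to 99 only, and the names number_words and expand_dollars are invented here.

<![CDATA[
```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_words(n):
    """Spell out an integer from 0 to 99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] + ("-" + ONES[ones] if ones else "")


def expand_dollars(text):
    """Expand strings such as '$50', '$50.60', '$50 million', '$50.2 million'."""
    m = re.fullmatch(r"\$(\d{1,2})(?:\.(\d+))?( million)?", text)
    if m is None:
        return text                      # not a pattern this sketch covers
    whole, frac, scale = m.groups()
    words = number_words(int(whole))
    if scale:
        if frac:
            # Digits after the point are read one by one: 50.2 -> fifty point two.
            words += " point " + " ".join(ONES[int(d)] for d in frac)
        return words + " million dollars"
    unit = "dollar" if int(whole) == 1 else "dollars"
    if frac:
        if len(frac) != 2:
            return text                  # cents need exactly two digits
        return words + " " + unit + " and " + number_words(int(frac)) + " cents"
    return words + " " + unit
```
]]>

The four amounts quoted in the text expand as described there; anything outside the covered patterns is returned unchanged so that a later stage can deal with it.</Paragraph>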
    </Section>
    <Section position="4" start_page="506" end_page="506" type="sub_section">
      <SectionTitle>
5.4 Homographs for English
</SectionTitle>
      <Paragraph position="0"> In English, homographs represent a common problem that cannot be solved entirely by letter-to-sound rules. The problem has traditionally been avoided by defaulting to one member of the pair based on blind form class selection (default to the noun), which, of course, is less than adequate. For example, in grapheme strings such as refuse and produce, the default to noun would be to \['refjuːs\] and \['prodjuːs\], which, in unrestricted text, are less frequent than the verb forms.</Paragraph>
      <Paragraph position="1"> Later solutions in our system involved a default to the member with the higher frequency of occurrence. For example, using the same words, the default would be to \[rɪ'fjuːz\] and \[prə'djuːs\] rather than to \['refjuːs\] and \['prodjuːs\].</Paragraph>
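      <Paragraph>The frequency-based default can be pictured as a table lookup that falls back to the most frequent reading when no part of speech is supplied. The sketch below is hypothetical throughout: the table layout, the ASCII pronunciation strings, and in particular the frequency figures are placeholders chosen only so that the verb reading wins, as the text reports for unrestricted text.

<![CDATA[
```python
# Hypothetical table: word -> {part of speech: (pronunciation, relative frequency)}.
# The frequencies are illustrative placeholders, not measured values.
HOMOGRAPHS = {
    "refuse": {"noun": ("'refju:s", 0.2), "verb": ("rI'fju:z", 0.8)},
    "produce": {"noun": ("'prodju:s", 0.3), "verb": ("pr@'dju:s", 0.7)},
}


def pronounce(word, pos=None):
    """Use the reading for the given part of speech, else the most frequent one."""
    readings = HOMOGRAPHS[word]
    if pos in readings:
        return readings[pos][0]
    return max(readings.values(), key=lambda reading: reading[1])[0]
```
]]>

A tagger or parser, when available, would supply pos; without it, pronounce("refuse") falls back to the verb reading.</Paragraph>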
    </Section>
    <Section position="5" start_page="506" end_page="507" type="sub_section">
      <SectionTitle>
5.5 Morphophonemics
</SectionTitle>
      <Paragraph position="0"> There are several rules for phonemic tuning, especially to account for morphophonemic alternations, which are extremely important. For example, there are a number of essential morphophonemic rules in English that perform various tasks, such as plural and past tense formation. These rules are very well known among linguists and need to be formalized in the same way as were the grapheme-to-phoneme rules. This time, however, we are always going from a morphophonemic to a phonetic realization, as in: {x} → \[y\] / \[w\] _ \[z\]; where {x} is an archiphoneme or abstract morphophoneme, \[y\] is some phonetic sequence, and \[w\] and \[z\] are some environment E, where E is either phonemic or phonetic. For example, the following are two well-known rules that implement the phonetic realizations for \[plural\] and \[past\] (footnote 10: {z} and {d} are abstract base forms that are replaced by appropriate phones). After conversion, we have for roses the following phoneme string: \[r\]\[oʊ\]\[z\]+\[z\]. The rule {z} → \[ɨ\]\[z\] / \[+Cons, +Sib\] + _ #; applies for the second \[z\], which is preceded by + (morpheme boundary) and by a sibilant consonant (\[z\]).</Paragraph>
      <Paragraph position="1"> After conversion, we have for cats the following phoneme string: \[k\]\[æ\]\[t\]+\[z\]. The rule {z} → \[s\] / \[+Cons, -Voice\] + _ #; applies for \[z\], which is preceded by + and by an unvoiced consonant (\[t\]). After conversion, we have for spotted the following phoneme string: \[s\]\[p\]\[ɑ\]\[t\]\[t\]+\[d\]. The rule {d} → \[ɨ\]\[d\] / {\[t\], \[d\]} + _ #; applies for \[d\], which is preceded by + and by \[t\].</Paragraph>
      <Paragraph position="2"> After conversion, we have for walked the following phoneme string: \[w\]\[ɔː\]\[k\]+\[d\]. The rule {d} → \[t\] / \[+Cons, -Voice\] + _ #; applies for \[d\], which is preceded by + and by an unvoiced consonant.</Paragraph>
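      <Paragraph>The plural and past-tense rules above amount to a three-way choice conditioned on the phoneme before the morpheme boundary. The following sketch illustrates that choice under stated assumptions: the ASCII phoneme labels ("ix" for the reduced vowel), the membership of the two consonant classes, and the function name realize are invented for the example, and the elsewhere case (a voiced realization) is the standard textbook default rather than a rule quoted in this section.

<![CDATA[
```python
# Hypothetical ASCII phoneme labels; "ix" stands for the reduced vowel.
SIBILANTS = {"s", "z", "sh", "zh", "ch", "jh"}
VOICELESS = {"p", "t", "k", "f", "th", "s", "sh", "ch"}


def realize(phonemes):
    """Replace a final {z} or {d} after '+' by its phonetic realization."""
    if len(phonemes) < 3 or phonemes[-2] != "+":
        return phonemes
    prev, morph = phonemes[-3], phonemes[-1]
    if morph == "{z}":
        if prev in SIBILANTS:          # checked first: [s] is also voiceless
            out = ["ix", "z"]
        elif prev in VOICELESS:
            out = ["s"]
        else:
            out = ["z"]
    elif morph == "{d}":
        if prev in {"t", "d"}:
            out = ["ix", "d"]
        elif prev in VOICELESS:
            out = ["t"]
        else:
            out = ["d"]
    else:
        return phonemes
    return phonemes[:-1] + out
```
]]>

The sibilant test has to come before the voicelessness test, since a consonant such as s belongs to both classes; the same ordering applies to t and d in the past-tense branch.</Paragraph>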
    </Section>
    <Section position="6" start_page="507" end_page="507" type="sub_section">
      <SectionTitle>
5.6 Syllabification
</SectionTitle>
      <Paragraph position="0"> A scan of the phones, from right to left, marks the positions of the syllables according to consonant clusters, vowels, and morph boundaries.</Paragraph>
      <Paragraph position="1"> For instance, scandalousness, which has been processed by the previous steps as \[s\] \[k\] \[æ\] \[n\] \[d\] \[ə\] \[l\] + \[ə\] \[s\] + \[n\] \[ɨ\] \[s\], is decomposed into syllables as follows: \[s\] \[k\] \[æ\] \[n\] - \[d\] \[ə\] \[l\] + \[ə\] \[s\] + \[n\] \[ɨ\] \[s\]. Similarly, chevron would result in \[ʃ\] \[e\] \[v\] \[r\] \[ə\] \[n\] and would be decomposed as \[ʃ\] \[e\] \[v\] - \[r\] \[ə\] \[n\]. Although there are several different theories of syllabification, any standard linguistics book will have a reference to these valid clusters and an accurate definition of the syllable for a language L (Clements and Keyser 1983). It is beyond the scope of this paper to discuss the merits of one theory of the English syllable over another. Whatever theory is chosen, syllabification should serve as an accurate input into the module that handles stress. 11</Paragraph>
    </Section>
    <Section position="7" start_page="507" end_page="508" type="sub_section">
      <SectionTitle>
5.7 Stress
</SectionTitle>
      <Paragraph position="0"> The letter-to-sound rule set described above sets lexical stress in a wide variety of cases, especially where the word is monosyllabic or the suffixal information is sufficient to place primary or secondary stress.</Paragraph>
      <Paragraph position="1"> These routines contain special rules, which contain a number of different options:  and word mode (&amp;quot;speak word by word&amp;quot;). Such an interface can then be used in applications ranging from language pedagogy to the teaching of reading to individuals with learning disabilities.  place secondary stress n syllables to the left or right, assign \[-stress\] (not stressed) to a syllable, refuse stress.</Paragraph>
      <Paragraph position="2"> Example of letter-to-sound (morph) rules that would have already assigned primary stress:  The primary stress is one syllable \[S1left\] to the left of graphy, so \[~)1~\] is stressed and the phoneme is \['~\]. There are certain affixes in English that refuse to be assigned stress. For example, the prefix in- normally does not take \[1 stress\] except under contrastive stress, e.g., I said include, not preclude. A word is scanned left to right, and a flag is set on syllables that fall under the category of stress-refusers. It is possible that more than one contiguous syllable will refuse to take stress.</Paragraph>
      <Paragraph position="3"> Generic stress rules in this module assign primary stress if and only if \[1 stress\] has not yet been assigned. In this block, the word is scanned left to right, the number of syllables is counted, and pointers are stored in syllable-initial position in an array A. The number of syllables in the root form is counted and the syllable that forces the primary stress is marked as \[1 stress\].</Paragraph>
      <Paragraph position="4"> Primary stress (\[1 stress\]) is a requisite for all words except certain words already marked otherwise in the dictionary and noun compounds. If, at the end of these rules, \[1 stress\] still has not been placed on a word, a set of generic rules applies. First, the number of syllables in the root is noted, and a flag is set on the syllable that is the most likely default for the placement of \[1 stress\].</Paragraph>
    </Section>
    <Section position="8" start_page="508" end_page="508" type="sub_section">
      <SectionTitle>
5.8 Allophonics
</SectionTitle>
      <Paragraph position="0"> The allophonic pass applies a number of allophonic rules well known to those familiar with phonemic variation.</Paragraph>
      <Paragraph position="1"> The phoneme string is scanned left to right, performing such tasks as vowel reduction. This is done in a prepass, to ensure that each reduced vowel (\[ə\] or \[i\]) is accurately adjusted before the main body of the allophonic rules is run.</Paragraph>
      <Paragraph position="2"> The following are examples (a small subset) of (ordered) rules of the final allophonic pass: \[n\] → \[ŋ\] / - {\[k\],\[g\]}; pancake, previously transcribed \['pænkeɪk\], becomes \['pæŋkeɪk\]. \[s\]\[s\] → \[ʃ\] / - \[u'\]+; issue, previously transcribed \['ɪssu\], becomes \['ɪʃu\].</Paragraph>
      <Paragraph position="3"> Finally, one member of each geminate pair is deleted. Some special pairs, such as \[l\] and \[L\] (syllabic \[l\]), are reduced even if there is a morpheme boundary between them. Otherwise, these rules are often blocked if they cross a morpheme boundary: \[d\]\[d\] → \[d\]; this rule applies to adder, which is add+er, but does not apply to midday, which is mid+day.</Paragraph>
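The nasal rule and the geminate rule can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the phoneme string is assumed to be a list of symbols, "+" is assumed to mark a morph boundary, and the function names are hypothetical. The special treatment of \[l\] and \[L\] across a boundary is not covered.

```python
# Illustrative sketch only; names and data layout are assumptions.
BOUNDARY = "+"  # assumed marker for a morph boundary


def nasal_assimilation(phones):
    """Rewrite n as ŋ when the next symbol is k or g."""
    out = list(phones)
    for i in range(len(out) - 1):
        if out[i] == "n" and out[i + 1] in ("k", "g"):
            out[i] = "ŋ"
    return out


def delete_geminates(phones):
    """Drop the second of two identical adjacent symbols.

    A boundary marker standing between the two symbols keeps them
    apart, so the deletion is blocked across a morph boundary.
    """
    out = []
    for p in phones:
        if out and p != BOUNDARY and out[-1] == p:
            continue
        out.append(p)
    return out


pancake = nasal_assimilation(["p", "æ", "n", "k", "eɪ", "k"])
adder = delete_geminates(["æ", "d", "d", BOUNDARY, "ər"])
midday = delete_geminates(["m", "ɪ", "d", BOUNDARY, "d", "eɪ"])
```

In this sketch the two d symbols of add+er are adjacent and are reduced to one, while in mid+day the boundary marker separates them and the string is left unchanged.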
    </Section>
  </Section>
  <Section position="6" start_page="508" end_page="515" type="metho">
    <SectionTitle>
6. The French Rule Set
</SectionTitle>
    <Paragraph position="0"> For French, an ad hoc programming language has been designed to easily define and modify the rule set. Text normalization, i.e., replacing numbers, abbreviations, and acronyms, by their full text equivalents, and grapheme-to-phoneme transcription can be achieved using this formalism.</Paragraph>
    <Section position="1" start_page="508" end_page="511" type="sub_section">
      <SectionTitle>
6.1 The Rule Formalism for French
</SectionTitle>
      <Paragraph position="0"> must be declared, i.e., the grapheme codes (upper and lower case letters, numbers, punctuation, diacritics) and the phoneme codes. These codes may be composed of one or more characters. In this way, users can define their own code, and the formalism can be used for different languages. These basic input and output units, or elements, are expressed as ei, where i is some number.</Paragraph>
      <Paragraph position="1">  external characters: e1e2e3 is a string. A class is a set of strings having a common property. 'C1' and 'C2' are classes.</Paragraph>
      <Paragraph position="3"> block describes a process, taking the input text, processing it, and replacing it by the  end 6.1.4 Rules. The syntax of a rule is: (number): (ls) → (rs) / (lc) - (rc);  where number is the rule label; ls (left string) is the string to be replaced; rs (right string) is the string replacing ls; lc (left context) represents the strings to be found on the left side of ls; rc (right context) represents the strings to be found on the right of ls. lc and rc are formed with operands (characters, strings, classes) and operators (concatenation, logical or, negation).</Paragraph>
      <Paragraph position="4">  string. This process can be achieved using either one or two buffers. With one buffer, the rs string replaces the ls string, so the left context of a rule must be written according to the rules previously used. With two buffers, the writing of the left context of a rule is easier because the input string is only modified at the end of the block of rules. In effect, three contexts are usable: the left context and right context in the input buffer, and the left context in the output buffer. This left output context is written between angled brackets.</Paragraph>
      <Paragraph position="6"> The string AB or CD is replaced by FC, in the following contexts: * on the left of AB or CD, an H preceded by an element of the class 'C1', * on the right of AB or CD, either an element of 'C2' followed by a G, or an element of 'C3'. (Computational Linguistics Volume 23, Number 4)</Paragraph>
      <Paragraph position="7">  3: HH → A / Non(C, E, H) -; HH is replaced by A if the left context is not a C, an E, or an H. 4: CA → GE / &lt;G.'C3'&gt; H -; CA is replaced by GE if: * the left context of the output buffer (between angle brackets) is an element of 'C3' preceded by G, * and the left context of the input buffer is H. 6.1.7 Interpreting the Rules. The rule having the longest match between the set of all the ls strings of the block, and the string beginning with the next character to be  processed in the input text, is searched first. If both contexts are true, the rule applies; otherwise another rule is searched for, first any other rule with the same ls, and then in decreasing length of ls matches.</Paragraph>
      <Paragraph position="8"> Let us consider the following rules:</Paragraph>
      <Paragraph position="10"> and the input string, "ABCE", to be processed.</Paragraph>
      <Paragraph position="11"> The longest match between the left string (ls) of the rules in the block, and the input string to be processed is searched. In this case the longest match is "ABC". So, rule 52 is tested. If the contexts are true, the rule is applied, and the next character to process is "E" in the input string. If the context is false, the rules are tested in decreasing order of the longest match. Rules with "AB" as ls are tested in the order in which they are written (50, 53). Then if no rule has yet been applied, rule 51 is tested.</Paragraph>
      <Paragraph position="12"> If no rule is true, the first character A to process is copied into the output buffer, and the procedure starts again with the next character B. The order in which rules are tried is: 52, 50, 53, 51. The order in which the rules are written is significant only for those having the same ls.</Paragraph>
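The search order described above can be sketched in Python. This is a minimal sketch under stated assumptions rather than the interpreter of the formalism itself: the Rule record, the function names, the rs values, and the ls of rule 51 (taken here to be "A") are hypothetical, and the context test is passed in as a function because the contexts of rules 50 to 53 are not given. Only a single pass with one output buffer is modeled.

```python
from collections import namedtuple

# Hypothetical rule record: label, left string (ls), right string (rs).
Rule = namedtuple("Rule", ["label", "ls", "rs"])

# The ls values follow the discussion of "ABCE"; the ls of rule 51 and
# all rs values are assumptions made for the sake of the example.
RULES = [
    Rule(50, "AB", "x"),
    Rule(51, "A", "y"),
    Rule(52, "ABC", "z"),
    Rule(53, "AB", "w"),
]


def candidate_order(rules, text, pos):
    """Rules whose ls matches at pos, longest ls first.

    sorted() is stable, so rules with the same ls keep the order
    in which they were written.
    """
    matching = [r for r in rules if text.startswith(r.ls, pos)]
    return sorted(matching, key=lambda r: -len(r.ls))


def rewrite(rules, text, contexts_hold):
    """Apply the first candidate whose contexts hold; else copy one character."""
    out, pos = [], 0
    while len(text) > pos:
        for rule in candidate_order(rules, text, pos):
            if contexts_hold(rule, text, pos):
                out.append(rule.rs)
                pos += len(rule.ls)
                break
        else:
            out.append(text[pos])
            pos += 1
    return "".join(out)


order = [r.label for r in candidate_order(RULES, "ABCE", 0)]
```

With these assumed rules, order lists the labels 52, 50, 53, 51, matching the order given in the text. When no context holds, rewrite copies the input one character at a time.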
      <Paragraph position="13"> Using the formalism of the expert system, the expert is in charge of defining a set of rules to simulate his or her expertise.</Paragraph>
    </Section>
    <Section position="2" start_page="511" end_page="512" type="sub_section">
      <SectionTitle>
6.2 Examples of Rules for French
</SectionTitle>
      <Paragraph position="0"> As this paper is written in English but deals with the French language, and in the event that the reader might not be familiar with the idiosyncrasies of French, only a few examples will be given to explain the mechanism of the letter-to-sound rules for French.</Paragraph>
      <Paragraph position="1">  o is pronounced \[o\] in moto, loto, solo.</Paragraph>
      <Paragraph position="2"> oi is pronounced \[w\]\[a\] in moi, pois, lois.</Paragraph>
      <Paragraph position="3"> on is pronounced \[ɔ̃\] in bon, but not in abandonner, bonheur, or bonne, where the rule for o applies.</Paragraph>
      <Paragraph position="4"> oin is pronounced \[w\]\[ɛ̃\] in loin, poing but not in avoine, where the rule for oi applies.</Paragraph>
      <Paragraph position="5"> The rules could be written as shown below.</Paragraph>
      <Paragraph position="6">  Regardless of the order in which the rules are written, the rule with the longest match is tested first. Here, the order of the rules is irrelevant.</Paragraph>
    </Section>
    <Section position="3" start_page="512" end_page="513" type="sub_section">
      <SectionTitle>
Example 7
</SectionTitle>
      <Paragraph position="0"> er at the end of words is pronounced \[e\] as in chanter, danser but \[E\]\[r\] in super, joker, fer, or hier.</Paragraph>
      <Paragraph position="1"> The rules could be formulated as: 'Wer': sup, jok, f, hi/ 9: er → \[E\]\[r\] / _.'Wer' - _; er is pronounced \[E\]\[r\] if er is preceded by an element of 'Wer' (words ending in er) that is itself preceded by a space, and is followed by a space.</Paragraph>
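The effect of the class 'Wer' can be illustrated with a few lines of Python. This is a sketch only: the function name is hypothetical, the class holds just the four elements quoted above, and the word is assumed to arrive already isolated, so the surrounding spaces of the rule are implicit.

```python
# Contents of the class 'Wer' as quoted in the example; a real class
# would presumably be larger.
WER = {"sup", "jok", "f", "hi"}


def final_er(word):
    """Return the phoneme codes for a word-final er, or None if absent."""
    if not word.endswith("er"):
        return None
    stem = word[:-2]
    # The whole stem must be an element of 'Wer', which mirrors the
    # requirement that the element is itself preceded by a space.
    return "[E][r]" if stem in WER else "[e]"
```

Here final_er("super") selects \[E\]\[r\] because the stem sup belongs to the class, whereas final_er("chanter") falls back to \[e\].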
      <Paragraph position="2">  The ai string in French words like bienfaisant, contrefaisait, faisait, faisan, satisfaisant, etc., is pronounced \[ə\] but not in faisceau, chauffais where the corresponding phoneme is an \[E\].</Paragraph>
      <Paragraph position="3">  The rule can be written as: 'Vowels': a, e, i, o, u, y/ 11: fais → \[f\]\[ə\]\[z\] / - 'Vowels'; fais is pronounced \[fəz\] if fais is followed by an element of the class 'Vowels'. Example 9 In order to eliminate geminates, one possibility is to analyze the last character sent to the output buffer.</Paragraph>
      <Paragraph position="4"> 12: b → / &lt;\[b\]&gt; -; b is eliminated if the left context in the output buffer is already a phoneme \[b\]. (See Section 6.1.5 on using one or two buffers.)</Paragraph>
    </Section>
    <Section position="4" start_page="513" end_page="514" type="sub_section">
      <SectionTitle>
6.3 Normalization for French: from Graphemes to Graphemes
</SectionTitle>
      <Paragraph position="0"> The first step, done by a block of rules, is to normalize the text, replacing numbers, abbreviations, and acronyms by their full text equivalents. Both input and output are graphemes. Normalization is handled in the letter-to-sound rule set and in a preprocessing module. In the rules, the contexts indicate whether the replacement is required.</Paragraph>
      <Paragraph position="1"> Numbers. 123 is rewritten as cent vingt trois by a set of rules checking the left and right context for each digit.</Paragraph>
      <Paragraph position="2"> 'Digit' : 0,1, 2, 3,..., 9/ is the class for digits</Paragraph>
      <Paragraph position="4"> kg is replaced by kilos in 5kg or trois kg.</Paragraph>
      <Paragraph position="5"> Acronyms. Similar rules are used to spell acronyms (I.B.M. gives \[ibeɛm\]): 17: B. → bé / _, . -; B followed by a point is replaced by bé (spelled) if B is preceded by another point or a space. In I.B.M., or vitamine B., B is spelled correctly.</Paragraph>
      <Paragraph position="6"> Preprocessing procedures are also used in cases like $50, which gives cinquante dollars, and where $ and 50 have to be permuted.</Paragraph>
    </Section>
    <Section position="5" start_page="514" end_page="514" type="sub_section">
      <SectionTitle>
Divay and Vitale Grapheme-Phoneme Translation
6.4 Morphology
</SectionTitle>
      <Paragraph position="0"> The problem mentioned in Section 3.3 is solved most of the time using rules for French.</Paragraph>
      <Paragraph position="1"> For words like those in Section 3.3 (homosexuel, hétérosexuel, télésiège, entresol, tournesol), a class is defined with the prefixes ending with a vowel.</Paragraph>
      <Paragraph position="3"> s is pronounced \[s\] if preceded by an element of 'Prefix', and followed by an element of 'V' (a vowel) as in télésiège.</Paragraph>
      <Paragraph position="4"> 19: s → \[z\]; as in base, bise, anglaise, opposition.</Paragraph>
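The interplay of the prefix rule and rule 19 can be sketched as follows. The sketch rests on several assumptions that are not in the text: the members of the class 'Prefix' are guessed from the example words, rule 19 is given a vowel on both sides as context, a default of \[s\] is used elsewhere, and the function name is hypothetical.

```python
# Assumed class contents; the members are inferred from the example words.
PREFIXES = {"homo", "hétéro", "télé", "entre", "tourne"}
VOWELS = set("aeiouyàâéèêëîïôùû")


def phoneme_for_s(word, i):
    """Phoneme code for the letter s at index i of word."""
    before, after = word[:i], word[i + 1:i + 2]
    if before in PREFIXES and after in VOWELS:
        return "[s]"  # prefix rule: s stays voiceless after the prefix
    if before[-1:] in VOWELS and after in VOWELS:
        return "[z]"  # rule 19, with an assumed vowel on both sides
    return "[s]"      # assumed default for all other positions
```

Because the prefix test comes first, the s of télésiège is transcribed \[s\] even though it stands between two vowels, while the s of base is transcribed \[z\].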
    </Section>
    <Section position="6" start_page="514" end_page="514" type="sub_section">
      <SectionTitle>
6.5 Homograph Problem
</SectionTitle>
      <Paragraph position="0"> A limited parsing has been done using the same formalism as letter-to-sound. A dictionary lookup gives one or several grammatical categories for the most common words.</Paragraph>
      <Paragraph position="1"> By examining the left and right words, it is possible in most of the cases to get an idea of the grammatical categories of the unmarked words or to reduce (to one if possible) the set of potential grammatical categories for each word of a sentence. The same formalism allows the processing of grammatical categories (verb, adverb, preposition, etc.) instead of characters for transcription. A class is a set of grammatical categories (Divay 1984, 1985).</Paragraph>
      <Paragraph position="2"> If the grammatical category is known (where V = Verb), it can be used in the rules: 20: ent(V) → / - _; ent is eliminated at the end of a word (right context is a space) if the word is a verb (ils chantent).</Paragraph>
    </Section>
    <Section position="7" start_page="514" end_page="514" type="sub_section">
      <SectionTitle>
6.6 Linking
</SectionTitle>
      <Paragraph position="0"> In some cases, a new phoneme is added between two words of the same breath group.</Paragraph>
      <Paragraph position="1"> For instance, a \[z\] phoneme is added between the two words of les enfants. The second word has to begin with a vowel or mute h, and the first one to end with n, s, d, t, x, or z. It also depends on the grammatical category of both words. This problem is also solved by rules, such as the following: 21: _ → _\[z\] / _les, _des, _ses, _nous - 'Vowels'; The space between two words is replaced by a space and a phoneme \[z\] if the space is preceded by les or des, etc., and followed by a vowel, as in les enfants.</Paragraph>
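Rule 21 can be mimicked in a few lines of Python. This is a rough sketch, not the rule interpreter: the function name is hypothetical, the left context is limited to the four words quoted in the rule, and neither grammatical categories nor words beginning with h are handled.

```python
# Words that trigger the [z] link in rule 21; a real class would be larger.
LINKING_WORDS = {"les", "des", "ses", "nous"}
VOWELS = set("aeiouyàâéèêëîïôùû")


def add_z_liaison(text):
    """Replace a space by a space plus [z] between a linking word and a vowel."""
    words = text.split()
    if not words:
        return ""
    result = words[0]
    for prev, nxt in zip(words, words[1:]):
        link = prev.lower() in LINKING_WORDS and nxt[0].lower() in VOWELS
        result += (" [z]" if link else " ") + nxt
    return result
```

For the input les enfants the sketch yields "les \[z\]enfants", that is, the space is kept and the linking phoneme is placed in front of the second word, as in the rule.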
      <Paragraph position="2"> If the left context is very large, a new class can be created, and used as left context.</Paragraph>
    </Section>
    <Section position="8" start_page="514" end_page="515" type="sub_section">
      <SectionTitle>
6.7 Elision: From Phonemes to Phonemes
</SectionTitle>
      <Paragraph position="0"> Some rules, mostly rules dealing with mute e and semivowels, can be more easily expressed on phoneme strings. This is a new block of rules run after the grapheme-phoneme conversion.</Paragraph>
      <Paragraph position="2"> Mute e is eliminated before a vowel phoneme and after a consonant phoneme followed by a vowel phoneme as in emploiera \[ɑ̃plwaəra\], which becomes \[ɑ̃plwara\].</Paragraph>
      <Paragraph position="3"> Elision often occurs at the end of words (petite), or in the middle of words (emploiera, tellement). It can be done in the first syllable (pesanteur, retard, teneur) except if there are two consonants as in premier. It is never done if suppressing \[ə\] would result in three or more consecutive consonants.</Paragraph>
    </Section>
  </Section>
  <Section position="7" start_page="515" end_page="517" type="metho">
    <SectionTitle>
7. Testing
</SectionTitle>
    <Paragraph position="0"> No standardized tests exist for evaluating letter-to-sound systems, although some researchers are beginning to look at the problem in order to determine whether one approach has merit over another (Golding and Rosenbloom 1993). For example, the Oregon Graduate Institute is currently investigating letter-to-sound rules done in more traditional ways and comparing them to neural network learning.</Paragraph>
    <Paragraph position="1"> Tests can be done:  1. with or without an exception dictionary lookup running before the rules; 2. on text extracted from papers, books, or magazines, in which case the same word is counted as many times as it appears in the text. This is especially true for linking words (one, a, the, is, etc.), which are then counted many times. A more systematic test can be carried out using an electronic dictionary that gives, for each entry (grapheme string), the corresponding phoneme string. In that case, every word is tested and counted once, even though its frequency of occurrence might be very low; 3. in terms of the percentage of phonemes or of words correctly transcribed. The percentage of phonemes is obviously higher than the percentage of words.</Paragraph>
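The difference between these ways of counting can be made concrete with a small Python sketch. It is an illustration under assumptions, not the evaluation procedure used in this study: the function names and the toy data are hypothetical, and phonemes are aligned approximately with difflib.SequenceMatcher from the Python standard library.

```python
from difflib import SequenceMatcher


def matched_phonemes(predicted, reference):
    """Approximate number of reference phonemes reproduced in order."""
    matcher = SequenceMatcher(None, predicted, reference, autojunk=False)
    return sum(block.size for block in matcher.get_matching_blocks())


def accuracy(items):
    """items: (word, predicted, reference) triples. Returns (word %, phoneme %)."""
    words_ok = sum(1 for _, p, r in items if p == r)
    phones_ok = sum(matched_phonemes(p, r) for _, p, r in items)
    phones_all = sum(len(r) for _, _, r in items)
    return 100 * words_ok / len(items), 100 * phones_ok / phones_all


def one_per_word(items):
    """Keep the first occurrence of each word (dictionary-style test)."""
    seen = {}
    for item in items:
        seen.setdefault(item[0], item)
    return list(seen.values())


THE = ("ð", "ə")
running_text = [("the", THE, THE)] * 3 + [
    ("pancake", ("p", "æ", "n", "k", "eɪ", "k"), ("p", "æ", "ŋ", "k", "eɪ", "k")),
]
by_token = accuracy(running_text)
by_type = accuracy(one_per_word(running_text))
```

In this toy corpus the frequent word the is transcribed correctly three times, so the running-text word score (75%) is higher than the dictionary-style score (50%), and in both cases the phoneme score is higher than the word score.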
    <Section position="1" start_page="515" end_page="516" type="sub_section">
      <SectionTitle>
7.1 English Analysis
</SectionTitle>
      <Paragraph position="0"> The rule set for English consists of about 1,500 rules containing morphs as well as nonsemantic grapheme strings. An exception dictionary has been defined for words not correctly translated by these rules. These consist mostly of functors, abbreviations, homographs, and unassimilated loanwords such as adobe, bayou, cello, coyote, and the like. In addition, the lexical entry need not contain phonetics, especially if the entry in question is adequately handled by rule. It may, however, still be used to convey both syntactic and semantic information that would then serve as input to a parser for more accurate prosodic rules.</Paragraph>
      <Paragraph position="1"> In this study, we took two different corpora: (1) a 1,676-word corpus originally used by Bill Huggins (BB&amp;N) and eventually by Dennis Klatt (MIT). This corpus was chosen because it consists of complex polysyllabic forms; (2) a sample taken from the Brown corpus (19,837 words), which we felt to be sizable enough and representative enough to use to examine letter-to-sound accuracy.</Paragraph>
    </Section>
    <Section position="2" start_page="516" end_page="516" type="sub_section">
      <SectionTitle>
</SectionTitle>
      <Paragraph position="0"> On the Huggins corpus, without the use of the exceptions dictionary, our rule set transcribed 94.9% of words correctly. The 5.1% of errors consisted mainly of incorrect morphological analysis and consequent inaccuracies in lexical stress placement.</Paragraph>
      <Paragraph position="1"> On the Brown corpus, we had a large number of dictionary hits, which was not unexpected since the corpus contains many high-frequency forms. Out of a total word count of 19,837 words, the dictionary hit count was 7,337 (36.99%); the rules matched 5,432 words (27.38%) for a total word match of 12,769 or 64.37%. Of the words missed, 3,905 (19.69%) missed by only one segmental phoneme or phone and 3,636 (18.33%) had incorrect stress placement. We consider incorrect stress placement to be a more serious error than one incorrect segmental phoneme.</Paragraph>
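The percentages quoted for the Brown corpus sample can be rechecked from the raw counts with a few lines of Python. The variable names are hypothetical. The last line rests on an assumption that the text does not state, namely that both error counts refer only to the words that were missed; under that reading the two counts exceed the number of missed words, so some words would show both kinds of error.

```python
# Figures quoted in the paragraph above.
TOTAL_WORDS = 19837
DICTIONARY_HITS = 7337
RULE_HITS = 5432
ONE_PHONEME_OFF = 3905
WRONG_STRESS = 3636


def percent(count, total=TOTAL_WORDS):
    """Share of the corpus, rounded to two decimals as in the text."""
    return round(100 * count / total, 2)


matched = DICTIONARY_HITS + RULE_HITS  # words handled correctly
missed = TOTAL_WORDS - matched         # words with at least one error

# Assumption: both error counts are subsets of the missed words.
overlap_at_least = ONE_PHONEME_OFF + WRONG_STRESS - missed
```

The rounded values agree with the figures in the text (36.99%, 27.38%, 64.37%, 19.69%, and 18.33%).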
      <Paragraph position="2"> The latest version is used in a range of products, including text-to-speech synthesizers (both hardware and software), assistive devices, and games, and will soon be used in proper name retrieval, both on computer systems and over the telephone. Using the same formalism, a different set of rules has been defined for proper names found in a typical telephone book in the US and could be extended to other languages.</Paragraph>
    </Section>
    <Section position="3" start_page="516" end_page="517" type="sub_section">
      <SectionTitle>
7.2 French Analysis
</SectionTitle>
      <Paragraph position="0"> The set of rules for French consists of about 600 rules and 100 classes. Some of these classes contain 100 or more elements. The French letter-to-sound rule set was tested on the 55,000 unique word Le Petit Robert dictionary, and the 100,000 word Le Grand Robert de la Langue Française dictionary. An exception dictionary is automatically defined for words not correctly translated by these 600 rules.</Paragraph>
      <Paragraph position="1"> The execution of the set of rules on the 55,000 unique word dictionary gives 4.4% of words whose pronunciation is different from the dictionary Le Petit Robert but is acceptable from the authors' point of view. These differences are only due to a mismatch between open or closed phonemes for phonemes \[a\], \[e\] and \[o\].</Paragraph>
      <Paragraph position="2"> The distinction between the open \[a\] and closed \[ɑ\] has almost disappeared in France in favor of the open \[a\]. The proposed pronunciation varies even from one dictionary to another. Words like accablant, phase, câble, vase, and trois have different pronunciations depending on the dictionary used. Sometimes, both are mentioned.</Paragraph>
      <Paragraph position="3"> There are even differences between the Le Petit Robert and Le Grand Robert dictionaries.</Paragraph>
      <Paragraph position="4"> Both open \[ɔ\] and closed \[o\] are also acceptable in many words, e.g., automobile, aérodrome, augmenter, autonome, austral, ozone. Nevertheless, for some words, the distinction has to be made (bol \[bɔl\] vs. rose \[roz\]). The closed phoneme is used, for instance, before a phoneme \[z\]: pose, chose, oser, or at the end of a word: abricot, escargot.</Paragraph>
      <Paragraph position="5"> The closed \[e\] and open \[ɛ\] are also very much interchangeable in many words (les, baisser, adolescent, essai, agressif, blessant, intéressant, aigri, biennal, accession). Of the 55,000 words, 2.8% are incorrectly processed (1,500 out of 55,000), and have to be added to an exception dictionary. Some words have several acceptable pronunciations (août \[aut, ut\], ananas \[anana, ananas\], dompter \[dɔ̃mpte, dɔ̃te\], bat, babil, blet, chenil, exact, but, as), but only one is stored in the electronic dictionary. Some problems also result from a different but acceptable elision of mute e, as in chemin de fer, briqueterie, petit-neveu, amenuiser, point de vue, porte-bébé, redevenir. But most of the errors come from foreign words such as: accelerando, adagio, allegro, artefact, posteriori, mea culpa, beluga, placebo, torero, baby, girl, shirt, blue-jeans, base-ball, steward, business, building, copyright, bonsaï.</Paragraph>
      <Paragraph position="6"> The number of applications of each of the 600 rules has been calculated on the 55,000 words to give an indication of its weight.</Paragraph>
      <Paragraph position="7"> This program is currently in use in different laboratories in France, Canada (O'Shaugnessy et al. 1981) and the United States (DEC) as the first level in speech synthesis for French. It has been used by various companies producing electronic board speech synthesizers for French.</Paragraph>
      <Paragraph position="8"> This transcription program has also been used to create a phonetic index and retrieve a word without knowing how to write it. The word is converted to phonetics and searched for in the phonetic dictionary index (used in both CD-ROM dictionaries Le Grand Robert and Le Petit Robert) (Rey et al. 1989). For information retrieval, open and closed phonemes are always considered identical. The same mechanism (using phonetics) is used to retrieve a proper name (without knowing how to spell it) through the 30,000 proper names of the phone book of the city of Dakar (Senegal). The system is also used in the Taurus multimedia database software (from DCI: Data Concept Informatique) to create an index on one field of a structure defined by the user of the database, and to retrieve the corresponding information even if it is misspelled.</Paragraph>
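The idea of a phonetic index in which open and closed phonemes are considered identical can be sketched in Python. This is an illustration only, not the indexing scheme of the products mentioned: the function names, the choice of symbols to fold, and the toy entries are assumptions, and in the system described the transcriptions would come from the rule set rather than being typed in by hand.

```python
# Assumed folding of open/closed vowel pairs; the symbols are illustrative.
FOLD = str.maketrans({"ɔ": "o", "ɛ": "e", "ɑ": "a"})


def phonetic_key(transcription):
    """Map a transcription to a key that ignores the open/closed contrast."""
    return transcription.translate(FOLD)


def build_index(entries):
    """entries: (spelling, transcription) pairs. Returns key -> spellings."""
    index = {}
    for spelling, transcription in entries:
        index.setdefault(phonetic_key(transcription), []).append(spelling)
    return index


def lookup(index, transcription):
    """Spellings whose folded transcription equals that of the query."""
    return index.get(phonetic_key(transcription), [])


# Toy entries with hand-written transcriptions.
INDEX = build_index(
    [("vert", "vɛr"), ("verre", "vɛr"), ("bol", "bɔl"), ("rose", "roz")]
)
```

A query transcribed with a closed vowel, such as "ver", then retrieves both vert and verre, which is how a word can be found without knowing its spelling.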
      <Paragraph position="9"> Other similar uses are under investigation for the pronunciation of names from on-line telephone books in particular and telecommunications applications in general (Alcatel TITN Answare).</Paragraph>
    </Section>
  </Section>
  <Section position="8" start_page="517" end_page="518" type="metho">
    <SectionTitle>
8. Final Remarks
</SectionTitle>
    <Paragraph position="0"> It is beyond the scope of this paper to discuss letter-to-sound procedures in languages other than English and French. However, the disparate nature of different languages argues for a brief mention of our experience in developing letter-to-sound rule sets in other languages. 12</Paragraph>
    <Section position="1" start_page="517" end_page="517" type="sub_section">
      <SectionTitle>
8.1 Simple Systems
</SectionTitle>
      <Paragraph position="0"> In certain languages, as diverse as Spanish and Swahili, letter-to-sound rule sets are extremely easy to produce, owing to the close fit between orthography and its phonemic/phonetic equivalent. First, there are many languages that developed a writing system only recently. Swahili, for example, was written in Arabic script until 1850 when Krapf, a German missionary, introduced the Roman alphabet to the Bantu-speaking peoples of the East African coast. Consequently, in a time span of less than 150 years, the phonological and phonetic systems of the language have not had time to change to any significant extent. Second, many languages have undergone some spelling reform. Czech, for example, underwent spelling reform fairly recently and the orthographic system was brought into line with the phonological and phonetic system. Third, there are some languages in which the orthography has a close fit with the phonemic system. Spanish, for example, is a simple system in that there is an almost iconic relationship between graphemes and their phonemic equivalent. In fact, even lexical stress is marked in many forms and, where it is not, it is almost always predictable.</Paragraph>
    </Section>
    <Section position="2" start_page="517" end_page="518" type="sub_section">
      <SectionTitle>
8.2 Mid-Level Systems
</SectionTitle>
      <Paragraph position="0"> Many languages are somewhat more complex and fit into a second category of languages of mid-level difficulty. German, for example, has a large morphological system yet it is surprisingly simple in terms of letter-to-sound rules. If one lists a large number of common morphemes, it becomes a simple task to state an accurate set of letter-to-sound rules. Many languages with a high synthetic index (Greenberg 1990) fall into this category. 13</Paragraph>
    </Section>
    <Section position="3" start_page="518" end_page="518" type="sub_section">
      <SectionTitle>
</SectionTitle>
      <Paragraph position="0"> 12 All languages of the world are of an equal degree of complexity. Primitive languages are a myth perpetrated by early anthropologists, missionaries, and adventurers. However, when we compare different subsystems of any two languages, it quickly becomes clear that subsystems are vastly different in complexity. This is true of the phonological, phonetic, morphological, syntactic, semantic and letter-to-sound subsystems of two different languages; some are an order of magnitude more complex than others.</Paragraph>
    </Section>
    <Section position="4" start_page="518" end_page="518" type="sub_section">
      <SectionTitle>
8.3 Complex Systems
</SectionTitle>
      <Paragraph position="0"> Certain languages, such as English and French, are among the most complex languages for which to construct letter-to-sound rules. These are not the only languages in this last category. Any language with an old writing system that has not undergone a modicum of spelling reform but has undergone dramatic phonological, morphophonemic, and morphological changes will probably fall into this category.</Paragraph>
    </Section>
  </Section>
</Paper>