<?xml version="1.0" standalone="yes"?> <Paper uid="J79-1057"> <Title>Strong First Syllable Rule - A ;</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> PHONOLOGICAL RULES FOR A TEXT-TO-SPEECH SYSTEM SHARON HUNNICUTT </SectionTitle> <Paragraph position="0"> This work was supported originally by the Joint Services Electronics Program, Contract DAAB07-71-C-0300, and more recently by the National Science Foundation (Grant EPP74-124353). Copyright (C) 1976 Association for Computational Linguistics. Summary: The phonological rules discussed in this paper are part of a system which has been under development at M.I.T. to convert unrestricted text to speech. The system utilizes a morph lexicon and a vocal tract model. Although most of the linguistic analysis is done by decomposing words into their constituent morphemes, such a system is not sufficient for unrestricted text. In order to attain the competence of a comprehensive system, it was necessary to develop a scheme for dealing with unrecognizable words. This is called the &quot;letter-to-sound&quot; system.</Paragraph> <Paragraph position="1"> When a decomposition fails, that is, when a word cannot be decomposed into its constituent morphs or when it is too infrequent in the English language to be included in the morph lexicon, the &quot;letter-to-sound&quot; system is invoked. The letter string which it receives is converted into a stressed phoneme string using two sets of ordered phonological rules. The first set to be applied converts letters to phonemes, first stripping affixes, then converting consonants and finally converting vowels and affixes. The second set is an ordered set of rules which determine the stress contour of the phoneme string.</Paragraph> <Paragraph position="2"> These rules were developed by a process of extensive statistical analysis of English words. 
The form of the rules reflects the fact that pronunciation of vowels and vowel digraphs, consonants and consonant clusters, and prefixes and suffixes is highly dependent upon context. The method of ordering rules allows converted strings which are highly dependable to be used as context for those requiring a more complex framework. Detailed studies of allowable suffix combinations and the effect of suffixation on stress and vowel quality have also provided for more reliable results.</Paragraph> <Paragraph position="3"> 1. Application of Letter-to-Sound Rules 2. Cyclic Rules (First Phase); Domain of Application 3. Notation 4. Stress Rules - Flow Chart 5. Stress Placement Rules The approach has been to model the process employed by a native speaker of English when reading aloud. In order to develop correct computational algorithms for the pronunciation of English words, it has been necessary to reflect the basic nature of linguistic processes. Consequently, considerable emphasis has been placed on the development of morphological and phonological analysis, stress patterns, parsing systems and prosodic correlates.</Paragraph> <Paragraph position="4"> It is possible, using the current system, to convert any English word or string of words in a textual representation to intelligible speech. In order to effect this conversion, a number of subsystems are utilized, including morphological analysis, letter-to-sound rules, stress rules and phonemic speech synthesis. Prosodic studies are now in progress; experimental parameters for f0 contours and timing patterns will soon be included in the system.</Paragraph> <Paragraph position="5"> The letter-string which represents a word is usually converted to a phoneme string in preparation for speech synthesis by a process of morphological analysis. A morph lexicon containing approximately 11,000 entries has been developed and is used in conjunction with a morph decomposition algorithm. 
Included in the lexicon are two major classes of morphs. One</Paragraph> <Paragraph position="7"> class is composed of roots such as &quot;trust&quot; and &quot;snow,&quot; i.e., words which can occur alone, and bound roots such as -ceive (perceive, receive, conceive), rot- (rotary, rotate, rotor) and -miss- (dismiss, missive, permissiveness).</Paragraph> <Paragraph position="8"> Affixes make up the second class and may be attached either to roots or to bound morphemes. Accompanying each lexical entry is its phonemic representation, its morph class and its part(s) of speech.</Paragraph> <Paragraph position="9"> Algorithmic decomposition of letter-strings models the procedure used by a native speaker when confronted by a word which he does not immediately recognize or has previously not encountered. If the word is not immediately</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> 1 I </SectionTitle> <Paragraph position="0"> recognizable, i.e., if it is not in the &quot;lexicon&quot; of the native speaker in its entirety, an attempt will be made to break it apart into its constituent morphs. Such a process is probably used when one reads a word such as &quot;antidisestablishmentarianism,&quot; &quot;earthrise&quot; or &quot;cranapple&quot; for the first time. The algorithm also models the ability to recognize mutations such as the dropping of a final silent [e] (observe - observance), the doubling of a final consonant (red - reddest) and the substitution of [i] for final [y] preceding vocalic suffixes (glory - glorious). 
Morphophonemic rules are also included, modeling the ability to give a correct pronunciation for any plural (horses, cats, dogs) or past tense (quieted, hushed, whispered), and to palatalize in appropriate contexts (sculpt - sculpture, confuse - confusion).</Paragraph> <Paragraph position="1"> Another feature of the algorithm is a set of selectional rules which, although very simple in form, choose the correct morphemic analysis from all possibilities in a large number of cases. A standard form for sequences of morphemes is compared with each possibility, and rules describing preferred composition are used in a pairwise comparison leading to an acceptable result.</Paragraph> <Paragraph position="2"> The word &quot;formally&quot; provides an example in which the only rule needed is one stating that a root followed by a suffix is preferable to two concatenated roots. Possible decompositions of the word &quot;formally&quot; are</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 1 t </SectionTitle> <Paragraph position="0"> detected as follows (R represents &quot;root&quot; and S, &quot;suffix&quot;): form (R) + al (S) + ly (S); form (R) + ally (R); for (R) + mall (R) + y (S); form (R) + all (R) + y (S). It is clear that the correct decomposition is the only candidate having the form of a single root followed by suffixes.</Paragraph> <Paragraph position="1"> When a complete morphemic analysis fails, that is, when a word encountered by a native speaker cannot be separated into its constituent morphemes, or when it is so infrequent in the English language that he has previously not encountered it, a &quot;letter-to-sound&quot; system is invoked, i.e., an attempt is made to sound out the word letter by letter. The competence of a native speaker which allows him to perform this conversion is based on correspondences between letters and sounds in English which have been internalized through experience. 
(A native speaker will also apply the same correspondences to a foreign word from any language of which he has no knowledge.) A scheme to model this process must be made available, sequenced after the decomposition algorithm, in order to be able to convert unrestricted text to speech. The text-to-speech system includes a phonological model having a two-phase structure: the first phase is a set of rules which converts letters to phonemes, and the second is an algorithm for placement of stress on the converted phonemes.</Paragraph> <Paragraph position="2"> This system has been implemented in BCPL on a DEC PDP-9 and on a PDP-10 in MAC LISP.</Paragraph> <Paragraph position="3"> One might ask why, with a set of letter-to-sound rules available, it is necessary to have a lexicon. There are three reasons; all are restrictions which must be imposed on any viable letter-to-sound system. First, it has been observed that high-frequency words, perhaps because of extensive use, do not always follow letter-to-sound rules. For example, the only instance in which a final [f] is pronounced as /v/ is in the word &quot;of.&quot; The letter [w]</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 9 I </SectionTitle> <Paragraph position="0"> following a consonant is generally pronounced like the [w] in &quot;sweet&quot; or &quot;twill,&quot; but in the word &quot;two&quot; it is not pronounced at all. A study of the 200 most frequent words in English according to the Brown Corpus was made to determine their regularity of pronunciation. It was found that although the regular case is that a final [e] preceded by a single consonant (other than [r]) lengthens the preceding vowel, four of the 200 words, i.e., &quot;have&quot; (compare &quot;behave,&quot; &quot;shave&quot;), &quot;one,&quot;</Paragraph> <Paragraph position="2"> &quot;some&quot; and &quot;come&quot; (compare &quot;lone,&quot; &quot;ode&quot;) are exceptions. 
The case of initial [th] is even more irregular among high-frequency words.</Paragraph> <Paragraph position="3"> In most English words, initial [th] is unvoiced, as in &quot;thistle,&quot; &quot;thin&quot; and &quot;thesis.&quot; However, twelve of the 200 words began with voiced [th].</Paragraph> <Paragraph position="4"> Secondly, it must be recognized that the letter-to-sound rules which operate within a morpheme do not necessarily apply across morph boundaries. In particular, the pronunciation of compounds requires a lexicon. Such words as &quot;hothouse&quot; or &quot;potherb&quot; might otherwise appear to contain the consonant cluster [th], and the morph-final silent [e] in &quot;houseboat&quot; would certainly not be silent if the word were not recognized as a compound. The application of letter-to-sound rules must therefore be restricted to words containing no more than a single root.</Paragraph> <Paragraph position="5"> Thirdly, foreign words which retain their original pronunciation must be lexical entries. The entries may be made in the same way a native speaker of English would add a foreign word to his vocabulary, i.e., by pronouncing it as if it were an English word (using English letter-to-sound rules) until informed of its correct pronunciation, and then placing it in his mental lexicon.</Paragraph> <Paragraph position="6"> It is apparent, then, that both morphological and phonological systems are necessary and that together, in sequence, they can provide a phonemic representation for any English word presented for conversion to speech.</Paragraph> <Paragraph position="7"> Although, at present, there is no interaction between the decomposition and letter-to-phoneme algorithms, a more highly efficient system could be developed in the future. 
The size of the lexicon could be reduced, for example, by application of stress placement rules to the output of the decomposition algorithm or by omitting unnecessary phonemic representations. The conversion of a letter string to a phoneme string in the letter-to-sound program proceeds in three stages. In the first stage, prefixes and suffixes are detected (cf. Figure 1). Such affixes appear in the list of phonological rules. Each is classified according to (1) its possible parts of speech, (2) the possible parts of speech of a suffix preceding it, (3) its restriction or lack of restriction to word-final position and (4) its ability to change a preceding [y] to [i] or to cause the omission of a preceding [e]. Prefixes are given no further specification.</Paragraph> <Paragraph position="8"> Detection of suffixes proceeds in a right-to-left, longest-match-first fashion. When no additional suffixes can be detected, or when a possible suffix is judged as syntactically incompatible with its right-adjacent suffix by a parts-of-speech test using classifications (1) and (2) above, the process is terminated. Finally, prefixes are detected left-to-right, also by longest match first. 
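The suffix detection procedure just described can be sketched in Python. This is only a minimal illustration under stated assumptions, not the original implementation: the suffix table is a hypothetical subset holding three of the four classifications (own parts of speech, parts of speech of a permitted preceding suffix, word-final restriction), the entries for &quot;ience&quot; and &quot;ence&quot; are guesses made for the example, and the [y]/[i] and silent-[e] adjustments of classification (4) are not modeled. The sketch also applies the restriction that a stripped root must keep at least one vowel and one consonant.

```python
VOWELS = set("aeiouy")

# Hypothetical miniature suffix table (an assumption for illustration):
# suffix -> (own parts of speech,
#            parts of speech allowed for a suffix preceding it,
#            restricted to word-final position)
SUFFIXES = {
    "ing": ({"noun", "verb"}, {"noun", "verb"}, False),
    "ish": ({"adj"}, {"noun", "adj"}, False),
    "s": ({"noun", "verb"}, {"noun", "verb"}, True),
    "ience": ({"noun"}, {"noun", "verb"}, False),
    "ence": ({"noun"}, {"noun", "verb"}, False),
}


def keeps_root(stem):
    """True if the stem still has at least one vowel and one consonant."""
    has_vowel = any(ch in VOWELS for ch in stem)
    has_consonant = any(ch not in VOWELS for ch in stem)
    return has_vowel and has_consonant


def strip_suffixes(word):
    """Strip suffixes right to left, longest match first.

    Returns (root, suffixes) with the suffixes listed left to right.
    """
    stem, found = word, []
    while True:
        accepted = None
        for suffix in sorted(SUFFIXES, key=len, reverse=True):
            if not stem.endswith(suffix):
                continue
            pos, _follows, final_only = SUFFIXES[suffix]
            rest = stem[: len(stem) - len(suffix)]
            if final_only and found:
                continue  # suffix is restricted to word-final position
            if not keeps_root(rest):
                continue  # removal would leave no vowel or no consonant
            if found and pos.isdisjoint(SUFFIXES[found[0]][1]):
                return stem, found  # parts of speech incompatible: stop
            accepted = suffix
            break
        if accepted is None:
            return stem, found  # no further suffix can be detected
        stem = stem[: len(stem) - len(accepted)]
        found.insert(0, accepted)


print(strip_suffixes("finishing"))
print(strip_suffixes("passing"))
```

With this toy table, &quot;finishing&quot; is analyzed as finish + ing because adjectival [ish] fails the parts-of-speech test against [ing], and &quot;passing&quot; as pass + ing because [s] is accepted only in word-final position.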
If at any time the removal of an affix would leave no consonant or no vowel in the remainder of the word, the affix is not removed.</Paragraph> <Paragraph position="10"> Example: dictatorship. dict + ate + or + ship: possible suffix analysis. ship: (a) nominal suffix (b) follows nominal suffix. or: (a) nominal suffix (b) follows verbal suffix. ate: (a) verbal, nominal and adjectival (b) follows verbal, nominal and adjectival suffixes. Parts of speech are compatible; analysis accepted.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> Example: passing </SectionTitle> <Paragraph position="0"> pa + s + s + ing: possible suffix analysis; unacceptable analysis. ing: (a) nominal and verbal suffix (b) follows nominal or verbal suffix. s: (a) nominal and verbal suffix (b) follows nominal and verbal suffixes (c) appears only in word-final position. pass + ing: correct analysis. Example: finishing. fin + ish + ing: possible suffix analysis; parts of speech not compatible. ing: (a) nominal and verbal suffix (b) follows nominal or verbal suffix. ish: (a) adjectival (b) follows nominal or adjectival suffix. finish + ing: correct analysis; root functions as verb with verbal ending. The domain of application of the second stage rules excludes any previously recognized affixes and is assumed to be a single morpheme. This stage is intended primarily for consonant rules and proceeds from the left of the string to the right.</Paragraph> <Paragraph position="1"> Extending the domain to the whole letter string once again for the third stage, a phonemic representation is given to affixes and to vowels and vowel digraphs (cf. Figure 1). Phonemic representations are produced by a set of ordered rules which convert a letter string to a phoneme string in a given context. 
Both left and right contexts are permitted in the expression of a rule, and may contain variables as well as letters or phonemes. Any one context may be composed of either letters and letter variables or of phonemes and phoneme variables. Combination of these possibilities for both left and right contexts allows for four possible context types. One type of rule, for example, makes it possible to convert a particular letter string to a phoneme string only if the left context is a specified phoneme string and the right context is a specified letter string.</Paragraph> <Paragraph position="2"> The method of ordering rules allows converted strings which are highly dependable to be used as context for those requiring a more complex framework. Because the pronunciation of consonants is least dependent upon context, phonological rules for consonants are applied first, i.e., in the second stage. Rules for vowels and affixes, requiring more specification of environment, are applied in the third and final stage. With the benefit of a previously converted consonant framework and the option of including as context any phoneme to the left of a string under consideration, the task of converting vowels and affixes is simplified.</Paragraph> </Section> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> DENOMINATIONS Input </SectionTitle> <Paragraph position="0"/> <Paragraph position="2"> Stage 1: (a) recognition and isolation of suffixes (b) recognition and isolation of prefixes Stage 2: conversion of consonants in</Paragraph> <Paragraph position="4"> Result of Stress Placement Rules All phonemes are given in IPA symbols. 
A dash (-) serves as a place-holder for a letter which has not yet been converted; an equals sign (=) follows each prefix; a plus (+) precedes each suffix. The result of stress placement rules is also given.</Paragraph> <Paragraph position="5"> Figure 1: Application of Letter-to-Sound Rules. Within the two sets of rules for conversion of consonants and vowels, ordering proceeds from longer strings to shorter strings and, for each string, from specific context to general context.</Paragraph> <Paragraph position="6"> The rule for pronunciation of [cch], then, appears before the rules for [cc] and [ch], each of which is ordered before rules for [c] and [h].</Paragraph> <Paragraph position="7"> Procedures for the recognition of prefixes and suffixes also require an ordering: the prefixes [com] and [con] must be ordered before [co]; any suffix ending with the letter [s] must be recognized before the suffix consisting of that letter only.</Paragraph> <Paragraph position="8"> As an example of ordering rules for a particular string, consider the vowel [a] and assume that it is followed by the letter [r]. This [a]</Paragraph> </Section> <Section position="6" start_page="0" end_page="34" type="metho"> <SectionTitle> 1 I </SectionTitle> <Paragraph position="0"> may be pronounced like the [a] in &quot;warp,&quot; &quot;lariat&quot; or &quot;carp&quot; depending upon specification of further context. It is pronounced like the [a] in &quot;carp&quot; if it is followed by [r] and another consonant (other than [r]) and if it is preceded by any consonant phoneme except /w/ (note &quot;quarter,&quot; &quot;wharf&quot;). Consequently, a rule for [a] in the context of being preceded by the phoneme /w/ and followed by the sequence [rC] is placed earlier in the set of rules. Specification of a left context in the rule for the [a] in &quot;carp&quot; is subsequently unnecessary. 
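The ordering argument for [a] before [r] can be illustrated with a small first-match rule list. The sketch below is hypothetical: the rule representation, the helper names and the ASCII phoneme labels (AO as in &quot;warp,&quot; AA as in &quot;carp,&quot; AE as in &quot;lariat&quot;) are assumptions made for illustration and are not the notation of the original rule set. The left context is a list of already converted phonemes and the right context is the remaining letter string, one of the four context types described earlier.

```python
VOWELS = set("aeiouy")


def r_plus_consonant(right):
    """Right context is 'r' followed by a consonant other than 'r'."""
    return (
        len(right) >= 2
        and right[0] == "r"
        and right[1] not in VOWELS
        and right[1] != "r"
    )


def r_plus_vowel_or_r(right):
    """Right context is 'r' followed by a vowel or another 'r'."""
    return (
        len(right) >= 2
        and right[0] == "r"
        and (right[1] in VOWELS or right[1] == "r")
    )


# Ordered rules for the letter 'a':
# (left-phoneme test, right-letter test, output label).
# The /w/ rule is listed first, so later rules need no left context.
A_RULES = [
    (lambda left: left[-1:] == ["w"], r_plus_consonant, "AO"),  # warp
    (lambda left: True, r_plus_consonant, "AA"),                # carp
    (lambda left: True, r_plus_vowel_or_r, "AE"),               # lariat
]


def convert_a(left_phonemes, right_letters):
    """Return the output of the first rule whose contexts match, else None."""
    for left_ok, right_ok, phoneme in A_RULES:
        if left_ok(left_phonemes) and right_ok(right_letters):
            return phoneme
    return None


print(convert_a(["w"], "rp"))       # the [a] of "warp"
print(convert_a(["k"], "rp"))       # the [a] of "carp"
print(convert_a(["l"], "riat"))     # the [a] of "lariat"
```

Because the list is scanned in order, the second rule never sees an [a] preceded by /w/, which is why it can be written without a left context.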
If the [a] is preceded by a /w/, this rule will never be reached; if preceded by a vowel, a rule for vowel digraphs will already have applied. Using this method, rules may be stated simply and without redundancy.</Paragraph> <Paragraph position="1"> Development of the set of phonological rules was begun by informal inspection and reference to published works, e.g., Venezky. By a process of extensive statistical analysis, other rules were added and ordered appropriately. The principal source of words was the Merriam Webster Pocket Dictionary. A computer print-out was generated in which all words containing each letter and each specified cluster of letters were isolated. Within each category, words were sorted alphabetically according to the right-hand context of the letter(s) under consideration. In addition, Walker's Rhyming Dictionary was used to determine pronunciation of suffixes and the effect of suffixation on preceding phonemes. Words from the Brown Corpus, the Heritage English Dictionary and Stedman's Medical Dictionary have been used in testing procedures.</Paragraph> <Section position="1" start_page="11" end_page="11" type="sub_section"> <SectionTitle> Examples Of Rule Application </SectionTitle> <Paragraph position="0"> In this section, a number of words will be analyzed according to the phonological rule program. Intermediate output, i.e., the results of the first and second stages, will be provided for each word, and the rules which have been applied to produce this output will be discussed. Generalizations of these rules and rules which are believed to be related will be included in the discussion whenever possible. All phonemes are given in IPA symbols; a dash (-) is a place-holder for a letter which is to be converted in a later stage. 
The result of application of stress rules (to be discussed later) is given without comment following each derivation.</Paragraph> <Paragraph position="2"> Final result after stress: primary stress appears over the /e/ and secondary stress over the /a/. In the first stage, [ic] is recognized as a suffix and a plus (+) is inserted to its left. Since no other affixes are recognized, Stage 1 is terminated.</Paragraph> <Paragraph position="3"> Morph-initial [pt] is pronounced /t/, and [l] and [m] are given the pronunciations /l/ and /m/ respectively, according to the most general rule in the rule sequence for each.</Paragraph> <Paragraph position="4"> (The most general rule is the final rule in the rule sequence and contains no specified context.) In Stage 3, the contexts of the vowels [o] and [e] are not among those contexts specified in the sequence of rules, and these vowels are pronounced according to the final, context-independent rule. The vowel [a], on the other hand, precedes another vowel and, for this reason, is lengthened (tensed). The suffix [+ic] is word-final and receives the pronunciation /ɪk/. In the final result, stress rules have been applied and unstressed non-tense vowels have been reduced.</Paragraph> </Section> <Section position="2" start_page="11" end_page="34" type="sub_section"> <SectionTitle> Generalizations and Related Rules </SectionTitle> <Paragraph position="0"> 1.) Morph-initial [pn] and [ps] are given the pronunciations /n/ and /s/ respectively, the [p] remaining silent as in morph-initial [pt]. 2.) Vowels in pre-vocalic position are usually lengthened (tensed). 3.) There is only one context in which [ic] is pronounced /ɪs/ rather than /ɪk/, i.e., preceding the variable representing the vowels [i], [e] and [y].</Paragraph> </Section> </Section> <Section position="7" start_page="34" end_page="34" type="metho"> <SectionTitle> B. 
TABLE Input TABLE Result of Stage 1 </SectionTitle> <Paragraph position="0"/> <Paragraph position="2"> Final result after stress. The result of Stage 1, in this case, is the same as the input since no affixes are detected.</Paragraph> <Paragraph position="3"> The letter [t] is pronounced by the most general rule in its rule sequence, and [b] has only one given pronunciation. However, [l] precedes morph-final [e] and is itself preceded by another consonant, [b]. In this context, [l] is syllabic.</Paragraph> <Paragraph position="4"> The sequence [ble] now forms a very specific context for the third stage. The letter [a] when followed by [Cle] is lengthened if the consonant, C, is neither [r] nor [l]. The vowel [e] is morph-final and therefore silent.</Paragraph> <Paragraph position="6"> The rules for [bt] and [mb] in which the [b] is silent are sequenced preceding the single rule for [b].</Paragraph> <Paragraph position="7"> All vowels except [e], if located in the first syllable of a morph, are long if followed by [Cle#] where C is neither [r] nor [l]. Examples are &quot;maple,&quot; &quot;bible,&quot; &quot;ogle&quot; and &quot;bugle.&quot;</Paragraph> <Paragraph position="9"> An exception is &quot;triple.&quot; The letter [e] appears to be long in this context only if it is part of a vowel digraph, e.g., the vowel in &quot;treble&quot; is short, but the vowel digraphs in &quot;eagle,&quot; &quot;people&quot; and &quot;beetle&quot; are long. Vowels in this context which do not appear in the first syllable must be converted to short pronunciations so that they will not be given primary stress by the stress rules, e.g., &quot;monocle,&quot; &quot;barnacle.&quot; kærɪbu: final result after stress. During Stage 1, no affixes are detected. Converting consonants in Stage 2, we find that [r] is pronounced according to the most general rule in its rule sequence and that [b] has only one given pronunciation. 
The letter LC], because it precedes la], is pronounced /k/.</Paragraph> <Paragraph position="10"> When [a] precedes tr] which, in turn, precedes either a vowel or another Lr] dthin the same morph, it usually has the pronunciation /=/. The letter (11, following its most general pronunciation, is assigned the phoneme /r/. Morph-final Lou) is given the pronunciation /u/.</Paragraph> <Paragraph position="12"> The letter Lr'J is syllabic if preceded by a consonant other than [r] and followed by a morph-fi-1 [el, (e.g?, &quot;acre&quot;), or the inPS ldctional suffixes [+s] or \+ed).</Paragraph> <Paragraph position="13"> The letter [dis palatalized in some cases, e.g., special, I I (context: [v-i~]) &quot;ancient, (context: nV It is assigned the phoneme /d/ latgr in its rule sequence i'f it is followed re], LiI or [y]. It may be noted that this? is the same context which assinns the pronunciation /rs/ to the suffix [+ic] . If [ c] is followed by [a], [o] or [u) , it is usually pronounced /k/, 'as in this example..</Paragraph> <Paragraph position="14"> 3.) -hen [a) precedes [rl and [rl is - not 4ollbred by either a vowel or another [r] within the same morph, [a1 is pronounced /a/, (e .g . , &quot;far, &quot; &quot;cartoon&quot;) unless preceded by the phoneme /w/ , e. g. , '11 11 &quot;warble, I' &quot;warp, war, &quot; &quot;wharf ,&quot; &quot;quarter&quot;) . 4. ) In a word such as macaroon,&quot; the [a] preceding [~vJ is assigned prpnunciation /x/ in the phonological rules and is reduced to schwa in the stress rules because it is unstressed. -we find that the consonant cluster [sc], like the let.ter [c], usually has the sound of /s/ preceding [el, ti] or Cy]. The letter [r] does not occur in any cohtext given in its rule sequence and is therefore given hs mosg general pronunciation. 
There is only one rule for the pronunciation of [n].</Paragraph> <Paragraph position="15"> Moving on to Stage 3, the vowel [e] receives the pronunciation /ɛ/ given by its most general rule. The vowel [a] follows the rule given in the previous example. The vowel [o] is morph-final and has the feature [-constricted pharynx], and is lengthened accordingly. Because the vowel [i] precedes another vowel, it is lengthened also.</Paragraph> <Paragraph position="16"> The consonant cluster [sc] is given the representation of a double phoneme because the information that it is orthographically a double consonant is needed both in the vowel rules and in the rules for stress. It is later reduced to a single phoneme. applied to &quot;scenario.&quot; If [sc] precedes [i] followed by another vowel, and certain letters precede [sc], a palatalization effect is observed. When preceded by a vowel in this context, [sc] becomes /ʃ/, e.g., &quot;prescience&quot;; when preceded by an [n],</Paragraph> <Paragraph position="18"> The pronunciation [sc] receives in &quot;scenario&quot; is also found preceding syllabic [l] in example B.</Paragraph> <Paragraph position="19"> 4.) If none of the contexts mentioned in 2.) or 3.) are found, the phonemic representation of [sc] becomes /sk/.</Paragraph> <Paragraph position="20"> 5.) The reduction of /ɛ/ to /ɪ/ occurs in the stress rules.</Paragraph> <Paragraph position="22"> Final result after stress. In this example, the suffix [+ion] and the prefix [sub=] are recognized in Stage 1.</Paragraph> <Paragraph position="23"> There is only one pronunciation provided for the consonant [v], and [r], because it does not fit a specified context for syllabic [r], is given the standard pronunciation. The letter [s] is followed by the sequence [+iV], making it a candidate for palatalization. The palatalization rule which applies assigns the phoneme /ʒ/. In the final stage of letter-to-phoneme conversion, the affixes and vowels are considered. 
The prefix [sub=] has only one possible pronunciation. The letter [e], because it precedes the sequence [rC] where the consonant, C, is not an [r], is given the pronunciation /A/. The palatal phoneme /ʒ/ now forms a left context for the suffix [+ion], which, being word-final, is pronounced /ən/.</Paragraph> <Paragraph position="25"> Because [+s] is marked as occurring in word-final position only, the [s] preceding [+ion] is not recognized as a suffix. This step also prevents the [er] preceding the [s] from consideration as a possible suffix.</Paragraph> <Paragraph position="26"> When an [s] preceding the sequence [+iV] or [iV] is preceded by either a vowel or an [r], it is usually pronounced /ʒ/. Some examples are &quot;revision,&quot; &quot;artesian,&quot; &quot;Persian&quot; and &quot;dispersion&quot;; two exceptions are &quot;controversial&quot; and &quot;torsion.&quot; When [s] is preceded by [l], and when it occurs as part of the consonant cluster [ss], the phoneme preceding the vowel sequence is /ʃ/, e.g., &quot;emulsion,&quot; &quot;Russian.&quot; A third pronunciation is observed Although [+ience] is a possible suffix, it is not recognized as such in this case because of the requirement that at least one consonant and one vowel remain in the &quot;root.&quot; This stipulation forces the correct suffix, [+ence], to be recognized.</Paragraph> <Paragraph position="27"> Notation: a weak syllable, i.e., a short vowel followed by no more than one consonant (a syllable begins with a vowel and terminates (a) immediately before the next vowel or (b) immediately before a formative boundary if one occurs before the next vowel); a feature, e.g., [-long], [1 stress], or a phoneme with specified feature(s); either A or B or ... 
or P; optional element: material in parentheses is neglected if and only if it does not correspond to context in the word under consideration. Word context is compared with rule context by first comparing it with the maximum string in the rule, i.e., with all parentheses removed, and then by ignoring parenthesized material beginning at the innermost parentheses and proceeding to the outermost parentheses; domain of rule: formative boundaries of string under consideration for cyclic rules, word boundaries for last cycle and for non-cyclic rules; subscripts: making appearance of optional elements conditional (actual condition given below rule).</Paragraph> <Paragraph position="29"> Conditions: (1) no stress placement to the left of a prefix boundary (2) if right-most morph is a suffix, test for special stress placement category; assign [1 stress] or skip cycle according to category.</Paragraph> <Paragraph position="30"> Conditions: (1) if ( ) is not present, ( ) must be present (2) not applied to the first vowel if applied to second vowel. The stress rules which have been implemented are a modification of a set of ordered rules developed by Halle. Modifications fall into three categories: (1) adjustments due to the condition that input is completely phonemic, (2) reduction of the number of stress levels to 1 stress (primary), 2 stress (stress less than primary) and 0 stress, and (3) addition of special suffix-dependent stress categories. Application of the rules proceeds in two phases. The first phase consists of the application of three ordered rules which are applied cyclically, first to the root, then to the root and left-most suffix combined. The process continues with one more suffix adjoined to the string under consideration before each cycle begins until the end of the word is reached. This cyclic phase is devoted solely to the placement of primary stress. 
The second, non-cyclic phase includes the application to the entire word of ordered rules and reduces all but one of the primary stress marks to secondary or zero stress.</Paragraph> <Paragraph position="31"> In the following section, stress placement rules will be given in symbolic form. Each rule which contains more than one case is broken down into cases for which brief descriptions and examples are given. The rules are listed in the order in which they apply and are marked either &quot;cyclic&quot; or &quot;non-cyclic.&quot; Particular modifications to each rule will be given at the end of the discussion about that rule under the subheading Modifications. (See Figure 3 for an explanation of notation, Figure 4 for a flow chart of the stress rules and Figure 5 for the complete set of stress placement rules in linguistic notation.)</Paragraph> <Paragraph position="32"> Examples: parole; hurricane (reduced to 2 stress by a later rule). Conversion of the Main Stress Rule into algorithmic form is facilitated by ordering the above cases in the following manner: Algorithmic Order of Application: (1) If the final syllable is the only syllable, or if it consists of a long vowel followed by at least one consonant, the final vowel receives primary stress. Otherwise, (2) if there are only two syllables, or if the penultimate syllable terminates in more than one consonant or if it consists of a long vowel followed by at least one consonant, the penultimate vowel receives primary stress. Otherwise, (3) the antepenultimate vowel receives primary stress. Modifications: The presence of the optional vowel immediately preceding another vowel and the presence of the morph-final vowel are necessary modifications of the Main Stress Rule due to the difficulty of retrieving the long (tense) pronunciation of a laxed vowel when its orthographic representation is no longer available. The Main Stress Rule, as developed by Halle, applies only to roots which function as nouns and to suffixed forms. 
However, until parsing methods are further developed, it will not be possible to take advantage of known parts of speech.</Paragraph> <Paragraph position="33"> For this reason, the Main Stress Rule is currently applied to all roots.</Paragraph> <Paragraph position="34"> The suffixes referred to in Condition (2) fall into two categories. Some suffixes are marked to force stress to be placed on either the final or the penultimate syllable of the root and suffixes under consideration. This placement of stress replaces the MSR on the cycle in which the special suffix is the right-most morph. These suffixes are listed below with the phonemic representation which actually appears as input.</Paragraph> <Paragraph position="35"> The other category of suffixes referred to in Condition (2) does not affect stress; the cycle in which such a suffix is right-most in the domain is skipped. Later cycles, however, do include the suffix as part of their domain of application. These suffixes are listed below, and are accompanied by examples demonstrating their inclusion in this category. Words such as christendom and martyrdom do not supply evidence; &quot;dom&quot; must be considered a separate syllable, i.e., the syllable preceding &quot;om&quot; is not strong. In the case of &quot;countenancing,&quot; the syllable consisting of &quot;en&quot; is generally so reduced that it is imperceptible as a syllable.</Paragraph> <Paragraph position="36"> Most words of four or more syllables are given alternate pronunciations corresponding to the placement of MENT in either this category or in the category of regular stress placement, e.g., stand st bend (assigning 1 stress to a vowel which already carries 1 stress has no effect unless the rule specifies, as in the Compound Stress Rule, that the vowel must previously be 1-stressed.) Algorithmic Order of Application: (1) If the right-most syllable containing primary stress is the left-most syllable in the word, no stress is assigned. 
Otherwise, (2) if the syllable preceding the right-most stressed syllable is the only syllable preceding it, assign primary stress to the vowel in that syllable. Otherwise, (3) if the second syllable to the left of the right-most stressed syllable is the left-most syllable, or if it terminates in more than one consonant or consists of a long vowel followed by at least one consonant, assign primary stress to the vowel in that syllable. Otherwise, (4) the vowel in the third syllable to the left of the right-most stressed syllable receives stress.</Paragraph> <Paragraph position="37"> Modifications: The optional vowel in pre-vocalic position appears in the Stressed Syllable Rule as well as in the Main Stress Rule. Its presence prevents words such as &quot;stereobate,&quot; &quot;alveolate,&quot; and &quot;heliotrope&quot; from being stressed incorrectly.</Paragraph> <Paragraph position="38"> The Stressed Syllable Rule, as developed by Halle, places stress on the final syllable of the non-nouns which have been excluded from the domain of application of the Main Stress Rule. Words for which the categorization of noun/non-noun appears to be most useful are those in which a one-syllable prefix precedes a one-syllable root or bound morpheme, e.g., [permit]N vs.</Paragraph> <Paragraph position="40"> [permit]V, [insult]N vs. [insult]V. Because there are many more verbs of this sort than nouns, the Stressed Syllable Rule has been modified to prevent the retraction of stress into a prefix. The effect of this modification is to produce only the verbal pronunciation of two-syllable noun/verb pairs. Another, more positive, effect is the correct placement of stress in verbs such as edit, inhibit and pummel. However, two-syllable nouns of the form &quot;prefix-root&quot; which have no verbal counterpart are stressed incorrectly, e.g., empire, inverse. 
(This modification will be removed or changed after a parsing algorithm is incorporated in the system.)</Paragraph> <Section position="1" start_page="34" end_page="34" type="sub_section"> <SectionTitle> Alternating Stress Rule (cyclic) </SectionTitle> <Paragraph position="0"> Case 1. (Maximum string) V → [1 stress] / [X ___ C0 (V) V C0 [V, 1 stress] C0 ] (a) Assign 1 stress to the vowel three syllables to the left of a primary-stressed vowel occurring in the last syllable if the following syllable contains only a vowel: heliotrope (stress in last syllable later reduced) Case 2. (Parenthesized string excluded) (a) Assign 1 stress to the vowel two syllables to the left of a primary-stressed vowel occurring in the last syllable: 11 1 gelatinate Te1~ t'lnet (stress in first syllable later deleted; stress in last syllable later reduced) Algorithmic Order of Application: (1) If there are at least two syllables preceding a primary-stressed vowel in the last syllable of the phoneme string, and if the first of these two syllables is composed of more than a single vowel, place primary stress on the vowel two syllables to the left of the vowel with primary stress.</Paragraph> <Paragraph position="1"> Otherwise, (2) if there are at least three syllables to the left, the second of which is composed of a single vowel, place primary stress on the vowel three syllables to the left of the vowel with primary stress. Otherwise, (3) no stress assignment is made.</Paragraph> <Paragraph position="2"> metropolitanate which are correctly stressed by the Stressed Syllable Rule are stressed incorrectly, thereafter, by the Alternating Stress Rule. Modification: The optional vowel in pre-vocalic position appears in the Alternating Stress Rule as well as 
in the Main Stress and Stressed Syllable rules.</Paragraph> <Paragraph position="3"> Proposed Modification: The restriction of the Alternating Stress Rule to words in which a prefix boundary does not precede the final primary-stressed syllable could be constrained to verbs. Such a constraint would provide the correct stress placement in nouns and adjectives such as multiform, contraband, intercept and miniskirt (each stressed 1 2) while retaining correct stress placement in the verbs intercept, contradict and comprehend (each stressed 2 1). Such a modification would require moving the Strong First Syllable Rule in Halle's scheme to follow the Compound Stress Rule, assigning [2 stress] in the same context in which [1 stress] was previously assigned. This modification has already been implemented in this program for independent reasons, and is discussed under the heading Modifications in the Strong First Syllable Rule.</Paragraph> <Paragraph position="4"> Destressing Rule (non-cyclic, applicable to all vowels having required context) Conditions: (1) if ( )a is not present, ( )b must be present (2) not applied to first vowel if applied to second vowel. [...] is immediately followed by one consonant and a vowel which has previously been assigned primary stress, shorten (lax) it if it is long, and remove any stress it has been assigned. (2) If a short vowel is in the first syllable, and is immediately followed by one consonant and a vowel which has previously been assigned primary stress, and if (1) does not apply to the vowel in the second syllable, remove any stress that has been assigned to the vowel in the first syllable.</Paragraph> <Paragraph position="5"> Modification: The single required consonant preceding the primary-stressed vowel has been changed from C_0^1 (zero or one consonant) to C (one consonant) so that pre-vocalic vowels are not shortened. Compound Stress Rule (non-cyclic) This rule, as developed by Halle, applies to both compounds and non-compounds. 
As it applies to words converted by letter-to-phoneme rules in the program, and therefore to non-compounds only, its effect is to locate the primary stress which is to be retained. All other primary stress is reduced to secondary. Halle has used the Nuclear Stress Rule for both phrase-level stress and the reduction of secondary to tertiary stress in lexical items. Neither is necessary in this algorithm; the Nuclear Stress Rule has therefore been omitted.</Paragraph> <Paragraph position="6"> Conditions: (1) Y contains no 1 stress (2) if right-most morph is a suffix, check for special stress retention or stress exclusion category and reassign [1 stress] according to category. [...] once, no changes are made. Otherwise, (2) if the right-most vowel with primary stress is followed by at least one more syllable, the right-most of which is not composed of an unstressed /i/, it retains primary stress and all other primary stress is reduced to secondary. (3) If the right-most vowel with primary stress is (a) the right-most vowel in the word, or (b) the right-most vowel with the exception of a final syllable composed of an unstressed /i/, the first primary-stressed vowel to its left retains primary stress and all other primary stress is reduced to secondary. Modifications: As mentioned previously, input to the stress rules from the letter-to-phoneme program does not include compounds. The part of the rule designed for compounds is, therefore, omitted.</Paragraph> <Paragraph position="7"> This rule formerly contained the letter [y] instead of /i/, which has been substituted due to unavailability of the original orthography. The suffixes referred to in Condition (2) fall into two categories. 
Those suffixes discussed under Condition (2) of the Main Stress Rule which do not affect stress placement are excepted from the domain of the Compound Stress Rule if they are either word-final or precede another word-final suffix in the same category.</Paragraph> <Paragraph position="8"> The other category of suffixes is marked for special stress retention. The following suffixes retain primary stress in word-final position under Condition (2) of the Compound Stress Rule: Note: This categorization is equivalent to the statement that syllabic /m/ does not function as a syllable in morph-final position. The same stress pattern appears in words ending in [ithm], although it is not included here as a suffix, e.g., logarithm, algorithm (each stressed 1 2).</Paragraph> <Paragraph position="9"> The same categorization should be extended to morph-final syllabic /l/. However, it does not function as a suffix, e.g., corpuscle. The original set of stress rules included the Trisyllabic Shortening Rule at this point in the ordering. The rule was stated as follows: Condition: (1) does not apply to /u/ Test results indicated mispronunciations arising from its application. A study was undertaken to determine the usefulness of this rule and to uncover problem areas which might lead to a more proper resolution of observed effects for which the Trisyllabic Shortening Rule was formulated. It was found that a restatement of phonological rules, including the requirement of a short vowel in a one-syllable root preceding a single consonant and certain suffixes, obviated the need for the Trisyllabic Shortening Rule in the set of stress rules.</Paragraph> <Paragraph position="10"> it contains either a long vowel or two or more consonants, assign the vowel primary stress.</Paragraph> <Paragraph position="11"> Modifications: This rule has been extended to include both the first syllable of the root and the first syllable of the left-most prefix. 
This rule has been moved to follow the Compound Stress Rule to prevent the retention of primary stress in prefixes by the Compound Stress Rule in words such as recruit and intend.</Paragraph> <Paragraph position="12"> Cursory Rule (non-cyclic) Condition: (1) if right-most morph is a suffix, check for stress exclusion category. Algorithmic Application: (only one case of the Cursory Rule) The vowel following the primary-stressed vowel, if it is not the last vowel in the word, is shortened and its stress removed.</Paragraph> <Paragraph position="13"> 2 1 X'O 1 AH0 lPO Examples: infirmary, cursory, curative (/e/-+ 1 a~ 1 ,- later reduced to /a/.) Modifications: Pre-vocalic vowels are not shortened.</Paragraph> <Paragraph position="14"> The suffixes discussed under the Main Stress Rule which do not affect stress placement are excepted from the domain of the Cursory Rule if they are either word-final or precede another word-final suffix in the same category. [...] reduced, /E/ and /I/ to /5/, i.e., reduced I, and all others to /a/. Modification: The phonemes /EU/ and /T/ are reduced to /I/ rather than to /a/.</Paragraph> <Paragraph position="15"> A Stress-Dependent Letter-to-Phoneme Rule The rule which follows appears to be stress-dependent and was placed in the stress placement section rather than with other letter-to-phoneme rules: Rule: The phoneme /t/ is changed to /$/ and the phoneme /d/ to /J/ if it is not in the initial consonant cluster and precedes unstressed /u/ or /U/, or if it precedes unstressed /a/ which was /u/ or /U/ before application of stress placement rules.</Paragraph> <Paragraph position="16"> (c) institution, centurion, Hindu, constitute. In the above cases, /t/ or /d/ is not in the initial consonant cluster and precedes stressed /u/ or /U/.</Paragraph> <Paragraph position="17"> The stress program has been modified to effect this change. 
The phonemes /t/ and /d/ preceding unstressed /u/ or /U/ not in the first syllable are changed following the cyclic rules which place all stress. After the Destressing Rule and the Cursory Rule, a change is also made if the destressed (and possibly shortened) vowel was previously a /u/ or /U/ and not in the first syllable.</Paragraph> </Section> </Section> class="xml-element"></Paper>
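The stress-dependent rule for /t/ and /d/ described above can be illustrated with a minimal Python sketch. It is not the original program. The `Phoneme` record, the ASCII labels, the output labels `CH` and `JH`, the `was_u` flag (marking a reduced vowel that was /u/ or /U/ before the stress rules applied) and the sample transcription are all assumptions made for illustration. The sketch uses the "not in the initial consonant cluster" form of the condition.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Phoneme:
    symbol: str                   # illustrative ASCII label, not the paper's symbol set
    stress: Optional[int] = None  # None for consonants; 0, 1 or 2 for vowels
    was_u: bool = False           # assumed flag: vowel was /u/ or /U/ before reduction


# Assumed output labels standing in for the two changed phonemes.
PALATALIZED = {"t": "CH", "d": "JH"}
U_VOWELS = {"u", "U"}


def palatalize(phonemes: List[Phoneme]) -> List[Phoneme]:
    """Change t/d before an unstressed u-type vowel, outside the initial cluster."""
    out = list(phonemes)
    seen_vowel = False
    for i, p in enumerate(phonemes):
        if p.stress is not None:      # a vowel ends the initial consonant cluster
            seen_vowel = True
            continue
        if not seen_vowel or p.symbol not in PALATALIZED:
            continue
        if i + 1 == len(phonemes):    # word-final consonant: nothing follows it
            continue
        nxt = phonemes[i + 1]
        # The following vowel must be unstressed and be (or have been) /u/ or /U/.
        if nxt.stress == 0 and (nxt.symbol in U_VOWELS or nxt.was_u):
            out[i] = replace(p, symbol=PALATALIZED[p.symbol])
    return out


# Hypothetical transcription of a word like "century" after stress placement.
century = [Phoneme("s"), Phoneme("E", 1), Phoneme("n"), Phoneme("t"),
           Phoneme("U", 0), Phoneme("r"), Phoneme("i", 0)]
print([p.symbol for p in palatalize(century)])
```

In this sketch a /t/ or /d/ before a stressed /u/, as in the Hindu and constitute examples, is left unchanged, because the check requires the following vowel to carry 0 stress.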