<?xml version="1.0" standalone="yes"?> <Paper uid="W98-0901"> <Title>Generating Interlanguage Syllabification in Optimality Theory</Title> <Section position="3" start_page="1" end_page="4" type="metho"> <SectionTitle> 2. IL Phonology of Korean Speakers of English \[IL-K-E\] </SectionTitle> <Paragraph position="0"> Korean has three distinctive types of voiceless stops as phonemes: aspirated /pʰ, tʰ, kʰ/, fortis /p*, t*, k*/, and lenis /p, t, k/. Voiced stops do not exist as phonemes but only as allophones, because lenis stops become voiced between two (voiced) sonorant sounds. Thus, unlike in English, aspiration is a phonemic feature in Korean and voicing in stops is an allophonic one. Many Koreans tend to perceive English voiceless stops as Korean aspirated stops, so they pronounce school as \[sɯkʰul\]. In this project, an English voiceless stop in the input is regarded as an aspirated stop.</Paragraph> <Paragraph position="1"> There are three salient features of Korean syllable structure. First, consonant clusters are not allowed. When speaking an English word with consonant clusters, many Koreans tend to insert a vowel as shown below: (1) a. school \[sɯ.kʰul\] b. mint \[min.tʰɯ\] In English, *COM should be lower-ranked than MAX or DEP.</Paragraph> <Paragraph position="2"> In IL-K-E, vowel epenthesis occurs not only in consonant clusters but also after syllable-final fricatives or affricates, and even after stops preceded by a diphthong (Ahn, 1991; H-B Park, 1992; Broselow &amp; H-B Park, 1995; Tak, 1996), as shown below: (6) a. kiss \[kʰisɯ\] b. push \[pʰuʃi\] c. tight \[tʰaitʰɯ\].</Paragraph> <Paragraph position="3"> This is related to the second feature of Korean syllable structure: only seven consonants, \[p, t, k, m, n, ŋ, l\], can occur in the coda position in Korean. Labial and velar stops are neutralized to a homorganic lenis stop (/pʰ, p*, p/ → \[p\]; /kʰ, k*, k/ → \[k\]), and all coronal obstruents to \[t\] (/tʰ, t*, t, tʃʰ, tʃ*, tʃ, s, s*/ → \[t\]). 
To deal with this coda neutralization phenomenon, I propose the following Feature Alignment Constraints, revising Hong (1997): 2 Let us see how such IL pronunciation is obtained in terms of OT. First of all, the following OT constraints are to be considered: (2) *COMPLEX \[*COM\] (P&amp;S) No more than one C or V may associate to any syllable position node.</Paragraph> <Paragraph position="4"> (3) MAX (McCarthy, 1995a) Every element of S1 has a correspondent in S2. (no deletion of a segment) (7) a. Align-Left (\[stiff vocal folds\], σ) \[A-L(svf, σ)\] b. Align-Left (\[+continuant\], σ) \[A-L(cont, σ)\] These constraints force a segment with the corresponding feature into the syllable-initial position. Like Hong, I also adopt IDENT-IO \[F\] constraints, which also belong to the Correspondence Theory family.</Paragraph> <Paragraph position="5"> (4) DEP (McCarthy, 1995a) Every element of S2 has a correspondent in S1. (no insertion of a segment) The constraint ranking seems to be {*COM, This is based on Lombardi's (1995a,b) proposal of laryngeal neutralization. That is, laryngeal features such as aspiration or voicing appear only in the syllable-initial position. However, the scope of Korean coda neutralization is not limited to laryngeal neutralization, since alveolar fricatives and alveopalatal affricates are also neutralized to a plain lenis alveolar stop /t/ as described above. Furthermore, a voiced stop can occur in an ambisyllabic coda position by means of Lenis Stop Voicing as shown in (19) later. (9) as depicted below: (10) os 'clothes' in Korean</Paragraph> <Paragraph position="7"> Since \[os\] and \[o.sɯ\] violate the higher-ranked A-L(cont, σ) and DEP, respectively, \[ot\] is judged optimal even though it violates ID\[cont\]. However, as shown in (6), English kiss is pronounced not as \[kit\] but as \[kisɯ\], which means ID\[cont\] should outrank DEP in IL-K-E. 
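The ranking argument above can be sketched in code. The following is a minimal illustration, not part of the original system: the function name evaluate, the candidate labels (with V standing for the epenthetic vowel) and the dictionary layout are assumptions, and only the violations stated in the text are encoded. Strict domination is modeled by comparing violation tuples ordered by the ranking.

```python
# Hypothetical sketch: the violation marks restate the prose around (10);
# candidate labels and function names are illustrative assumptions.
def evaluate(candidates, ranking):
    """Return the candidate whose violation profile is best under strict domination."""
    def profile(name):
        marks = candidates[name]
        # Tuples compare lexicographically, so a higher-ranked violation always decides.
        return tuple(marks.get(constraint, 0) for constraint in ranking)
    return min(candidates, key=profile)


# One violation each, as described in the text ("V" marks an epenthetic vowel).
coda_s = {
    "os":   {"A-L(cont)": 1},   # fricative stays in the coda
    "o.sV": {"DEP": 1},         # vowel epenthesis
    "ot":   {"ID[cont]": 1},    # /s/ neutralized to a stop
}

korean = ["A-L(cont)", "DEP", "ID[cont]"]    # DEP outranks ID[cont]
il_k_e = ["A-L(cont)", "ID[cont]", "DEP"]    # ID[cont] outranks DEP

winner_korean = evaluate(coda_s, korean)
winner_il_k_e = evaluate(coda_s, il_k_e)
```

Swapping DEP and ID\[cont\] alone is enough to flip the winner from the neutralized form to the epenthesized one, which mirrors the contrast between Korean os and IL-K-E kiss.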
On the other hand, top is usually pronounced as \[tʰap\], not as \[tʰapʰɯ\], which means DEP outranks ID\[svf\] as shown in (11). That is, modifying the \[continuant\] feature is more severe than modifying the \[svf\] feature.</Paragraph> <Paragraph position="8"> (11) kiss and top in IL-K-E</Paragraph> <Paragraph position="10"> The third feature of Korean syllable structure is that increasing sonority across syllable boundaries is disfavored. An obstruent before a nasal, for example, cannot be preserved because of the increasing sonority, but changes into a homorganic nasal with the same sonority as shown below: Avoid rising sonority across the syllable boundaries.</Paragraph> <Paragraph position="11"> The selection of the output form in (12-a) can be depicted as follows: (14) os.man 'clothes only' in Korean</Paragraph> <Paragraph position="13"> are eliminated due to the violation of higher-ranked SCC. Candidates (c, d, e) satisfy both SCC and A-L(cont, σ), and violate ID\[cont\], ID\[son\] and ID\[vd\]. However, candidates (d, e) violate ID\[lat\] and ID\[place\] respectively, which candidate (c) does not violate. So candidate (c) is selected.</Paragraph> <Paragraph position="14"> When Koreans transfer this obstruent nasalization phenomenon to English, pick me up is pronounced as \[pʰiŋ.mi.ʌp\] and big mouse as \[piŋ.ma.u.sɯ\]. Resyllabification is also related to 3 This corresponds to Murray &amp; Vennemann (1983) and Vennemann's (1988) Syllable Contact Law, which was based on Hooper (1976). Davis &amp; Shin (1997) propose such a constraint and Hong (1997) adopts it.</Paragraph> <Paragraph position="15"> SCC. 4 If a nominal particle, '-i', is attached to os (10), the neutralization does not occur, since /s/ is resyllabified as the onset of the following syllable as in os-i \[o.si\]. 5</Paragraph> <Paragraph position="17"> To satisfy the higher-ranked constraint SCC, the C in a VCV sequence within a prosodic word must be syllabified as the onset of the second vowel as in (c) and (d). 
Between the two candidates, (c) is judged optimal since it also satisfies the ID\[cont\] constraint, while (d) violates it. However, the compound word wus.os /us.os/ 'upper garment' is pronounced not as \[u.sot\] but as \[udot\]. This indicates that the coda /s/ belonging to the first word wus is neutralized to /t/ and that this lenis stop becomes voiced between two vowels. To avoid misjudging \[u.sot\] as optimal, Align-Left (7) is clarified as (16) and another Align constraint, (17), is proposed, following Hong (1997).</Paragraph> <Paragraph position="18"> (16) a. Crisp-Align-Left (\[stiff vocal folds\], σ) \[CA-L(svf, σ)\] b. Crisp-Align-Left (\[+continuant\], σ) \[CA-L(cont, σ)\] (17) Non-Crisp-Align-Right (Root, PrWd) \[NCA-R(Rt, PW)\] Crisp Alignment does not allow ambisyllabicity, while Non-Crisp Alignment allows it (Itô &amp; 4 There are other phonological phenomena related to the SCC such as lateralization, delateralization, /n/-insertion, /l/-insertion, etc. However, I will not deal with them in this paper.</Paragraph> <Paragraph position="19"> 5 In fact, /s/ becomes palatalized before a high front vocoid, but I set this phenomenon aside in this paper. The other issue is that ONSET can also play a role in triggering resyllabification. However, SCC covers the role of ONSET, i.e., ONSET can be regarded as a subset of SCC.</Paragraph> <Paragraph position="20"> Mester, 1994). Accordingly, (16) does not allow aspirated stops, fortis stops, fricatives, or affricates to occur in the coda position whether they are ambisyllabic or not. 
On the other hand, (17) allows the last element of the root word to become ambisyllabic, but does not allow it to be disconnected from the original word.</Paragraph> <Paragraph position="21"> Proposing a Voice constraint (18) to deal with the Korean Lenis Stop Voicing phenomenon, let us consider how to syllabify wus.os.</Paragraph> <Paragraph position="22"> (18) VOICE \[VCE\] Stops with a \[-stiff vocal folds\] (i.e., non-aspirated or non-fortis) feature are realized as voiced between two sonorant sounds within an accentual phrase, and as voiceless elsewhere.</Paragraph> <Paragraph position="23"> only the lower-ranked ID\[cont\] and ID\[vd\], which are compelled to satisfy the higher-ranked CA-L(cont, σ) constraint. Candidate (b) is eliminated due to the violation of the higher-ranked VCE. Candidate (c) violates CA-L(cont, σ), since /s/ should not be a coda in any case, and is eliminated. Candidates (d, e) are eliminated due to the violation of the higher-ranked NCA-R(Rt, PW). The word wus is a root and also a prosodic word by itself, so its final element /s/ should not belong to another word. Candidates (f, g) are eliminated, since they violate another highly ranked constraint, SCC. The transfer of the coda neutralization and lenis stop voicing phenomena to English may result in the pronunciation of stop it as \[sɯ.tʰa.bit\]. The following tableau shows how it works (Note: ambisyllabic C is represented as &quot;'C'&quot;): (20) stop it in IL-K-E</Paragraph> <Paragraph position="25"> b. sɯ.tʰapʰ.it c. sɯ.tʰap.it d. sɯ.tʰab.it e. sɯ.tʰa.pʰit f. sɯ.tʰa.pit g. sɯ.tʰa.bit</Paragraph> <Paragraph position="27"> Candidate (a) is eliminated due to the violation of the higher-ranked *COM. All other candidates violate the lower-ranked DEP to satisfy *COM.</Paragraph> <Paragraph position="28"> Candidates (b, c, d) and (e, f, g) are ruled out due to the violation of the higher-ranked SCC and NCA-R(Rt, PW), respectively. 
Candidates (h, i) are eliminated due to the violation of CA-L(svf, σ) and VCE, respectively. Candidate (j) is selected even though it violates ID\[vd\], DEP and ID\[svf\], which are lower-ranked. ID\[vd\] is considered to outrank DEP because Koreans tend to insert a vowel after a voiced stop even when it is preceded by a lax vowel.</Paragraph> <Paragraph position="29"> That is, sad may be pronounced as \[sedɯ\] rather than as \[set\], where the former violates DEP but the latter violates ID\[vd\]. 6 6 However, some words like good usually do not To sum up, the constraint ranking in IL-K-E considered up to now is as follows: (21) Constraint Ranking in IL-K-E</Paragraph> <Section position="1" start_page="4" end_page="4" type="sub_section"> <SectionTitle> 3.1 Problems </SectionTitle> <Paragraph position="0"> According to Hammond (1997b), the greatest problem in OT-based implementation is the possibility of an infinite candidate set when epenthesis (violation of DEP) or deletion (violation of MAX) is allowed, since Gen can produce infinitely many candidates. Even if there is no epenthesis or deletion, assuming that any segment can be syllabified as an onset, peak, coda or unparsed element, a word with n elements may have 4^n possible syllabifications, which is an exponential problem. In addition, each candidate has to be tested by each constraint. 
That is, the number of candidates times the number of constraints must be considered, which is an arithmetic but still nontrivial problem.</Paragraph> <Paragraph position="1"> To solve these problems, he proposes that: 1) syllabification is implemented as a parser, which does not need to consider epenthesis or deletion; 2) syllabification is encoded locally; and 3) a cyclic CON-EVAL loop is applied constraint by constraint.</Paragraph> <Paragraph position="2"> The problems of implementing IL-K-E in OT are more complicated than those raised by Hammond, since ambisyllabicity, epenthesis, and segment modification should be considered. The system dealing with such syllabification cannot be a parser; it must be a generator. 7</Paragraph> <Paragraph position="3"> (footnote 6, continued) allow epenthesis, even if the final segment is a voiced stop. More experimental research is required on this issue, and I will not pursue it in this paper.</Paragraph> </Section> <Section position="2" start_page="4" end_page="4" type="sub_section"> <SectionTitle> 3.2 Korean accented English Generator </SectionTitle> <Paragraph position="0"> I assume that the initial candidate set produced by Gen is predictable and finite, following previous research (Ellison, 1994; Tesar, 1995; Eisner, 1997; Hammond, 1995, 1997b). I adopt the concept of local encoding (Hammond, 1995, 1997b) developed from the concept of finite state automata (Ellison, 1994) and that of dynamic programming (Tesar, 1995). 
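To make the size of the problem concrete, the short sketch below contrasts enumerating every full syllabification with keeping one small candidate set per segment. It is only an illustration under the assumption that local encoding stores a set of position labels for each segment; the function names and the choice n = 6 are hypothetical.

```python
from math import prod


def count_full_candidates(cell_sizes):
    """Number of complete syllabifications if every combination were enumerated."""
    return prod(cell_sizes)


def count_local_cells(cell_sizes):
    """Number of entries stored when each segment keeps its own candidate set."""
    return sum(cell_sizes)


n = 6                    # hypothetical word length
four_way = [4] * n       # onset, peak, coda or unparsed for each segment
full = count_full_candidates(four_way)    # grows as 4 ** n
local = count_local_cells(four_way)       # grows as 4 * n
```

Under this assumption, pruning one label from one segment's set discards every full candidate that uses it, without those candidates ever being listed.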
Unlike Hammond, however, since the role of this generator is not only to syllabify the input segments but also to modify them into suitable output segments, I suppose there are two templates of candidate grids: one representing syllable positions, and the other representing potential segment output forms.</Paragraph> <Paragraph position="1"> Supposing the input is a phrase like stop it, whose string of phonemes is /s tʰ a pʰ # i tʰ/, the grids look like below:</Paragraph> </Section> </Section> <Section position="4" start_page="4" end_page="4" type="metho"> <SectionTitle> 7 Hammond differentiates, &quot;The generator would </SectionTitle> <Paragraph position="0"> start with an input form, generate candidate syllabifications, and apply constraints to produce a syllabified output. A parser would start with an unsyllabified output, generate candidates, and produce a syllabified output&quot; (p.6).</Paragraph> <Paragraph position="1"> 8 In producing candidates for segments, only those which may become optimal forms in some situations are considered. The number of candidates for</Paragraph> <Paragraph position="3"> sɯ tʰɯ pʰɯ tʰɯ As shown in (22), each segment can be an onset, a nucleus, two nuclei, 9 a coda, an ambisyllabic coda-onset, an onset + a nucleus (due to epenthesis), or an unparsed segment. Deletion is not considered in the current project. Constraints (cyclic CON-EVAL) prune away disfavored candidates cyclically. If there is only one member left in the candidate set, it should not be pruned away by any constraint. Grid A (22) is treated first, and then Grid B (23) is treated.</Paragraph> <Paragraph position="4"> There are constraints such as NUC, requiring a nucleus in a syllable; *MARGIN/V, saying a vowel cannot be an onset or a coda; and *PEAK/C, saying a consonant cannot be a nucleus. 
So, if a segment is a vowel, all candidates but 'nn' (for a diphthong) or 'n' (for any other vowel) are removed, and if it is a consonant, 'n' and 'nn' are removed, as shown in (24): (24) NUC, *MARGIN/V, *PEAK/C. Candidates remaining for /s tʰ a pʰ # i tʰ/: s {o, c, co, on, u}; tʰ {o, c, co, on, u}; a {n}; pʰ {o, c, co, on, u}; # (not yet pruned); i {n}; tʰ {o, c, co, on, u}. (footnote 8, continued) segments depends on the types of segments. For example, a voiceless stop has five candidates: an aspirated one, an unaspirated one, a voiced one, a nasal one, and an epenthesized one, while a nasal has only one candidate, itself.</Paragraph> <Paragraph position="5"> 9 Many Koreans tend to regard an English diphthong like /ai/ as two distinct vowels like /a i/. I adopt Hammond's idea of housekeeping, too, and propose the following cases: (25) Housekeeping a. word-initial coda and coda-onset b. word-final onset c. phrase-final coda-onset d. word-final coda before another word starting with a vowel e. no parsing of word boundary A word cannot start with 'c' or 'co' (a), and cannot end with 'o' (b). 'co' can occur in the word-final position, but not in the phrase-final position (c). A 'c' in a word-final position is deleted if it is followed by '#' and 'n' (due to the SCC (13)) (d). A word boundary '#' has only 'u' (e).</Paragraph> <Paragraph position="6"> (26) Housekeeping. Candidates remaining for /s tʰ a pʰ # i tʰ/, with the case that removes each deleted candidate in parentheses: s {o, on, u}, removing c and co (a); tʰ {o, c, co, on, u}; a {n}; pʰ {co, on, u}, removing o (b) and c (d); # {u}, removing all others (e); i {n}; tʰ {c, on, u}, removing o (b) and co (c). Every segment is considered to be parsed, i.e., no deletion is considered in the current system, so 'u' is deleted in each set. It is better to delete the column of the word boundary symbol before *COM starts to work. *COM does not allow a sequence of 'o + o' or 'c + c'. So delete 'o' or 'co' preceded by 'o', and 'c' or 'co' followed by 'c'. 
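The pruning steps up to this point can be sketched as follows. This is a hedged illustration rather than the actual implementation: the names POSITIONS, prune_by_type, housekeeping and drop_unparsed are hypothetical, 'th' and 'ph' are ASCII stand-ins for the aspirated stops, and *COM is left out because its exact application is not spelled out here.

```python
# Hypothetical sketch of the Grid A pruning steps; all names are illustrative.
POSITIONS = ["o", "n", "nn", "c", "co", "on", "u"]
VOWELS = {"a", "i"}   # vowels of the running example; no diphthongs here
BOUNDARY = "#"


def prune_by_type(segment):
    """NUC, *MARGIN/V, *PEAK/C: vowels keep only a nucleus, consonants lose it."""
    if segment == BOUNDARY:
        return list(POSITIONS)      # left for housekeeping case (e)
    if segment in VOWELS:
        return ["n"]                # a diphthong would keep 'nn' instead
    return [p for p in POSITIONS if p not in ("n", "nn")]


def housekeeping(segments, grid):
    """Apply cases (a)-(e) of (25) to the per-segment candidate lists."""
    last = len(segments) - 1
    pruned = []
    for i, seg in enumerate(segments):
        drop = set()
        if seg == BOUNDARY:
            drop = set(POSITIONS) - {"u"}                     # (e)
        else:
            word_initial = i == 0 or segments[i - 1] == BOUNDARY
            word_final = i == last or segments[i + 1] == BOUNDARY
            if word_initial:
                drop |= {"c", "co"}                           # (a)
            if word_final:
                drop.add("o")                                 # (b)
            if i == last:
                drop.add("co")                                # (c)
            next_word_starts_with_vowel = (
                word_final and len(segments) > i + 2 and segments[i + 2] in VOWELS
            )
            if next_word_starts_with_vowel:
                drop.add("c")                                 # (d)
        pruned.append([p for p in grid[i] if p not in drop])
    return pruned


def drop_unparsed(segments, grid):
    """Remove 'u' everywhere and discard the word-boundary column."""
    return [
        (seg, [p for p in cands if p != "u"])
        for seg, cands in zip(segments, grid)
        if seg != BOUNDARY
    ]


segments = ["s", "th", "a", "ph", "#", "i", "th"]
grid = housekeeping(segments, [prune_by_type(s) for s in segments])
result = drop_unparsed(segments, grid)
```

For stop it this leaves s with 'o' or 'on', the medial stop of stop with 'o', 'c', 'co' or 'on', its final stop with 'co' or 'on', and the final stop of it with 'c' or 'on', before *COM and the later constraints apply.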
Since a vowel is usually epenthesized after a fricative or an affricate in the coda position, 'c' or 'co' under such a segment should be deleted by means of CA-L(cont, σ). For the current example, however, this application is vacuous. the same as (29) Now, DEP plays the role of pruning the 'on' candidate in a set containing more than one element.</Paragraph> <Paragraph position="7"> 10 Not only is an English diphthong regarded as two vowels (cf. footnote 9), but an obstruent following it also tends to be epenthesized (cf. 6-c), so that 'c' and 'co' in this position may be deleted. I think this phenomenon is also related to *COM. For the current example, however, this is not applicable. be checked. The voicing of each segment is checked by using feature geometry.</Paragraph> <Paragraph position="8"> (34) VOICE. Remaining candidates for /s tʰ a pʰ i tʰ/: s 'on', tʰ 'o', a 'n', pʰ 'co', i 'n', tʰ 'c'</Paragraph> <Paragraph position="10"> At last, the optimal set of Grid A candidates is determined, which is dispatched to Grid B (23).</Paragraph> <Paragraph position="11"> Before applying constraints, one-to-one matching occurs. That is, if the syllable position needs two segments, i.e., if it is 'on', candidates such as 'tʰɯ' with an epenthesized vowel will be selected; if it is not 'on', candidates such as 'tʰɯ' will be removed.</Paragraph> <Paragraph position="12"> which has a different \[F\] feature from that of the input as shown in (33) and (34): SCC checks the sequence of 'c' and 'o', and compares the sonority degree of the segments, usually for nasalization. Here, it is not applicable. CA-L(svf, σ) deletes an aspirated stop under 'c' or 'co' as shown below:</Paragraph> <Section position="1" start_page="4" end_page="4" type="sub_section"> <SectionTitle> 3.4 Contribution and Future Work </SectionTitle> <Paragraph position="0"> The current work is significant in that it analyzes and implements the generation of the syllabification of an IL, IL-K-E, in OT. 
It tries to generate not only syllable positions but also modified output segments.</Paragraph> <Paragraph position="1"> There are some Korean phonological phenomena that are not considered in the current system, such as palatalization and /l/-/r/ alternation. The next step is to deal with these phenomena.</Paragraph> <Paragraph position="2"> IL among L2 learners must differ from learner to learner, and it is not always the same even within the same person. The current system produces only one type of output based on the transfer of some Korean phonology. Further efforts will be made to generate several possible IL pronunciations according to different levels of proficiency.</Paragraph> </Section> </Section> </Paper>