<?xml version="1.0" standalone="yes"?>
<Paper uid="P89-1011">
  <Title>Table I</Title>
  <Section position="4" start_page="84" end_page="85" type="metho">
    <SectionTitle>
THE PRE-LEXICAL
PHONOLOGICAL REPRESENTATION
</SectionTitle>
    <Paragraph position="0"> Several researchers have argued that phonological processes, such as the palatalisation of /d/ in (1), create problems for the word recognition system because they 'distort' the phonological form of the word. Church (1987) and Frazier (1987) argue persuasively that, far from creating problems, such phonological processes provide important clues to the correct syllabic segmentation of the input and thus to the location of word boundaries. However, this argument only goes through on the assumption that quite detailed 'narrow' phonetic information is recovered from the signal, such as aspiration of /t/ in /rE/ and /tam/ in (1), in order to recognise the preceding syllable boundaries. It is only in terms of this representation that phonological processes can be recognised and their effects 'undone' in order to allow correct matching of the input against the canonical phonological representations contained in lexical entries.</Paragraph>
    <Paragraph position="1"> Other researchers (e.g. Shipman &amp; Zue, 1982) have argued (in the context of isolated word recognition) that the initial representation which contacts the lexicon should be a broad manner-class transcription of the stressed syllables in the speech signal. The evidence in favour of this approach is, firstly, that extraction of more detailed information is notoriously difficult and, secondly, that a broad transcription of this type appears to be very effective in partitioning the English lexicon into small cohorts. For example, Huttenlocher (1985) reports an average cohort size of 21 words for a 20,000 word lexicon using a six-category manner of articulation transcription scheme (employing the categories: Stop, Strong-Fricative, Weak-Fricative, Nasal, Glide-Liquid, and Vowel).</Paragraph>
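The partitioning idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the phoneme symbols, the phoneme-to-class table, and the toy lexicon are all invented here, and only the six category names come from the text. The average shown is the plain mean over classes, which may differ from the statistic used in the cited study.

```python
from collections import defaultdict

# Hypothetical phoneme-to-manner-class table (an assumption for illustration;
# only the six category names are taken from the text).
MANNER_CLASS = {
    "p": "Stop", "b": "Stop", "t": "Stop", "d": "Stop", "k": "Stop", "g": "Stop",
    "s": "Strong-Fricative", "z": "Strong-Fricative",
    "f": "Weak-Fricative", "v": "Weak-Fricative",
    "m": "Nasal", "n": "Nasal",
    "l": "Glide-Liquid", "r": "Glide-Liquid", "w": "Glide-Liquid",
    "a": "Vowel", "e": "Vowel", "i": "Vowel", "o": "Vowel", "u": "Vowel",
}


def broad_transcription(phonemes):
    """Map a phoneme sequence to its sequence of manner classes."""
    return tuple(MANNER_CLASS[p] for p in phonemes)


def partition_into_cohorts(lexicon):
    """Group words that share the same broad transcription."""
    cohorts = defaultdict(list)
    for word, phonemes in lexicon.items():
        cohorts[broad_transcription(phonemes)].append(word)
    return dict(cohorts)


def average_cohort_size(cohorts):
    """Plain mean of cohort sizes over all equivalence classes."""
    return sum(len(words) for words in cohorts.values()) / len(cohorts)


# Toy lexicon with invented one-symbol-per-phoneme transcriptions.
TOY_LEXICON = {
    "pat": ["p", "a", "t"],
    "bad": ["b", "a", "d"],
    "kit": ["k", "i", "t"],
    "man": ["m", "a", "n"],
    "sun": ["s", "u", "n"],
    "zen": ["z", "e", "n"],
}

if __name__ == "__main__":
    toy_cohorts = partition_into_cohorts(TOY_LEXICON)
    for classes, words in toy_cohorts.items():
        print(classes, words)
    print("average cohort size:", average_cohort_size(toy_cohorts))
```

With a real dictionary in place of the toy lexicon, the same grouping would show how finely a given transcription scheme splits the vocabulary.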
    <Paragraph position="2"> This claim suggests that the English lexicon is functionally organised to favour a system which initiates lexical access from a broad manner class pre-lexical representation, because most of the discriminatory information between different words is concentrated in the manner articulation of stressed syllables. Elsewhere, we have argued that these ideas are misleadingly presented and that there is, in fact, no significant advantage for manner information in stressed syllables (e.g. Carter et al., 1987; Carter, 1987, 1989). We found that there is no advantage per se to a manner class analysis of stressed syllables, since a similar analysis of unstressed syllables is as discriminatory and yields as good a partitioning of the English lexicon. However, concentrating on a full phonemic analysis of stressed syllables provides about 10% more information than a similar analysis of unstressed syllables. This research suggests, then, that the pre-lexical representation used to initiate lexical access can only afford to concentrate exclusively on stressed syllables if these are analysed (at least) phonemically. None of these studies consider the extractability of the classifications from speech input; however, whilst there is a general belief that it is easier to extract information from stressed portions of the signal, there is little reason to believe that manner class information is, in general, more or less accessible than other phonologically relevant features.</Paragraph>
    <Paragraph position="3"> A second argument which can be made against the use of broad representations to contact the lexicon (in the context of connected speech) is that such representations will not support the phonological parsing necessary to 'undo' such processes as palatalisation. For example, in (1) the final /d/ of did will be realised as /j/ and categorised as a strong-fricative followed by liquid-glide using the proposed broad manner transcription. Therefore, palatalisation will need to be recognised before the required stop-vowel-stop representation can be recovered and used to initiate lexical access. However, applying such phonological rules in a constrained and useful manner requires a more detailed input transcription. Palatalisation illustrates this point very clearly; not all sequences which will be transcribed as strong-fricative followed by liquid-glide can undergo this process by any means (e.g. /81/), but there will be no way of preventing the rule over-applying in many inappropriate contexts and thus presumably leading to the generation of many spurious word candidates.</Paragraph>
    <Paragraph position="4"> A third argument against the use of exclusively broad representations is that these representations will not support the effective recognition of syllable boundaries and some word boundaries on the basis of phonotactic and other phonological sequencing constraints. For example, Church (1987) proposes an initial syllabification of the input as a prerequisite to lexical access, but his syllabification of the speech input exploits phonotactic constraints and relies on the extraction of allophonic features, such as aspiration, to guide this process. Similarly, Harrington et al. (1988) argue that approximately 45% of word boundaries are, in principle, recognisable because they occur in phoneme sequences which are rare or forbidden word-internally.</Paragraph>
    <Paragraph position="5"> However, exploitation of these English phonological constraints would be considerably impaired if the pre-lexical representation of the input is restricted to a broad classification.</Paragraph>
    <Paragraph position="6"> It might seem self-evident that people are able to recognise phonemes in speech, but in fact the psychological evidence suggests that this ability is mediated by the output of the word recognition process rather than being an essential prerequisite to its success. Phoneme-monitoring experiments, in which subjects listen for specified phonemes in speech, are sensitive to lexical effects such as word frequency, semantic association, and so forth (see Cutler et al., 1987 for a summary of the experimental literature and putative explanation of the effect), suggesting that information concerning at least some of the phonetic content of a word is not available until after the word is recognised.</Paragraph>
    <Paragraph position="7"> Thus, people's ability to recognise phonemes tells us very little about the nature of the representation used to initiate lexical access. Better (but still indirect) evidence comes from mispronunciation monitoring and phoneme confusion experiments (Cole, 1973; Miller &amp; Nicely, 1955; Shepard, 1972) which suggest that listeners are likely to confuse or ~ phonemes along the dimensions predicted by distinctive feature theory. Most errors result in reporting phonemes which differ in only one feature from the target. This result suggests that listeners are actively considering detailed phonetic information along a number of dimensions (rather than simply, say, manner of articulation).</Paragraph>
    <Paragraph position="8"> Theoretical and experimental considerations suggest then that, regardless of the current capabilities of automated acoustic-phonetic front-ends, systems must be developed to extract as phonetically detailed a pre-lexical phonological representation as possible. Without such a representation, phonological processes cannot be effectively recognised and compensated for in the word recognition process and the 'extra' information conveyed in stressed syllables cannot be exploited. Nevertheless, in fluent connected speech, unstressed syllables often undergo phonological processes which render them highly indeterminate; for example, the vowel reductions in (1). Therefore, it is implausible to assume that any (human or machine) front-end will always output an accurate narrow phonetic, phonemic or perhaps even broad (say, manner class) transcription of the speech input. For this reason, further processes involved in lexical access will need to function effectively despite the very variable quality of information extracted from the speech signal.</Paragraph>
    <Paragraph position="9"> This last point creates a serious difficulty for the design of effective phonological parsers. Church (1987), for example, allows himself the idealisation of an accurate 'narrow' phonetic transcription. It remains to be demonstrated that any parsing techniques developed for determinate symbolic input will transfer effectively to real speech input (and such a test may have to await considerably better automated front-ends). For the purposes of the next section, I assume that some such account of phonological parsing can be developed and that the pre-lexical representation used to initiate lexical access is one in which phonological processes have been 'undone' in order to construct a representation close to the canonical (phonemic) representation of a word's pronunciation. However, I do not assume that this representation will necessarily be accurate to the same degree of detail throughout the input.</Paragraph>
  </Section>
  <Section position="5" start_page="85" end_page="87" type="metho">
    <SectionTitle>
LEXICAL ACCESS STRATEGIES
</SectionTitle>
    <Paragraph position="0"> Any theory of word recognition must provide a mechanism for the segmentation of connected speech into words. In effect, the theory must explain how the process of lexical access is triggered at appropriate points in the speech signal in the absence of completely reliable phonetic/phonological cues to word boundaries.</Paragraph>
    <Paragraph position="1"> The various theories of lexical access and word recognition in connected speech propose mechanisms which appear to cover the full spectrum of logical possibilities. Klatt (1979) suggests that lexical access is triggered off each successive spectral frame derived from the signal (i.e. approximately every 5 msecs.), McClelland &amp; Elman (1986) suggest each successive phoneme, Church (1987) suggests each syllable onset, Grosjean &amp; Gee (1987) suggest each stressed syllable onset, and Cutler &amp; Norris (1985) suggest each prosodically strong syllable onset. Finally, Marslen-Wilson &amp; Welsh (1978) suggest that segmentation of the speech input and recognition of word boundaries is an indivisible process in which the endpoint of the previous word defines the point at which lexical access is triggered again.</Paragraph>
    <Paragraph position="2"> Some of these access strategies have been evaluated with respect to three input transcriptions (which are plausible candidates for the pre-lexical representation on the basis of the work discussed in the previous section) in the context of a realistically sized lexicon. The experiment involved one sentence taken from a reading of the 'Rainbow passage' which had been analysed by several phoneticians for independent purposes. This sentence is reproduced in (2a) with the syllables which were judged to be strong by the phoneticians underlined.</Paragraph>
    <Paragraph position="3"> This utterance was transcribed: 1) fine class, using phonemic transcription throughout; 2) mid class, using phonemic transcription of strong syllables and a six-category manner of articulation transcription of weak syllables; 3) broad class, as mid class but suppressing voicing distinctions in the strong syllable transcriptions. (2b) gives the mid class transcription of the utterance. In this transcription, phonemes are represented in a manner compatible with the scheme employed in the Longman Dictionary of Contemporary English and the manner class categories in capitals are Stop, Strong-Fricative, Weak-Fricative, Nasal, Glide-liquid, and Vowel as in Huttenlocher (1982) and elsewhere. The terms fine, mid and broad, for each transcription scheme, are intended purely descriptively and are not necessarily related to other uses of these terms in the literature. Each of the schemes is intended to represent a possible behaviour of an acoustic-phonetic front-end. The less determinate transcriptions can be viewed either as the result of transcription errors and indeterminacies or as the output of a less ambitious front-end design. The definition of syllable boundary employed is, of necessity, that built into the syllable parser which acts as the interface to the dictionary database (e.g. Carter, 1989). The parser syllabifies phonemic transcriptions according to the phonotactic constraints given in Gimson (1980) and utilises the maximal onset principle (Selkirk, 1978) where this leads to ambiguity.</Paragraph>
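The three schemes can be read as one function with a switch. The following is a rough sketch only: the phoneme symbols, both lookup tables, and the toy word are assumptions made for illustration, and representing a collapsed voicing pair by its voiceless member is one possible reading of "suppressing voicing distinctions", not something the text specifies.

```python
# Hypothetical tables, assumed for illustration only.
MANNER_CLASS = {
    "p": "Stop", "b": "Stop", "t": "Stop", "d": "Stop",
    "s": "Strong-Fricative", "z": "Strong-Fricative",
    "f": "Weak-Fricative", "v": "Weak-Fricative",
    "m": "Nasal", "n": "Nasal",
    "l": "Glide-Liquid", "r": "Glide-Liquid",
    "a": "Vowel", "i": "Vowel", "o": "Vowel",
}
# Assumed convention: a voiced obstruent is written as its voiceless partner.
VOICELESS_PARTNER = {"b": "p", "d": "t", "z": "s", "v": "f"}


def transcribe(syllables, scheme):
    """Transcribe (is_strong, phonemes) syllables as fine, mid or broad class."""
    if scheme not in ("fine", "mid", "broad"):
        raise ValueError("unknown scheme: " + scheme)
    output = []
    for is_strong, phonemes in syllables:
        for phoneme in phonemes:
            if scheme == "fine":
                # Fine class: phonemic transcription throughout.
                output.append(phoneme)
            elif not is_strong:
                # Mid and broad class: weak syllables keep manner class only.
                output.append(MANNER_CLASS[phoneme])
            elif scheme == "mid":
                # Mid class: strong syllables stay phonemic.
                output.append(phoneme)
            else:
                # Broad class: strong syllables lose voicing distinctions.
                output.append(VOICELESS_PARTNER.get(phoneme, phoneme))
    return output


# Invented two-syllable word: a weak syllable followed by a strong one.
TOY_WORD = [(False, ["a"]), (True, ["b", "o", "d"])]

if __name__ == "__main__":
    for scheme in ("fine", "mid", "broad"):
        print(scheme, transcribe(TOY_WORD, scheme))
```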
    <Paragraph position="4"> Each of the three transcriptions was used as a putative pre-lexical representation to test some of the different access strategies, which were used to initiate lexical look-up into the dictionary database. The four access strategies which were tested were: 1) phoneme, using each successive phoneme to trigger an access attempt; 2) word, using the offset of the previous (correct) word in the input to control access attempts; 3) syllable, attempting look-up at each syllable boundary; 4) strong syllable, attempting look-up at each strong syllable boundary. That is, the first strategy assumes a word may begin at any phoneme boundary, the second that a word may only begin at the end of the previous one, the third that a word may begin at any syllable boundary, and the fourth that a word may begin at a strong syllable boundary.</Paragraph>
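Since the four strategies differ only in where a look-up may start, they can be sketched as one function that returns trigger positions. This is a simplified illustration, not the experimental system: the input format, the phoneme symbols, and the strong/weak marks in the toy utterance are assumptions, and the separate closed-class look-up that accompanies the strong syllable strategy is left out.

```python
STRATEGIES = ("phoneme", "word", "syllable", "strong syllable")


def trigger_points(utterance, strategy):
    """Return the phoneme positions at which a look-up attempt may start.

    `utterance` is a list of words; each word is a list of syllables; each
    syllable is a pair (is_strong, phonemes). This input format is assumed.
    """
    if strategy not in STRATEGIES:
        raise ValueError("unknown strategy: " + strategy)
    points = []
    position = 0
    for word in utterance:
        for syllable_index, (is_strong, phonemes) in enumerate(word):
            for phoneme_index in range(len(phonemes)):
                at_syllable_start = phoneme_index == 0
                at_word_start = at_syllable_start and syllable_index == 0
                if (
                    strategy == "phoneme"
                    or (strategy == "word" and at_word_start)
                    or (strategy == "syllable" and at_syllable_start)
                    or (strategy == "strong syllable"
                        and at_syllable_start and is_strong)
                ):
                    points.append(position)
                position += 1
    return points


# Invented three-word utterance; symbols and strength marks are illustrative.
TOY_UTTERANCE = [
    [(False, ["dh", "a"])],
    [(True, ["r", "ei", "n"]), (False, ["b", "o"])],
    [(False, ["i", "z"])],
]

if __name__ == "__main__":
    for strategy in STRATEGIES:
        print(strategy, trigger_points(TOY_UTTERANCE, strategy))
```

Note that the word strategy here reads word starts off the known segmentation, which mirrors the use of the previous (correct) word's offset in the experiment.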
    <Paragraph position="5"> The strong syllable strategy uses a separate look-up process for typically unstressed grammatical, closed-class vocabulary and allows the possibility of extending look-up 'backwards' over one preceding weak syllable. It was assumed, for the purposes of the experiment, that look-up off weak syllables would be restricted to closed-class vocabulary, would not extend into a strong syllable, and that this process would precede attempts to incorporate a weak syllable 'backwards' into an open-class word.</Paragraph>
    <Paragraph position="6"> The direct access approach was not considered because of its implausibility in the light of the discussion in the previous section. The stressed syllable account is very similar to the strong syllable approach, but given the problem of stress shift in fluent speech, a formulation in terms of strong syllables, which are defined in terms of the absence of vowel reduction, is preferable.</Paragraph>
    <Paragraph position="7"> Work by Marslen-Wilson and his colleagues (e.g.</Paragraph>
    <Paragraph position="8"> Marslen-Wilson &amp; Warren, 1987) suggests that, whatever access strategy is used, there is no delay in the availability of information derived from the speech signal to further select from the cohort of word candidates. This suggests that a model in which units (say syllables) of the pre-lexical representation are 'pre-packaged' and then used to trigger a look-up attempt is implausible. Rather, the look-up process must involve the continuous integration of information from the pre-lexical representation immediately it becomes available. Thus the question of access strategy concerns only the points at which this look-up process is initiated.</Paragraph>
    <Paragraph position="9"> In order to simulate the continuous aspect of lexical access using the dictionary database, database look-up queries for each strategy were initiated using the two phonemes/segments from the trigger point and then again with three phonemes/segments and so on until no further English words in the database were compatible with the look-up query (except for closed-class access with the strong syllable strategy where a strong syllable boundary terminated the sequence of accesses). The size of the resulting cohorts was measured for each successively larger query; for example, using a fine class transcription and triggering access from the /r/ of rainbow yields an initial cohort of 89 candidates compatible with /re//. This cohort drops to 12 words when /n/ is added and to 1 word when /b/ is also included and finally goes to 0 when the vowel of is is added. Each sequence of queries of this type which all begin at the same point in the signal will be referred to as an access path. The difference between the access strategies is mostly in the number of distinct access paths they generate.</Paragraph>
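An access path of this kind can be sketched as a loop over successively longer queries. This is a minimal illustration for fine class input only, under stated assumptions: the lexicon, the phoneme symbols, and the function name are invented, and a plain prefix comparison stands in for the database query mechanism, which the text does not describe in detail.

```python
def access_path(segments, start, lexicon):
    """Return (query, cohort_size) pairs for one access path.

    Queries begin with two segments from `start` and grow by one segment
    until no word in the lexicon is compatible with the query.
    """
    path = []
    for end in range(start + 2, len(segments) + 1):
        query = tuple(segments[start:end])
        # A word is compatible if its phonology begins with the query.
        cohort = [
            word for word, phonemes in lexicon.items()
            if tuple(phonemes[:len(query)]) == query
        ]
        path.append((query, len(cohort)))
        if not cohort:
            break
    return path


# Invented toy lexicon and input; the symbols are illustrative only.
TOY_LEXICON = {
    "ray": ["r", "ei"],
    "rate": ["r", "ei", "t"],
    "rain": ["r", "ei", "n"],
    "rainy": ["r", "ei", "n", "i"],
    "rainbow": ["r", "ei", "n", "b", "o"],
}
TOY_INPUT = ["r", "ei", "n", "b", "o", "i", "z"]

if __name__ == "__main__":
    for query, size in access_path(TOY_INPUT, 0, TOY_LEXICON):
        print("/" + " ".join(query) + "/", size)
```

Summing the cohort sizes over every access path that a strategy generates gives a count of partial matches of the kind the experiment records.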
    <Paragraph position="10"> Simulating access attempts using the dictionary database involves generating database queries consisting of partial phonological representations which return sets of words and entries which satisfy the query. For example, Figure 1 represents the query corresponding to the complete broad-class transcription of appoint. This query matches 37 word forms in the database.</Paragraph>
    <Paragraph position="11"> The experiment involved generating sequences of queries of this type and recording the number of words found in the database which matched each query. Figure 2 shows the partial word lattice for the mid class transcription of the rainbow is, using the strong syllable access strategy. In this lattice, access paths involving successively larger portions of the signal are illustrated. The number under each access attempt represents the size of the set of words whose phonology is compatible with the query. Lines preceded by an arrow indicate a query which forms part of an access path, adding a further segment to the query above it.</Paragraph>
    <Paragraph position="12"> The corresponding complete word lattice for the same portion of input using a mid-class transcription and the strong syllable strategy is shown in Figure 3. In this lattice, only words whose complete phonology is compatible with the input are shown.</Paragraph>
    <Paragraph position="13"> The different strategies were evaluated relative to the 3 transcription schemes by summing the total number of partial words matched for the test sentence under each strategy and transcription and also by looking at the total number of complete words matched.</Paragraph>
  </Section>
</Paper>