<?xml version="1.0" standalone="yes"?>
<Paper uid="J79-1060">
  <Title>PITCH CONTOUR GENERATION IN SPEECH SYNTHESIS: A JUNCTION GRAMMAR APPROACH</Title>
  <Section position="14" start_page="69" end_page="69" type="concl">
    <SectionTitle>
NUCLEAR-INITIAL SYLLABLE NUCLEAR-FINAL SYLLABLE
</SectionTitle>
    <Paragraph position="0"> Recent research provides useful criteria for deciding when to use each type.</Paragraph>
    <Paragraph position="1"> Bell-Berti and Harris report that: The effects of the terminal consonant on the midpoint of the stressed vowel are not as large as those of the initial consonant. In other words, the carryover effect of the first consonant on the stressed vowel is larger than the anticipatory effect on the second.</Paragraph>
    <Paragraph position="2"> For the purposes of this discussion, let us assume that stressed syllables and syllables with strong vowels are nuclear-initial and that other syllables are nuclear-final. It is possible, of course, to formulate junction rules which are not binary, so that a third syllable type whose nucleus was equally joined to both initial and final delimiters could be used. We avoid this formal complication, however, until forced to introduce it by empirical considerations.</Paragraph>
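    <Paragraph> As an illustrative sketch only, and not part of the original formulation, the working assumption above can be written as a small Python function. The names Syllable, stressed, strong_vowel, and syllable_type are hypothetical and introduced here purely for illustration; the possible third, non-binary syllable type is deliberately left out, as in the text.

```python
from dataclasses import dataclass


@dataclass
class Syllable:
    """Hypothetical record for one syllable; the field names are assumptions."""
    text: str
    stressed: bool = False
    strong_vowel: bool = False


def syllable_type(syl: Syllable) -> str:
    """Apply the working assumption: stressed syllables and syllables with
    strong vowels are nuclear-initial; all other syllables are nuclear-final."""
    if syl.stressed or syl.strong_vowel:
        return "nuclear-initial"
    return "nuclear-final"
```

A binary return value mirrors the binary junction rules assumed here; a third value would only be needed if the empirical considerations mentioned above forced it.</Paragraph>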
    <Paragraph position="3"> Notice that the use of structure to represent syllables makes it unnecessary to use a feature such as [+syllabic]. In comparing the use of this feature to that of the structural notation proposed, we note that each appears to make distinct claims about the notion syllable.</Paragraph>
    <Paragraph position="4"> Specifically, the feature asserts that a vowel is syllabic, whereas the tree claims that specific sequences of segmentals constitute syllables whose nuclear element is a particular segment.</Paragraph>
    <Section position="1" start_page="69" end_page="69" type="sub_section">
      <SectionTitle>
Node Labels
</SectionTitle>
      <Paragraph position="0"> Turning now to the matter of node labels, we observe that in practice it is desirable to further subcategorize D and W in terms of more specific articulation classes.</Paragraph>
      <Paragraph position="1"> We therefore define D to include obstruent consonants (C), liquids (L), glides (G), and null. For W, vowels (V) and liquids (L) are indicated, and perhaps in some cases even continuant obstruents, assuming that expressions such as vocative &quot;pssst&quot; are to be analyzed as syllables also. We note parenthetically that glides (G) are suspect, since they appear to be functional variants of vowels, i.e., vowels functioning delimitively. This, however, is not a problem, since the use of J-rules to represent articulatory structures makes it just as feasible to consonantalize a vowel by rule as it is in the semantic component to nominalize a verb by rule. In short, the use of junction trees to represent articulatory structure brings a great deal of descriptive power to bear, should we need it.</Paragraph>
      <Paragraph position="2"> Thus we supplant D and W with more descriptively specific node labels and append to them some element of their respective vocabularies as terminal units, as illustrated by Figure 7.</Paragraph>
      <Paragraph position="3"> Figure 7.</Paragraph>
      <Paragraph position="4"> The significance of V2 and V3 as non-terminal labels is that of semi-syllable and syllable, respectively. Bear in mind that the operation symbols appearing between operands are representative of the articulatory junctions (transitions) between them. Hence non-terminal nodes symbolize articulatory sequences consisting of the phonemes they dominate plus the transitions necessary to account for continuous movement from one distinctive vocal tract state to the next. This signifies, in effect, that given a junction instruction of the form X O Y = Z, there exists a transition T = O(X,Y), such that XTY is a continuous articulatory sequence Z consisting of the distinctive units X and Y mediated by transitional T. This aspect of the formulation is advanced as an attempt to satisfy the need for phonological notation potentially capable of explicating both the discrete segmental elements of which the speech chain is composed, and the co-articulatory transitions which connect them in live speech.</Paragraph>
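      <Paragraph> The junction instruction X O Y = Z can be made concrete with a minimal Python sketch, offered under stated assumptions rather than as the authors' implementation. Segment, Junction, and realize are hypothetical names; the operator strings are arbitrary placeholders; and labelling each transition by the operator together with the two segments that end up adjacent is a simplifying assumption, since the phonetic content of a transition is not modelled here.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Segment:
    """A distinctive unit (terminal), e.g. a phoneme symbol."""
    symbol: str


@dataclass(frozen=True)
class Junction:
    """Non-terminal Z produced by a junction instruction X O Y = Z."""
    op: str
    left: "Node"
    right: "Node"


Node = Union[Segment, Junction]


def realize(node: Node) -> list:
    """Flatten a tree into the sequence X, T, Y, where T = O(X, Y) is a
    placeholder label for the transition between the two operands."""
    if isinstance(node, Segment):
        return [node.symbol]
    left = realize(node.left)
    right = realize(node.right)
    # Assumption: the transition is named after the operator and the two
    # segments that become adjacent when the operands are joined.
    transition = f"{node.op}({left[-1]},{right[0]})"
    return left + [transition] + right
```

For example, realize(Junction("*", Segment("b"), Segment("a"))) returns ['b', '*(b,a)', 'a'], so every junction in a tree contributes exactly one transition to the realized sequence, however deeply it is nested.</Paragraph>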
      <Paragraph position="5"> The practical effect of the formulation is that one's attention is drawn not to a relatively limited set of radical phonological changes, but to the co-articulatory effect of every junction on its operands, regardless of its subtlety.  Both initial and final syllable delimiters frequently consist of clusters of segments rather than discrete segments.</Paragraph>
      <Paragraph position="6"> An analysis of such clusters shows that notable assimilative forces are involved.</Paragraph>
      <Paragraph position="7"> We view this as a form of articulatory subordination, and, consequently, use subjunction as the basic junction type for treating such clusters. The fact that articulation trees are capable of showing a variety of compositional arrangements makes it possible to assign such clusters whatever internal structure seems to be operative.</Paragraph>
      <Paragraph position="8"> Thus for strand, where tr seems to be more closely associated than st, this can be explicitly represented.</Paragraph>
    </Section>
    <Section position="2" start_page="69" end_page="69" type="sub_section">
      <SectionTitle>
Multi-syllable Words
</SectionTitle>
      <Paragraph position="0"> Let us now consider how multi-syllable words may be given in the form of articulation trees. The procedure, briefly, is as follows, using Bambi and Donna as the words to be diagrammed:  An interjunction is constructed using syllable-final and syllable-initial constituents. (The label node is given as C since b seems to exert assimilative force over m.) (4) The label node of the subjunction attaches to the more heavily stressed syllable.</Paragraph>
      <Paragraph position="1"> (5) The initial delimiter of the more weakly stressed syllable becomes the intersect node.</Paragraph>
      <Paragraph position="2">  An interesting result of the notation is that stress is no longer a property of vowels, but of entire syllables, i.e. the delimiters and the vowel. Further, stress reflects a relation between constituents, so that no features expressing stress values are necessary.</Paragraph>
      <Paragraph position="3"> Phrases. Phrases are diagrammed by introducing prosodic constituents (B) to which word-trees are subordinated.</Paragraph>
      <Paragraph position="4"> (Refer to Figure 5.) The ranking syllable, i.e. the one receiving primary stress, joins to the prosodic constituent. The notation is intended to reflect the simultaneous execution of segmental and supra-segmental units during the articulatory process, in a way comparable to the multitudinous internal manipulations of an engine as one turns a crank. The crank of the articulatory apparatus is the diaphragm and other musculature which provide energy and assume other symbolically significant states at certain intervals during the execution of the segmentals. Prosodic constituents result in the specific intonational contours we hear superimposed over syllables, words, and phrases.</Paragraph>
      <Paragraph position="4"> While both segmental and suprasegmental constituents are coded in the context of semantic data, we emphasize again that A-trees contain only articulatory data. Thus, if A-trees are compared to the customary representations of generative phonology, as typified by those given by Chomsky and Halle (compare Figures 5 and 9), it will be noted that the syntacto-semantic superstructure of the regular trees is replaced by an articulatory superstructure in the A-trees. This departure from standard practice is motivated not only by the requirement imposed by the theory (that data types not be intermingled), but also by the observation that the regular trees tend to neglect prosodic articulatory phenomena. When information relating to these phenomena is incorporated into articulation trees, it replaces the usual superstructure of S's, NP's, and other similar labels in a natural way. The prosodic constituents thus introduced are comparable in their function to the intonation contours associated by rule with segmental sequences in the system proposed by Leben. The proposed system of phonological description makes possible an interesting hypothesis regarding many of the features used in current descriptions. Specifically, if A-trees are in some sense a reflection of actual articulatory processes, then phonological representations which do not use trees will consist of an intermixture of functional and categorial labels (features). For example, if trees are used to represent the relations between subject, verb, and object, it is not necessary to label the subject as such or the object as such, since structural relations make these notions explicit. If trees were not used to represent sentence structure, however, functional labels would have to be used.</Paragraph>
      <Paragraph position="5"> Similarly, it follows that if trees are an appropriate medium for phonological description, but have not been used, then functional and categorial information are intermingled in current descriptions. If this is true, then it should be possible to abstract functional information away (and consequently not write it in feature form) by elaborating A-tree notation.</Paragraph>
      <Paragraph position="6"> While the proposed system is still in its infancy, so to speak, some interesting initial observations in this regard can be made at this time. First, major category features become node labels in a natural way, thus suggesting why the formal illusion exists that a change, for example, of [+cons] → [-cons] is equal in magnitude to a change of [+voice] → [-voice]. Second, [±syllabic] ([±consonantal] and [±vocalic] are also used in some systems) are functional labels and need not be written if syllables are given as tree structures. Third, stress at the segmental level and unmarked pitch at the prosodic level become implicit in structure in terms of the rank of operands in articulatory subjunction and need not be specified by feature.</Paragraph>
      <Paragraph position="7"> While it is beyond the scope of this paper to elaborate this point further, it is without doubt the most interesting and provocative consequence of the research to date.</Paragraph>
    </Section>
  </Section>
</Paper>