<?xml version="1.0" standalone="yes"?>
<Paper uid="W98-0902">
  <Title>Computing Declarative Prosodic Morphology</Title>
  <Section position="4" start_page="11" end_page="13" type="metho">
    <SectionTitle>
2 Declarative Prosodic Morphology
</SectionTitle>
    <Paragraph position="0"> Focussing on cases of 'nonconcatenative' root-and-pattern morphology, Declarative Prosodic Morphology (DPM) starts with an intuition that is opposite to what the traditional idea of templates or fixed phonological shapes (McCarthy 1979) suggests, namely that shape variance is actually quite common and should form the analytical basis for theoretical accounts of PM. Besides the Tonkawa case (fig. 1), shape variance is also at work in Modern Hebrew (MH) inflected verb forms (Glinert 1989), see fig.</Paragraph>
    <Paragraph position="1">  stem vowels, depending on the affixation pattern.</Paragraph>
    <Paragraph position="2"> This results in three stem shapes CVCVC, CVCC and CCVC. Any analysis that simply stipulates shape selection on the basis of specific inflectional categories or phonological context (e.g. 3sg.f ∨ 3pl or -V → CVCC stem / B1 past) misses the fact that the shapes, their alternating behaviour and their proper selection are derivable. Derivational repairs by means of 'doubly open syllable' syncope rules (/ga.ma.r-a./ → /.gam.ra./) are similarly ad hoc.</Paragraph>
    <Paragraph position="3"> * A first step in developing an alternative DPM analysis of MH verbs is to explicitly recognize alternation of an element X with zero - informally written (X) - as a serious formal device besides its function as a piece of merely descriptive notation (cf. Hudson 1986 for an earlier application to Arabic). In contrast to nonmonotonic deletion or epenthesis, (X) is a surface-true declarative expression (Bird 1995, 93f.). Footnote 1: Regular MH verbs are traditionally divided into seven verbal classes or binyanim, B1-B7. Except for B4 and B6, which regularly act as passive counterparts of B3 and B5, the semantic contribution of each class is no longer transparent in the modern language. Also, in many cases the root (written √C1.C2.C3) is restricted to an idiosyncratic subset of the binyanim.</Paragraph>
    <Paragraph position="4"> Footnote 1 (continued): An a-templatic treatment of MH prosodic morphology was first proposed by Bat-El (1989, 40ff.) within an unformalized, non-surface-true, non-constraint-based setting.</Paragraph>
    <Paragraph position="5"> The reader is reminded that DP sees grammar expressions as partial formal descriptions of sets of phonological objects. The former reside on a different ontological level from the latter, in contrast to traditional object-to-object transformations on the same level. Hence a preliminary grammar expression g(V1)m(V2)r for a Hebrew stem (with abstract stem vowels) denotes the set {gmr, gV1mr, gmV2r, gV1mV2r}. Note that the (X) property as attributed to segmental positions is distinctive - in contrast to stem vowels, root segments do not normally alternate with zero, and neither do affix segments, in an important asymmetry with stems. This point is reinforced by the exceptions that do exist: phonologically unpredictable C/∅ alternation occurs in some MH stems, e.g. natan/lakax 'he gave/took' vs. ji-ten/ji-kax 'he will give/take'; by surface-true (n/l) encoding we can avoid diacritical solutions here.</Paragraph>
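    <Paragraph><![CDATA[The relation between a description with (X) positions and the set of objects it denotes can be made concrete with a small enumeration. The following is only an illustrative Python sketch, not part of the MicroCUF grammar; the function name denotation and the (segment, optional) pair encoding are assumptions introduced here.

```python
from itertools import product

def denotation(description):
    """Enumerate the strings described by a sequence of (segment, optional) pairs.

    An optional position contributes either nothing (the zero alternant)
    or its segment; an obligatory position always contributes its segment.
    """
    choices = [("", seg) if optional else (seg,) for seg, optional in description]
    return {"".join(combo) for combo in product(*choices)}

# g(V1)m(V2)r with abstract stem vowels V1 and V2
stem = [("g", False), ("V1", True), ("m", False), ("V2", True), ("r", False)]
forms = denotation(stem)
print(sorted(forms))
```

A single description thus stands for four surface strings of different length, which is the property the subsequent steps rely on.]]></Paragraph>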
    <Paragraph position="6"> * Step two uses concatenation to combine individual descriptions of stems and affixes, besides connecting segmental positions within these linguistic entities. Since, as we have just seen, a single description can denote several objects of varying surface string length, concatenation (^) at the description level is actually powerful enough to describe 'nonconcatenative' morphological phenomena. In DPM these do not receive independent ontological status (cf. Bird &amp; Klein 1990 and Gafos 1995 for other formal and articulatory-phonological arguments leading to the same conclusion). A more detailed description of the 3pl.fut. inflected form of √g.m.r might therefore be j^i^g^(V1)^m^(V2)^r^u. In order to allow for paradigmatic generalizations (footnote 2) over independent entities such as root and stem vowel pattern within concatenated descriptions, a hierarchical lexicon conception based on multiple inheritance of named abstractions can be used (cf. Riehemann 1993).</Paragraph>
    <Paragraph position="7"> * Step three conjoins a word form description with declarative syllabification and syllable structure constraints in order to impose prosodic well-formedness conditions. For Modern Hebrew (and Tonkawa), the syllable canon is basically CV(C).</Paragraph>
    <Paragraph position="8"> Expressed in prosodic terms, complex codas and onsets are banned, while an onset must precede each syllable nucleus. These syllable roles are established in the first place by syllabification constraints that exploit local sonority differences between successive segments (Walther 1993). Footnote 2: See Walther (1997) for a discussion of various ways to derive rather than stipulate the syntagmatic pattern of alternating and non-alternating segmental positions within stems.</Paragraph>
    <Paragraph position="9"> Altogether, the ensemble of prosodic constraints indeed succeeds in narrowing down the set for the 3sg.m past tense form to {*.gmr., *.gamr., *.gmar., .ga.mar.} = /gamar/. For 3pl. future tense B1, however, an unresolved ambiguity remains: in {.jig.me.ru., .ji.gam.ru.}, only the first element is grammatical (footnote 3). An important observation is that in general there can be no purely phonological constraint to disambiguate this type of situation.</Paragraph>
    <Paragraph position="10"> The reason lies in the existence of minimal pairs with different category. In our case, homophonous /.ji.gam.ru./ is grammatical as 3pl. fut. B2 'they will be finished'. We will return to the analysis of such cases after proposing a specific disambiguation mechanism in the next step.</Paragraph>
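    <Paragraph><![CDATA[How a CV(C) syllable canon narrows down the denoted set can be sketched as follows. This is a deliberately crude Python approximation: it replaces the sonority-based syllabification constraints by a regular expression over a CV skeleton and treats the glide j as a plain consonant. The names well_formed and candidates are hypothetical.

```python
import re
from itertools import product

VOWELS = set("aeiou")  # simplification: the glide j counts as a consonant here

def well_formed(word):
    """True if the word parses exhaustively into CV(C) syllables."""
    skeleton = "".join("V" if ch in VOWELS else "C" for ch in word)
    return re.fullmatch(r"(?:CVC?)+", skeleton) is not None

def candidates(description):
    """All strings denoted by a sequence of (segment, optional) pairs."""
    choices = [("", seg) if optional else (seg,) for seg, optional in description]
    return sorted("".join(combo) for combo in product(*choices))

# 3sg.m past of g.m.r with both stem vowels alternating: g(a)m(a)r
past = [("g", False), ("a", True), ("m", False), ("a", True), ("r", False)]
survivors = [w for w in candidates(past) if well_formed(w)]
print(survivors)
```

Under these assumptions only gamar survives for the past tense form, whereas both jigmeru and jigamru pass the same check, so the canon alone cannot choose between the two future tense candidates.]]></Paragraph>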
    <Paragraph position="11"> * Step four eliminates the remaining ambiguity by invoking an Incremental Optimization Principle (IOP): &quot;For all (X) elements, prefer the zero alternant as early as possible&quot;. &quot;Early&quot; corresponds to traditional left-to-right directionality, but is meant to be understood w.r.t. the speech production time arrow. &quot;As possible&quot; means that IOP application to an (X) position nevertheless realizes X if its omission would lead to a constraint conflict. Hence, the IOP correctly rules out the second element of {.jig.me.ru., *.ji.gam.ru.}. This is because .ji.gam.ru. represents a missed chance to leave out /a/, the earlier one of the two stem vowels. The reader may verify that the IOP as it stands also accounts for the Tonkawa data of fig. 1. Tonkawa lends even clearer support to IOP's left-to-right nature due to the larger number of V/∅ vowels involved. As a limiting case, the IOP predicts the possibility of vowelless surface stems, e.g. formed by two root consonants combined with vowel-final prefix and suffix.</Paragraph>
    <Paragraph position="12"> This prediction is strikingly confirmed by MH forms like te-lx-i 'you (sg.f.) will go' √(h).l.x, ti-kn-u 'you/they (pl.) will buy' √k.n.∅, ti-tn-i 'you (sg.f.) will give' √(n).t.n; similar cases exist in Tigrinya.</Paragraph>
    <Paragraph position="13"> There can be no meaningful prosodic characterization of isolated CC stem shapes; only a wordform-based theory like the present one may explain why these forms exist.</Paragraph>
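    <Paragraph><![CDATA[One way to picture the IOP procedurally is a left-to-right search that tries the zero alternant first at every (X) position and falls back to realizing X only when omission cannot be completed without a constraint conflict. The Python sketch below makes several assumptions of its own: the constraint system is reduced to a CV(C) check on the finished string, the glide j is treated as a consonant, and the stem vowels used for ti-kn-u are placeholders. The names iop and well_formed are hypothetical.

```python
import re

VOWELS = set("aeiou")

def well_formed(word):
    """Crude stand-in for the prosodic constraints: exhaustive CV(C) parse."""
    skeleton = "".join("V" if ch in VOWELS else "C" for ch in word)
    return re.fullmatch(r"(?:CVC?)+", skeleton) is not None

def iop(description, prefix=""):
    """Return the first well-formed realization, trying omission before realization."""
    if not description:
        return prefix if well_formed(prefix) else None
    (seg, optional), rest = description[0], description[1:]
    options = ["", seg] if optional else [seg]
    for option in options:
        result = iop(rest, prefix + option)
        if result is not None:
            return result
    return None

F, T = False, True
fut_3pl = [("j", F), ("i", F), ("g", F), ("a", T), ("m", F), ("e", T), ("r", F), ("u", F)]
past_3sg = [("g", F), ("a", T), ("m", F), ("a", T), ("r", F)]
# two root consonants between vowel-final prefix and suffix; stem vowels are placeholders
buy_fut = [("t", F), ("i", F), ("k", F), ("a", T), ("n", F), ("e", T), ("u", F)]

print(iop(fut_3pl), iop(past_3sg), iop(buy_fut))
```

With these inputs the search settles on jigmeru, gamar and the vowelless-stem form tiknu. The search stops at the first consistent pattern instead of ranking all candidates, which mirrors the left-to-right character of the principle.]]></Paragraph>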
    <Paragraph position="14"> Note that, conceptually, IOP is piggybacked on autonomous DP-style constraint interaction. It merely filters the small finite set of objects described by the conjunction of all constraints. Footnote 3: Note that the prosodic view explains the pronounced influence of (C)V affixes on the shape of the whole word: they provide a nonalternating syllable nucleus which can host adjacent stem consonants.</Paragraph>
    <Paragraph position="15"> From another angle, IOP can be seen as a single context-free substitute for the various syncope rules employed in former transformational analyses. The claim is that fixed-directionality-IOP is the only such mechanism needed to account for PM phenomena.</Paragraph>
    <Paragraph position="16"> A distinguishing feature of the IOP is its potential for an economical procedural implementation in incremental production. If constraint contexts are sufficiently local, the principle can locally decide over (X) nonrealizations and there will be very limited backtracking through delayed detection of constraint violation. Because the IOP stops after finding the first (X) realization pattern that violates no constraints, it has less formal power than global optimization which must always consider all candidates.</Paragraph>
    <Paragraph position="17"> Moreover, the IOP supports economic communication, as it leads to shortest surface forms wherever possible. Finally, at least for root-and-pattern morphologies it can be argued to aid in speech perception as well. This is because the closed class of stem vowel patterns is less informative than open-class root segments. Since IOP-guided vowel omission causes root segments to (statistically) appear at an earlier point in time from the acoustic onset of the word, the IOP hypothesis actively prunes the size of the cohort of competing lexical candidates.</Paragraph>
    <Paragraph position="18"> As a result, unambiguous recognition will generally be achieved more quickly during continuous lexical access. In sum, the IOP hypothesis not only possesses overall psycholinguistic plausibility but actually gives some processing advantage to shape variance. If future research provides the necessary experimental confirmation, we have yet another case of performance shaping competence.</Paragraph>
    <Paragraph position="19"> * Step five returns to the minimal pairs problem highlighted in step three: what to do with anti-IOP realizations such as that of /a/ in /.ji.gam.ru./ for B2 fut.? The answer is (prosodic) prespecification. A surface-true constraint demands that B2 future and infinitive as well as all of B3, B4 must have an onset role for the first stem element. Thus, the possibility of IOP eliminating the first stem vowel is blocked by the constraint inconsistency that arises for the first stem element: either syllabification licenses an incompatible coda or first and second stem segment together form an ill-formed onset cluster. Note that if the constraint is lexicalized as part of the grammatical description of first stem position, it will have a maximally local context, referring to just the position itself. In general, DPM analyses pay much attention to proper attachment sites of constraints in order to maximize their locality.</Paragraph>
    <Paragraph position="20"> The MH verbal suffix -et (fem.sg.pres.) illustrates that sometimes another, segmental mode of prespecification is useful. This suffix is always preceded by a syllable ending in /e/, although IOP application alone would e.g. prefer */gom.ret/ over /go.me.ret/ 'she finishes'. The effect is morpheme-specific since other -VC suffixes behave as expected here: gomrim/ot 'they (masc./fem.) finish'. One solution is to let part of the suffix definition be a constraint statement which demands that the segment two positions to its left must be a front vowel. This move captures both the stability and the quality of this vowel at the same time. (Apophony constraints ensure that the second stem vowel is never /i/ except in B5, which significantly has a different suffix -a in place of -et). Note that prespecifying the presuffixal segment to be in an onset position would not work.</Paragraph>
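    <Paragraph><![CDATA[The effect of such a suffix-borne constraint can be pictured with a short Python sketch. It assumes that the stem realizations are already listed in IOP order (vowel omitted first) and leaves the syllable constraints out entirely; realize and front_vowel_two_left are hypothetical names, and the treatment of /e/ and /i/ as the front vowels is an assumption of the sketch.

```python
FRONT_VOWELS = set("ei")

def front_vowel_two_left(stem):
    """Segmental prespecification: two positions left of the suffix sits a front vowel."""
    return len(stem) >= 2 and stem[-2] in FRONT_VOWELS

def realize(stem_variants, suffix, constraints=()):
    """Pick the first stem variant (in IOP order) that satisfies all suffix constraints."""
    for stem in stem_variants:
        if all(check(stem) for check in constraints):
            return stem + suffix
    return None

variants = ["gomr", "gomer"]  # zero alternant of the second stem vowel comes first

with_constraint = realize(variants, "et", [front_vowel_two_left])
without_constraint = realize(variants, "et")
plural = realize(variants, "im")
print(with_constraint, without_constraint, plural)
```

Attaching the constraint to -et alone yields gomeret while leaving gomrim untouched, which is the morpheme-specific behaviour described above.]]></Paragraph>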
  </Section>
  <Section position="5" start_page="13" end_page="89" type="metho">
    <SectionTitle>
3 On implementing analyses
</SectionTitle>
    <Paragraph position="0"> In the following I show how to implement a toy fragment of MH verbs using the MicroCUF formalism, a typed, feature-based constraint-logic programming language suitable for natural language modelling.</Paragraph>
    <Paragraph position="1"> MicroCUF implements a subset of CUF (Dörre &amp; Dorna 1993), inheriting its formal semantics. It was initially developed by the author to overcome efficiency problems with CUF's original type system.</Paragraph>
    <Paragraph position="2"> Additionally, its simpler implementation provides an open platform for experimental modifications, as needed e.g. for parsing and generation with DPM.</Paragraph>
    <Paragraph position="3"> After briefly introducing the essentials of MicroCUF first, the MH analysis is developed and explained.</Paragraph>
    <Section position="1" start_page="13" end_page="14" type="sub_section">
      <SectionTitle>
3.1 The MicroCUF constraint formalism
</SectionTitle>
      <Paragraph position="0"> This section assumes a basic knowledge of Prolog.</Paragraph>
      <Paragraph position="1"> As in Prolog, MicroCUF variables start with upper-case letters or _ , whereas relational symbols, features and simplex types start in lowercase; % marks a comment (fig. 3a). Relations like member are written in functional notation, with a notationally distinguished result argument on the righthand side of := and the relation symbol plus its (optional) arguments on the lefthand side. Subgoals like member(Elem) can occur anywhere as subterms. Instead of Prolog's fixed-arity first order terms, MicroCUF has typed feature terms as its basic data structures. As illustrated in fig. 3b, subterms are explicitly conjoined with &amp; or disjunctively combined with ;, while only type terms may be prefixed by the negation operator ~. Features like left, cat are separated from their righthand value terms by :. Terms may be tagged by conjunction with a variable (V1), allowing for the expression of structure sharing through multiple occurrences of the same variable. Feature appropriateness declarations (::) ensure that both the term in which a feature occurs and its value are typed. For comparison, the result value of fs appears in HPSG-style notation under fig. 3c.</Paragraph>
      <Paragraph position="3"> fs:=cat:(~((b2;b3;b5)&amp;past)&amp;V1)&amp;left:cat:V1.</Paragraph>
      <Paragraph position="4"> phonlist::[cat:categories].</Paragraph>
    </Section>
    <Section position="2" start_page="14" end_page="89" type="sub_section">
      <SectionTitle>
3.2 Modern Hebrew verbs in MicroCUF
</SectionTitle>
      <Paragraph position="0"> Below I present a concrete MicroCUF grammar in successive pieces. It encodes a toy fragment of MH verbs and represents a simplified excerpt from a much larger computational grammar. For lack of space, the type hierarchy - specifying syllable roles, segments, morphological categories and word-peripheral position - and the definition of syllabify (formalized in Walther 1995) have been omitted.</Paragraph>
      <Paragraph position="1"> Let us start the explanation with a basic concatenation relation which adds a position Self in front of some string of Segments (1-6).</Paragraph>
      <Paragraph position="2">  Here, the familiar recursive first-rest encoding of lists translates into self-right features. This alone makes self and (arbitrarily long) right-context references possible. To support looking one or more segmental positions to the left - a frequent situation in phonological contexts - we supplement it with a new feature left to yield bidirectional lists. For this doubly-linked list encoding to be well-behaved, a step right followed by a step left is constrained to return to the same position Self (3), thus yielding cyclic feature structures. Next, the value of the feature cat at the current position is connected with its right neighbour (3-4). In the face of our recursively structured lists this makes morphological and other global categorial information locally accessible at each segmental position. Finally, relations to incrementally classify each segmental position as word-initial, medial or word-final and to impose prosodic constraints are added in (5-6).</Paragraph>
      <Paragraph position="3"> x_0(_, Segments) := Segments.
x_0(X, Segments) := mark:marked &amp; conc(X, Segments).</Paragraph>
      <Paragraph position="4"> obl(X, Segments) := mark:unmarked &amp; conc(X, Segments).</Paragraph>
      <Paragraph position="6"> The first clause of x_0 (10) realizes the zero alternant by equating in its second argument the Segments to follow with the result argument; the first argument holding X is unused. It gets used in the second clause (11-12), however, where it is prefixed to the following Segments by ordinary concatenation. The value of an additional feature mark specifies that realizing an X position is marked w.r.t. the IOP, whereas no such value is prescribed in the first clause. Instead, the marking there will be supplied later by adjacent instances of either the second x_0 clause or obl (14-15). The latter is the version of concatenation used for specifying obligatory, i.e. nonalternating positions, which consequently are specified as unmarked. Altogether these means yield fully specified strings w.r.t. markedness information. We will see below how this simplifies an implementation of the IOP.</Paragraph>
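      <Paragraph><![CDATA[The division of labour between x_0 and obl can be mimicked outside MicroCUF. The Python sketch below is only an analogy: relational nondeterminism is modelled by returning a list of alternative continuations, and each realized position is a (segment, mark) pair. The Python functions x_0 and obl borrow the relation names but are otherwise assumptions of this sketch.

```python
def obl(x, continuations):
    """Obligatory position: prefix x, tagged unmarked, to every continuation."""
    return [[(x, "unmarked")] + rest for rest in continuations]

def x_0(x, continuations):
    """Alternating position: zero alternant first, then the marked realization."""
    zero = [list(rest) for rest in continuations]
    realized = [[(x, "marked")] + rest for rest in continuations]
    return zero + realized

# g(a)m(a)r, built from the right edge inwards; [[]] is the single empty continuation
stem = obl("g", x_0("a", obl("m", x_0("a", obl("r", [[]])))))

for candidate in stem:
    print("".join(seg for seg, _ in candidate), [mark for _, mark in candidate])
```

Every candidate produced this way carries a complete markedness vector, so a later optimization step only has to compare those vectors.]]></Paragraph>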
      <Paragraph position="7"> As can be seen in the accessor relation is (17), phonological segments are actually embedded under a further feature seg. This treatment enables structure-sharing of segments independent of their syllable roles.</Paragraph>
      <Paragraph position="8"> The syllable shape constraint (18-25) shows first of all that syllable roles are modelled as types under self.</Paragraph>
      <Paragraph position="9">  Lines (19-20) capture the fact that syllable nuclei in MH are always vowels and that every syllable nucleus is preceded by an onset. In (21-22) a nonnuclear position that is an onset may only license preceding non-onsets, thus disallowing complex onsets; similarly for codas in (23). In (27) generic syllabify is intersected with shape, since segmental positions must be prosodified and conform to language-specific shape restrictions.</Paragraph>
      <Paragraph position="10"> The constraints under (28-30), included for completeness, merely ensure proper termination of segmental strings at the word periphery.</Paragraph>
      <Paragraph position="11">  We proceed in (37-41) with a rudimentary definition of first (v1) and second (v2) stem vowel which is sufficient for our toy fragment.</Paragraph>
      <Paragraph position="13"> The larger grammar mentioned above contains a full binary decision tree for each vowel. Still, even here one can see the use of type formulae like round &amp; '-hi' to classify segments phonologically.</Paragraph>
      <Paragraph position="14"> Next come a number of exemplary inflectional affixes (42-79), again simplified. The zero affixes (42-45, 47-54) are phonologically just like the zero alternant in (10) in taking up no segmental space.</Paragraph>
      <Paragraph position="16"> The segmental content of all other affixes is specified via possibly repeated instances of obl, since affixes are nonalternating. Apart from the respective categorial information, positional type information '+ini', '+fin' ensures that prefixes and suffixes are properly restricted to word-initial and word-final position. Note that the glide-initial ji- prefix specifies an initial /i/ (58) which will be prosodified as onset by means of syllabify. This representational assumption is in line with other recent work in phonological theory which standardly analyzes glides as nonsyllabic high vowels. Hence, even in MH we have a case where segmental classes and prosodic roles don't align perfectly.</Paragraph>
      <Paragraph position="17"> To control second stem vowel apophony, some suffixes demand (53,73) or forbid (51) front vowels two positions to their left.</Paragraph>
      <Paragraph position="19"> Others posit the weaker demand vowel → front (62,67,78), thus not forbidding consonantal fillings of the position addressed by left:left.</Paragraph>
      <Paragraph position="20"> The stem definition (80-82) for a regular triliteral is parametrized for the three root segments and the inflectional Suffixes to follow.</Paragraph>
      <Paragraph position="22"> Given the informal description in section 2, the succession of obligatory root and alternating stem vowel positions now looks familiar. It should be obvious how to devise analogous stem definitions for quadriliterals (e.g. mixfev) and cluster verbs (e.g. flirtet).</Paragraph>
      <Paragraph position="24"> A rather simple tabulation of affixes lists (a subset of) the allowable prefix-suffix cooccurrences in the MH verbal paradigm (84-88) before everything is put together in the definition for verbform, parametrized for a list of root segments and Category (90-95). Note how prosodic_prespecification is intersected with stem in (93-94), exploiting the power of the description level to restrict stem realizations without diacritical marking of stem vs affix domains on the object level. The subgoal root_letter_tree (92) will be discussed below.</Paragraph>
      <Paragraph position="25"> When proving a goal like verbform([g,m,r], b1&amp;third&amp;pl&amp;fut), the MicroCUF interpreter will enumerate the set of all candidate result feature structures, including one that describes the grammatical surface string jigmeru. An implementation of the IOP, to be described next, must therefore complement the setup established so far to exclude the suboptimal candidates. While the subtle intertwining of zero alternant preference and constraint solving described above has its theoretical merits, a much simpler practical solution was devised. In a first step, the small finite set of all candidate solutions for a goal is collected, together with numerical 'disharmony' values representing each candidate's degree of optimality. Disharmony is defined as the binary number that results from application of the mapping {unmarked ↦ 01_2, marked ↦ 10_2} to the left-to-right markedness vector of a segmental string: e.g., j_01 i_01 g_01 a_10 m_01 r_01 u_01 yields the disharmony value 01010110010101_2 = 5525_10 &gt; 5477_10 = 01010101100101_2 from j_01 i_01 g_01 m_01 e_10 r_01 u_01. Step two is a straightforward search for the candidate(s) with minimal disharmony.</Paragraph>
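      <Paragraph><![CDATA[The two-step procedure is easy to reproduce with ordinary integer arithmetic. The following Python sketch recomputes the two disharmony values given in the text and then selects the minimum; the helper names tag and disharmony and the (segment, mark) pair encoding are assumptions of the sketch, not taken from the MicroCUF implementation.

```python
CODE = {"unmarked": "01", "marked": "10"}

def tag(word, marked_index):
    """Attach markedness to each segment; exactly one position is marked here."""
    return [(seg, "marked" if i == marked_index else "unmarked")
            for i, seg in enumerate(word)]

def disharmony(candidate):
    """Read the left-to-right markedness vector as a binary number."""
    bits = "".join(CODE[mark] for _, mark in candidate)
    return int(bits, 2)

jigamru = tag("jigamru", 3)  # realizes the first stem vowel /a/
jigmeru = tag("jigmeru", 4)  # realizes the second stem vowel /e/

best = min([jigamru, jigmeru], key=disharmony)
print(disharmony(jigamru), disharmony(jigmeru), "".join(seg for seg, _ in best))
```

Because an earlier marked position contributes a higher-order bit, a smaller number corresponds to omitting vowels as early as possible, so the minimum picks out jigmeru.]]></Paragraph>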
    </Section>
  </Section>
</Paper>