<?xml version="1.0" standalone="yes"?> <Paper uid="P81-1005"> <Title>REPRESENTATION</Title>
<Section position="4" start_page="23" end_page="24" type="metho"> <SectionTitle> THE PROBLEM OF MULTIPLE SOLUTIONS </SectionTitle>
<Paragraph position="0"> It should be pointed out that most often several combinations of underlying representations and phonological rules can be used to derive the same pronunciations.</Paragraph>
<Paragraph position="1"> This could happen in several ways. It could be unclear what the UR is, and different URs together with different rules could derive the same pronunciations, i.e. the directionality of the rule could be unclear.</Paragraph>
<Paragraph position="2"> The difference in the pronunciation of the last segment of this morpheme, d vs. t, is called an alternation. Given this alternation, one could make two hypotheses. One could hypothesize that the UR is /ad/ and that there is a rule which changes d to t when it occurs at the end of a word, or one could hypothesize that the UR is /at/ and that there is a rule which changes t to d between a's. Also, some phenomena could be explained either by a single more general rule or by several more specific rules.</Paragraph>
<Paragraph position="3"> Generally, there are two approaches that could be taken to deal with the problem of multiple possible solutions. One could attempt to impose restrictions on what may constitute a valid solution, or one could use an evaluation procedure to decide in cases of multiple possible solutions. One could also use both of these approaches, in which case the more restrictions are imposed, the less evaluation is necessary. The original single evaluation criterion - 'simplicity', as manifested in the number of feature specifications used - has not proved workable. Also, no particular proposed restrictions have been embraced by the vast majority of phonologists.</Paragraph>
<Paragraph position="4"> Individual phonologists are generally guided in their evaluations of solutions, i.e. sets of rules and URs, by various criteria. The weighting of these criteria is left open.</Paragraph>
<Paragraph position="5"> In this connection the 'codifying function' of the development of expert systems is particularly relevant, i.e. in order to be put into a program the criteria must be formalized and weighted \[5\]. Although it has sometimes been claimed that no set of discovery procedures can be sufficient to produce phonological analyses, this program is intended to demonstrate the feasibility of a procedural definition of the theory.</Paragraph>
<Paragraph position="6"> The three most widely used criteria and the manner in which they are embedded in PHONY will now be discussed.</Paragraph>
<Section position="1" start_page="23" end_page="24" type="sub_section"> <SectionTitle> Phonological Predictability </SectionTitle>
<Paragraph position="0"> This involves a preference for solutions based on phonological environment over those which make reference to morphological or lexical categories or which involve the division of the lexicon into arbitrary classes. In other words, in doing phonological analysis the categories or meanings of morphemes will not be considered unless no solution can be found based on just the sounds or sound sequences involved. This criterion is embodied in PHONY, since no information about morphological or syntactic categories is available to PHONY. If PHONY cannot handle an alternation by reference to phonological environment, it will return that this is an 'interesting case'.
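To make the d ~ t ambiguity just described concrete, the following minimal Python sketch derives the same two pronunciations from either hypothesis; the surface forms, the hypothetical suffix -a, and the rule functions are invented for this illustration and are not part of PHONY.

```python
# Illustrative sketch of the d ~ t ambiguity: two different UR/rule
# pairings derive identical surface forms.  The forms, suffix, and
# function names below are invented for this example, not part of PHONY.

def final_devoicing(ur: str) -> str:
    """Hypothesis A: UR ends in /d/; d becomes t at the end of a word."""
    return ur[:-1] + "t" if ur.endswith("d") else ur

def intervocalic_voicing(ur: str) -> str:
    """Hypothesis B: UR ends in /t/; t becomes d between a's."""
    return ur.replace("ata", "ada")

# The morpheme in isolation and before a hypothetical suffix -a.
hypothesis_a = [final_devoicing("ad"), final_devoicing("ad" + "a")]
hypothesis_b = [intervocalic_voicing("at"), intervocalic_voicing("at" + "a")]

assert hypothesis_a == hypothesis_b == ["at", "ada"]
print("Both hypotheses yield the same pronunciations:", hypothesis_a)
```

Since both hypotheses cover the data equally well, something beyond the data itself (restrictions on valid solutions or an evaluation procedure) is needed to choose between them.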
The ability to identify the 'interesting cases' is a most valuable one, since these are often the cases that lead to theory modification. It should be mentioned that PHONY could readily be extended (Extension 1) to handle a certain range of syntactically or morphologically triggered phonological rules. This would involve including in the input information about the syntactic category and, where relevant, the morphological category of the constituent morphemes. This information would be ignored unless PHONY was unable to produce a solution, i.e. would otherwise have returned &quot;interesting cases&quot;. It would then search for generalizations based on these categories.</Paragraph>
<Paragraph position="1"> Naturalness
This involves the use of knowledge about which processes are 'natural' to decide between alternate solutions, i.e. solutions involving natural processes are preferred.</Paragraph>
<Paragraph position="2"> A process found in many languages is judged to be 'natural'. Although natural processes are often phonetically plausible, this is not always the case. It should be mentioned that 'naturalness' is not only an arbiter in cases of several possible solutions, but also a heuristic to lead the investigator to plausible hypotheses which he can pursue. PHONY contains a catalogue of natural processes. When an alternation looks as if it might be the result of one of these processes, the entire input corpus of strings is tested to see if this hypothesis is valid.</Paragraph>
<Paragraph position="3"> Simplicity
'Simplicity' was mentioned above; while it is no longer the only criterion, it is still a primary one. It is reflected in PHONY in a series of attempts to make rules more general, i.e. to combine several hypothesized rules into a single hypothesized rule. The more general rules require fewer feature specifications. Also, the smaller number of rules can lead to a reduced number of feature specifications.</Paragraph>
<Paragraph position="4"> The various proposed constraints on what can be valid solutions generally correlate with differences in the testing process of PHONY. Most of these involve differences in allowable orderings of rules (e.g.</Paragraph>
<Paragraph position="5"> 'unrestricted extrinsic ordering', 'free reapplication', 'direct mapping'; cf. \[3\]). At present PHONY's testing process involves checking whether hypothesized rules hold, i.e. do not have counterexamples, in the phonetic representations (such a criterion disallows opacity of type 1; cf. \[4\]). PHONY could be extended (Extension 2) to allow the user to choose from several of the proposed constraints. This would involve using different testing functions. This extension would allow analyses of the same data under different constraints to be compared easily.
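As a rough illustration of the testing step just described, i.e. checking that a hypothesized rule has no counterexamples in the phonetic representations, the sketch below scans a toy corpus for a banned segment in a given left/right context. The corpus, the single-symbol context encoding, and the function name are assumptions made for this example; PHONY's actual representations use feature matrices rather than raw symbols.

```python
# Sketch of a counterexample check: a surface-true constraint of the form
# "segment `banned` never occurs in the context left __ right".  Data and
# encoding are hypothetical, chosen only to illustrate the idea.

def rule_holds(corpus, banned, left, right):
    """Return True if `banned` never appears between `left` and `right`."""
    for form in corpus:
        for i, seg in enumerate(form):
            if seg != banned:
                continue
            l = form[i - 1] if i > 0 else "#"          # '#' marks a word edge
            r = form[i + 1] if i + 1 < len(form) else "#"
            if l == left and r == right:
                return False                            # counterexample found
    return True

corpus = ["at", "ada", "tada", "data"]
# "no d word-finally after a" holds; "no t between a's" fails (see "data").
print(rule_holds(corpus, banned="d", left="a", right="#"))   # True
print(rule_holds(corpus, banned="t", left="a", right="a"))   # False
```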
Additionally, new constraints could be added and tested.</Paragraph> </Section> </Section>
<Section position="5" start_page="24" end_page="24" type="metho"> <SectionTitle> STRUCTURE OF PHONY </SectionTitle>
<Paragraph position="0"> PHONY can be divided into three major parts: ALTFINDER, NATMATCH, and RULERED.</Paragraph> </Section>
<Section position="6" start_page="24" end_page="25" type="metho"> <SectionTitle> ALTFINDER </SectionTitle>
<Paragraph position="0"> ALTFINDER takes the input string of phonetic symbols and indices indicating instances of the same morpheme, as in (3), and returns for each morpheme in turn a representation including the non-alternating segments and a list of alternations with the contexts in which each alternant occurs, for example, for morpheme 1, as in (9).</Paragraph>
<Paragraph position="2"> This process involves comparing in turn each instance of a given key morpheme with the current hypothesized underlying representation for that morpheme, and for each case of alternation storing in N groups the different context strings in which the N alternants occur. The comparison is complicated by the common processes of epenthesis (insertion of a segment) and elision (deletion of a segment), and occasionally by the much more rarely occurring metathesis (interchange in the positions of two segments). These processes are illustrated in (10).</Paragraph>
<Paragraph position="3"> Therefore, in cases where the segments being compared are not identical, it is necessary to ascertain whether they are variants of a single underlying segment or whether one of these processes has applied. The possibilities are illustrated in (11).</Paragraph>
<Paragraph position="4"> (11) Given two pronunciations of the same
The criteria used to decide between these relationships are (a) the degree of similarity in each of the conceivable associations, and (b) a measure of the similarity of the rest of the strings for each of the conceivable associations.</Paragraph>
<Paragraph position="5"> ALTFINDER yields a list of alternations based on segments, as in (9). This is then converted into a list of alternations based on features.</Paragraph>
<Paragraph position="6"> Since the alternants of the former must differ by at least one feature, the new list must contain at least as many alternations, and normally contains more. Where previously for each alternation in a segment there was a list of strings where each alternant occurred, now for each alternation in a feature there are two lists - one with the strings where a positive value for that feature occurred and the other with those where a negative value occurred. It should be noted that the elements of these lists, i.e. the strings, together with the alternating feature, its value, and an indication of which segment in the string contains the feature, are all potentially rules. They bear the same information as standard phonological rules. Compare the representations in (13); these are for the alternations in morpheme 5 in (3).</Paragraph>
<Paragraph position="8"> These correspond to the rules t -> d / # a + a # and d -> t / # a #, i.e., respectively, one cannot pronounce t in the environment # a + a # but rather must pronounce d, and one cannot pronounce d in the environment # a # but rather must pronounce t. The latter rule and the second representation (both without the initial two segments, in the interests of space) in
It is often the case that one or both of these potential 'rules' will be valid, i.e. would be generalizations that would hold over the pronunciations represented in the input.
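The conversion from segment-based to feature-based alternations, with one list of contexts for each value of the alternating feature, can be sketched roughly as follows. The two-feature table, the context strings (with '_' marking the alternating segment), and the function name are invented for this illustration; PHONY's feature system and string representation are richer.

```python
# Sketch: convert a segment-based alternation (d ~ t) into feature-based
# alternations with positive/negative context lists, as described above.
# Feature values and contexts are hypothetical illustrations.

FEATURES = {            # a toy feature table: segment -> {feature: value}
    "d": {"VOICE": True,  "CORONAL": True},
    "t": {"VOICE": False, "CORONAL": True},
}

def featurize(alternants_with_contexts):
    """alternants_with_contexts: {segment: [context strings]}."""
    by_feature = {}
    for seg, contexts in alternants_with_contexts.items():
        for feat, value in FEATURES[seg].items():
            slot = by_feature.setdefault(feat, {"positive": [], "negative": []})
            slot["positive" if value else "negative"].extend(contexts)
    # keep only features whose value actually alternates
    return {f: s for f, s in by_feature.items() if s["positive"] and s["negative"]}

# e.g. a morpheme pronounced with d before the suffix a and with t word-finally
alternation = {"d": ["# a _ + a #"], "t": ["# a _ #"]}
print(featurize(alternation))
# {'VOICE': {'positive': ['# a _ + a #'], 'negative': ['# a _ #']}}
```

Each entry of the result, a feature, a value, and the contexts in which that value occurs, carries the same information as a small phonological rule, which is what the surrounding text means by calling these list elements "potential rules".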
These 'rules' would, however, be much less general than those which are found in phonological analyses. It is assumed that speakers/hearers/language learners can and do generalize from these specific cases to form more general rules. If this were not the case, how could speakers correctly pronounce morphemes in new environments?</Paragraph>
<Paragraph position="9"> Within the theory the criterion of simplicity is sensitive to these generalizations in that such generalizations reduce the number of feature specifications. Within PHONY the preference for more general rules is manifested by continually trying to generate and test more general rules resulting from the coalescing or combining of two or more specific rules.</Paragraph>
<Paragraph position="10"> Recall that the representation of the segments involved a feature matrix with positive or negative specifications for each feature. In order to generate more general rules this representation is modified to two matrices for each segment - one representing those features which must be positive in the environment and the other those features which must be negative. The generalization process involves taking the 'greatest common denominator' (GCD) of the positive and negative values of the segments of the environments of two separate 'rules' (a schematic sketch of this operation is given below). In the interests of space an abbreviated example of the GCD operation is given in (15).</Paragraph>
<Paragraph position="12"> The GCD operation has generated a more general rule. If the original two rules are a manifestation of a more general rule, the generalized rule must not involve or make reference to the initial segment of the former rule. Notice also that in the GCD the VOICE feature does not have to be positive or negative; if the two original rules are a manifestation of a single rule, the specification of the VOICE feature in the alternating segment must not be relevant.</Paragraph>
<Paragraph position="13"> NATMATCH
After the alternations in terms of segments that were output by ALTFINDER have been changed into alternations in terms of features (12), and after these have been transformed from single matrices into double matrices, the resulting &quot;rules&quot; are sent to NATMATCH. NATMATCH compares these &quot;rules&quot; with the data base of common phonological processes. This involves pattern matching.</Paragraph>
<Paragraph position="14"> If a match occurs, the entire input corpus is tested to establish whether this rule or constraint is valid for this language. If Extension 2 were implemented, this testing process would differ for the different versions of the theory. If the validity can be established, the underlying representation for the morpheme is adjusted and the rule is added to the list of established rules. Common processes in the data base are organized by the feature which is alternating, and among those processes involving the alternation of a given feature the most common process is listed, and thus tested, first. If it can be shown to be valid, it is added to the list of established rules. It should be mentioned that ALTFINDER makes use of this list, and if an alternation that it discovers can be handled by an established rule, the tentative underlying representation is adjusted accordingly and the alternation need not be passed on to the rest of the program.
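A minimal sketch of the GCD operation referred to above, assuming each environment segment is represented as a pair of feature sets (those that must be positive and those that must be negative): the GCD keeps only the requirements that the corresponding segments of the two rules share. The feature names and rule contents are hypothetical, and the alignment of environments of unequal length is glossed over.

```python
# Sketch of the 'greatest common denominator' of two rules whose environment
# segments are represented as (must-be-positive, must-be-negative) feature
# sets.  The intersection keeps only requirements both rules agree on.
# Feature names below are hypothetical.

def gcd_segment(seg_a, seg_b):
    pos_a, neg_a = seg_a
    pos_b, neg_b = seg_b
    return (pos_a & pos_b, neg_a & neg_b)

def gcd_rule(env_a, env_b):
    """Environments are equal-length lists of (positive-set, negative-set)."""
    return [gcd_segment(a, b) for a, b in zip(env_a, env_b)]

rule1 = [({"SYLLABIC", "VOICE"}, {"HIGH"}), ({"CORONAL"}, {"VOICE"})]
rule2 = [({"SYLLABIC"}, {"HIGH", "ROUND"}), ({"CORONAL", "ANTERIOR"}, set())]

print(gcd_rule(rule1, rule2))
# [({'SYLLABIC'}, {'HIGH'}), ({'CORONAL'}, set())]
```

Mirroring the remark about VOICE above, a feature on which the two rules do not agree simply drops out of the intersection and is left unspecified in the generalized rule.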
If within NATMATCH no matches are found in the data base, or if the validity of the matches cannot be established, the alternation is added to the list of those as yet not accounted for.</Paragraph> </Section>
<Section position="7" start_page="25" end_page="26" type="metho"> <SectionTitle> RULERED </SectionTitle>
<Paragraph position="0"> RULERED takes the generated &quot;rules&quot; that have not been established. It establishes which of these are valid and takes GCDs to generalize them as much as possible. This is done by going through all the rules involving a certain feature and generating the minimal number of equivalence classes of &quot;rules&quot; and combined (GCDed) &quot;rules&quot; which are valid. The resulting generalized rules have the largest matrices, i.e. the largest set of feature specifications, which all the forms undergoing these rules have in common.</Paragraph>
<Paragraph position="1"> However, the elimination of some of these feature specifications might still result in valid rules. The rules with minimal matrices, i.e. the minimal number of feature specifications (recall the &quot;simplicity&quot; criterion), might be termed lowest common denominators (LCDs). These are produced by attempting in turn to eliminate each segment in a GCDed rule; the new rule is generated and tested, and if it is valid the segment is left out, otherwise it remains. Then an attempt is made to eliminate in turn each feature specification in the remaining segments, again generating and testing. Finally, all the established rules are combined, where possible, according to the many abbreviatory conventions of Generative Phonology (cf.</Paragraph>
<Paragraph position="2"> \[2\]). This is done on the basis of the formal properties of the rules. For example, if two generated rules are identical except that one has an additional segment not present in the other, these can be combined into a single rule; parentheses allow the inclusion of optional segments in the environment of a rule. In addition, all the rules generated above involve a change of only a single feature specification. If there are several rules which are identical except that a different feature specification is changed, i.e. the two changes occur in the same environment, they can be combined into a single rule: in this particular environment both specifications change.</Paragraph> </Section> </Paper>
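Finally, the generate-and-test reduction that RULERED performs on a GCDed rule, first trying to drop whole environment segments and then individual feature specifications, can be sketched as below. The encoding is the same hypothetical (positive-set, negative-set) pairing used in the earlier sketches, and is_valid stands in for PHONY's corpus-wide test; the toy predicate in the example is invented purely for illustration.

```python
# Sketch of RULERED's reduction toward a 'lowest common denominator' rule:
# first try to drop whole environment segments, then individual feature
# specifications, keeping a deletion only if the reduced rule still tests
# as valid.  `is_valid` stands in for PHONY's corpus test; the encoding
# (positive-set, negative-set) per segment is hypothetical.

def prune_rule(env, is_valid):
    # Pass 1: try to eliminate whole environment segments.
    i = 0
    while i < len(env):
        trial = env[:i] + env[i + 1:]
        if is_valid(trial):
            env = trial              # segment was not needed
        else:
            i += 1                   # segment is essential; keep it
    # Pass 2: try to eliminate individual feature specifications in place.
    for pos, neg in env:
        for feature_set in (pos, neg):
            for feat in sorted(feature_set):       # sorted() copies, so safe
                feature_set.discard(feat)
                if not is_valid(env):
                    feature_set.add(feat)          # needed after all; restore
    return env

# Toy example: only the SYLLABIC requirement on the first segment matters.
def toy_valid(env):
    return len(env) >= 1 and "SYLLABIC" in env[0][0]

full = [({"SYLLABIC"}, {"HIGH"}), ({"CORONAL"}, set())]
print(prune_rule(full, toy_valid))
# [({'SYLLABIC'}, set())]
```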