File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/94/j94-3001_abstr.xml

Size: 10,089 bytes

Last Modified: 2025-10-06 13:48:17

<?xml version="1.0" standalone="yes"?>
<Paper uid="J94-3001">
  <Title>Regular Models of Phonological Rule Systems</Title>
  <Section position="2" start_page="0" end_page="332" type="abstr">
    <SectionTitle>
1. Introduction
</SectionTitle>
    <Paragraph position="0"> Ordered sets of context-sensitive rewriting rules have traditionally been used to describe the pronunciation changes that occur when sounds appear in different phonological and morphological contexts. Intuitively, these phenomena ought to be cognitively and computationally simpler than the variations and correspondences that appear in natural language syntax and semantics, yet the formal structure of such rules seems to require a complicated interpreter and an extraordinarily large number of processing steps. In this paper, we show that any such rule defines a regular relation on strings if its non-contextual part is not allowed to apply to its own output, and thus it can be modeled by a symmetric finite-state transducer. Furthermore, since regular relations are closed under serial composition, a finite set of rules applying to each other's output in an ordered sequence also defines a regular relation. A single finite-state transducer whose behavior simulates the whole set can therefore be constructed by composing the transducers corresponding to the individual rules. This transducer can be incorporated into efficient computational procedures that are far more economical in both recognition and production than any strategies using ordered rules directly. Since orthographic rules have similar formal properties to phonological rules, our results generalize to problems of word recognition in written text.</Paragraph>
    <Paragraph position="1"> The mathematical techniques we develop to analyze rewriting rule systems are not limited just to that particular collection of formal devices. They can also be applied to other recently proposed phonological or morphological rule systems. For example, we can show that Koskenniemi's (1983) two-level parallel rule systems also denote regular relations. Section 2 below provides an intuitive grounding for the rest of our discussion by illustrating the correspondence between simple rewriting rules and transducers. Section 3 summarizes the mathematical tools that we use to analyze both rewriting and two-level systems. Section 4 describes the properties of the rewriting rule formalisms we are concerned with, and their mathematical characterization  According to these rules an underspecified, abstract nasal phoneme N appearing in the lexical forms iNpractical and iNtractable will be realized as the m in impractical and as the n in intractable. To ensure that these and only these results are obtained, the rules must be treated as obligatory and taken in the order given. As obligatory rules, they must be applied to every substring meeting their conditions. Otherwise, the abstract string iNpractical would be realized as inpractical and iNpractical as well as impractical, and the abstract N would not necessarily be removed from iNtractable.</Paragraph>
    <Paragraph position="2"> Ordering the rules means that the output of the first is taken as the input to the second.</Paragraph>
    <Paragraph position="3"> This prevents iNpractical from being converted to inpractical by Rule 2 without first considering Rule 1.</Paragraph>
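    <Paragraph> The ordered, obligatory application described above can be sketched in a few lines of Python. This is a minimal illustration, not the formalism developed in the paper. The statements of the two rules are assumptions reconstructed from the surrounding discussion (Rule 1 rewrites N as m before a labial, Rule 2 rewrites any remaining N as n), and the labial inventory and the function names rule1, rule2, and generate are hypothetical.

```python
# Assumed reconstruction of the two rules discussed in the text:
#   Rule 1: N -> m before a labial
#   Rule 2: N -> n everywhere else
LABIALS = set("pbm")  # assumption: a minimal labial inventory


def rule1(s: str) -> str:
    """Obligatory Rule 1: rewrite every N that precedes a labial as m."""
    out = []
    for i, ch in enumerate(s):
        following = s[i + 1:i + 2]  # empty string at the end of the word
        out.append("m" if ch == "N" and following in LABIALS else ch)
    return "".join(out)


def rule2(s: str) -> str:
    """Obligatory Rule 2: rewrite every remaining N as n."""
    return s.replace("N", "n")


def generate(lexical: str) -> str:
    """Ordered application: the output of Rule 1 is the input to Rule 2."""
    return rule2(rule1(lexical))


print(generate("iNpractical"))  # impractical
print(generate("iNtractable"))  # intractable
print(rule2("iNpractical"))     # inpractical: Rule 2 applied without Rule 1
```

Because each function rewrites every matching N, the rules are obligatory in the sense used above, and the nesting rule2(rule1(...)) encodes the ordering.</Paragraph>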
    <Paragraph position="4"> These obligatory rules always produce exactly one result from a given input. This is not the case when they are made to operate in the reverse direction. For example, if Rule 2 is inverted on the string intractable, there will be two results, intractable and iNtractable. This is because intractable is derivable by that rule from both of these strings. Of course, only the segments in iNtractable will eventually match against the lexicon, but in general both the N and n results of this inversion can figure in valid interpretations. Compare the words undecipherable and indecipherable. The n in the prefix un-, unlike the one in in-, does not derive from the abstract N, since it remains unchanged before labials (cf. unperturbable). Thus the results of inverting this rule must include undecipherable for undecipherable but iNdecipherable for indecipherable so that each of them can match properly against the lexicon.</Paragraph>
    <Paragraph position="5"> While inverting a rule may sometimes produce alternative outputs, there are also situations in which no output is produced. This happens when an obligatory rule is inverted on a string that it could not have generated. For example, iNput cannot be generated by Rule 1 because the N precedes a labial and therefore would obligatorily be converted to m. There is therefore no output when Rule 1 is inverted on iNput.</Paragraph>
    <Paragraph position="6"> However, when Rule 2 is inverted on input, it does produce iNput as one of its results.</Paragraph>
    <Paragraph position="7"> The effect of then inverting Rule 1 is to remove the ambiguity produced by inverting Rule 2, leaving only the unchanged input to be matched against the lexicon. More generally, if recognition is carried out by taking the rules of a grammar in reverse order and inverting each of them in turn, later rules in the new sequence act as filters on ambiguities produced by earlier ones.</Paragraph>
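    <Paragraph> The inversion behavior discussed above, including the way a later inverted rule filters ambiguities introduced by an earlier one, can be illustrated with a brute-force sketch. It assumes the same reconstructed rules (Rule 1: N becomes m before a labial; Rule 2: any remaining N becomes n); the helper names invert and recognize are hypothetical, and the exhaustive enumeration of candidates is chosen only for clarity, not efficiency.

```python
from itertools import product

LABIALS = set("pbm")  # assumption: a minimal labial inventory


def rule1(s):
    """Assumed Rule 1: N -> m before a labial (obligatory)."""
    return "".join(
        "m" if ch == "N" and s[i + 1:i + 2] in LABIALS else ch
        for i, ch in enumerate(s)
    )


def rule2(s):
    """Assumed Rule 2: every remaining N -> n (obligatory)."""
    return s.replace("N", "n")


def invert(rule, surface, target):
    """Return every string that `rule` maps to `surface`.

    Each occurrence of `target` may have been an underlying N, so all
    combinations are tried and only those the rule really maps to
    `surface` are kept.
    """
    options = [(ch, "N") if ch == target else (ch,) for ch in surface]
    candidates = ("".join(chars) for chars in product(*options))
    return {c for c in candidates if rule(c) == surface}


def recognize(surface):
    """Invert the rules in reverse order: Rule 2 first, then Rule 1."""
    intermediate = invert(rule2, surface, "n")
    return {u for mid in intermediate for u in invert(rule1, mid, "m")}


print(sorted(invert(rule2, "intractable", "n")))  # ['iNtractable', 'intractable']
print(invert(rule1, "iNput", "m"))                # set(): no possible source
print(recognize("input"))                         # {'input'}
```

Inverting Rule 2 on input yields both input and iNput, and inverting Rule 1 then discards iNput, which is the filtering effect described in the text.</Paragraph>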
    <Paragraph position="8"> The existence of a large class of ambiguities that are introduced at one point in the recognition process and eliminated at another has been a major source of difficulty in efficiently reversing the action of linguistically motivated phonological grammars. In a large grammar, the effect of these spurious ambiguities is multiplicative, since the information needed to cut off unproductive paths often does not become available until after they have been pursued for some considerable distance. Indeed, speech understanding systems that use phonological rules do not typically invert them on strings but rather apply them to the lexicon to generate a list of all possible word forms (e.g.</Paragraph>
    <Paragraph position="9"> Woods et al. 1976; Klatt 1980). Recognition is then accomplished by standard table-lookup procedures, usually augmented with special devices to handle phonological changes that operate across word boundaries. Another approach to solving this computational problem would be to use the reversed cascade of rules during recognition, but to somehow make the filtering information of particular rules available earlier in the process. However, no general and effective techniques have been proposed for doing this.</Paragraph>
    <!-- Running head: Ronald M. Kaplan and Martin Kay, Regular Models of Phonological Rule Systems -->
    <Paragraph position="10"> The more radical approach that we explore in this paper is to eliminate the cascade altogether, representing the information in the grammar as a whole in a single more unified device, namely, a finite-state transducer. This device is constructed in two phases. The first is to create for each rule in the grammar a transducer that exactly models its behavior. The second is to compose these individual rule transducers into a single machine that models the grammar as a whole.</Paragraph>
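    <Paragraph> The two-phase construction can be sketched as follows, under simplifying assumptions that are not taken from the paper: each rule is modeled as a deterministic transducer given by a step function and an end-of-input flush function, and composition pairs the states of the two machines and feeds the output of the first through the second. The Transducer type, the functions run and compose, and the reconstructed rules (Rule 1: N becomes m before a labial; Rule 2: any remaining N becomes n) are all hypothetical. The transducers treated in the paper are more general, since they must also support inversion.

```python
from typing import Callable, Hashable, NamedTuple

LABIALS = set("pbm")  # assumption: a minimal labial inventory


class Transducer(NamedTuple):
    start: Hashable
    step: Callable   # (state, symbol) -> (output string, next state)
    flush: Callable  # state -> output emitted at end of input


def run(t, s):
    """Apply transducer t to string s and return the output string."""
    state, out = t.start, []
    for ch in s:
        emitted, state = t.step(state, ch)
        out.append(emitted)
    out.append(t.flush(state))
    return "".join(out)


# Phase 1a. Assumed Rule 1: the state records whether an N is pending,
# because its realization depends on the next symbol.
def step1(pending, ch):
    if ch == "N":
        return ("N" if pending else "", True)
    if pending:
        return (("m" if ch in LABIALS else "N") + ch, False)
    return (ch, False)


rule1 = Transducer(False, step1, lambda pending: "N" if pending else "")

# Phase 1b. Assumed Rule 2: a one-state machine mapping N to n.
rule2 = Transducer(0, lambda q, ch: ("n" if ch == "N" else ch, q), lambda q: "")


# Phase 2. Compose: states are pairs, and whatever the first machine
# emits is consumed immediately by the second.
def compose(a, b):
    def feed_b(qb, text):
        out = []
        for ch in text:
            emitted, qb = b.step(qb, ch)
            out.append(emitted)
        return "".join(out), qb

    def step(state, ch):
        qa, qb = state
        mid, qa = a.step(qa, ch)
        out, qb = feed_b(qb, mid)
        return out, (qa, qb)

    def flush(state):
        qa, qb = state
        out, qb = feed_b(qb, a.flush(qa))
        return out + b.flush(qb)

    return Transducer((a.start, b.start), step, flush)


grammar = compose(rule1, rule2)
print(run(grammar, "iNpractical"))  # impractical
print(run(grammar, "iNtractable"))  # intractable
```

The composed machine reads the lexical string once and never materializes the intermediate string that a cascade of separately applied rules would produce.</Paragraph>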
    <Paragraph position="11"> Johnson (1972) was the first to notice that the noncyclic components of standard phonological formalisms, and particularly the formalism of The Sound Pattern of English (Chomsky and Halle 1968), were equivalent in power to finite-state devices despite a superficial resemblance to general rewriting systems. Phonologists in the SPE tradition, as well as the structuralists that preceded them, had apparently honored an injunction against rules that rewrite their own output but still allowed the output of a rule to serve as context for a reapplication of that same rule. Johnson realized that this was the key to limiting the power of systems of phonological rules. He also realized that basic rewriting rules were subject to many alternative modes of application offering different expressive possibilities to the linguist. He showed that phonological grammars under most reasonable modes of application remain within the finite-state paradigm.</Paragraph>
    <Paragraph position="12"> We observed independently the basic connections between rewriting-rule grammars and finite-state transducers in the late 1970s and reported them at the 1981 meeting of the Linguistic Society of America (Kaplan and Kay 1981). The mathematical analysis in terms of regular relations emerged somewhat later. Aspects of that analysis and its extension to two-level systems were presented at conferences by Kaplan (1984, 1985, 1988), in courses at the 1987 and 1991 Linguistics Institutes, and at colloquia at Stanford University, Brown University, the University of Rochester, and the University of Helsinki.</Paragraph>
    <Paragraph position="13"> Our approach differs from Johnson's in two important ways. First, we abstract away from the many details of both notation and machine description that are crucial to Johnson's method of argumentation. Instead, we rely strongly on closure properties in the underlying algebra of regular relations to establish the major result that phonological rewriting systems denote such sets of string-pairs. We then use the correspondence between regular relations and finite-state transducers to develop a constructive relationship between rewriting rules and transducers. This is accomplished by means of a small set of simple operations, each of which implements a simple mathematical fact about regular languages, regular relations, or both. Second, our more abstract perspective provides a general framework within which to treat other phonological formalisms, existing or yet to be devised. For example, two-level morphology (Koskenniemi 1983), which evolved from our early considerations of rewriting rules, relies for its analysis and implementation on the same algebraic techniques. We are also encouraged by initial successes in adapting these techniques to the autosegmental formalism described by Kay (1987).</Paragraph>
  </Section>
</Paper>