<?xml version="1.0" standalone="yes"?> <Paper uid="J80-1001"> <Title>Cascaded ATN Grammars</Title> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 4. Cascaded ATN's </SectionTitle> <Paragraph position="0"> The advantages of having semantic and pragmatic information available at early stages of parsing natural language sentences have been demonstrated in a variety of systems.1 Ways of achieving such close interaction between syntax and semantics have traditionally involved writing semantic interpretation rules in 1-1 correspondence with phrase structure rules (e.g., Thompson, 1963), writing "semantic grammars" that integrate syntactic and semantic constraints in a single grammar (e.g., Burton, 1976), or writing ad hoc programs that combine such information in unformalized ways. The first approach requires as many syntactic rules as semantic rules, and hence is not really much different from the semantic grammar approach (this is the conventional way of defining semantics of programming languages). The second approach has the tendency to miss generalities and its results do not automatically extend to new domains. It misses syntactic generalities, for example, by having to duplicate the syntactic information necessary to characterize the determiner structures of noun phrases for each of the different semantic kinds of noun phrase that can be accepted. Likewise, it tends to miss semantic generalizations by repeating the same semantic tests in various places in the grammar when a given semantic constituent can occur in various places in a sentence. The third approach, of course, may yield some level of operational system, but does not usually shed any light on how such interaction should be organized, and is difficult to extend.</Paragraph> <Paragraph position="1"> 1. There are some compensating disadvantages if the semantic domain is more complex than the syntactic one, but we will assume here that immediate semantic feedback is desired.</Paragraph> <Paragraph position="2">
<ATN> -> (<machinename> (accepts <phrasetype>*) <statespec>*)
    ;an ATN is a list consisting of a machine name, a specification
    ;of the phrasetypes which it will accept, and a list of state
    ;specifications.
<statespec> -> (<statename> {optional <initialspec>} <arc>*)
<initialspec> -> (initial <phrasetype>*)
    ;indicates that this state is an initial state for the
    ;indicated phrasetypes.
<arc> -> (<phrasetype> <nextstate> <act>*)
    ;a transition that consumes a phrase of indicated type.
      -> (<pattern> <nextstate> <act>*)
    ;a transition that consumes an input element that matches a pattern.
      -> (JUMP <nextstate> <act>*)
    ;a transition that jumps to a new state without consuming any input.
      -> (POP <phrasetype> <form>)
    ;indicates a final state for the indicated phrase type and
    ;specifies a form to be returned as its structure.
<nextstate> ->
    ;specifies next state for a transition.
<pattern> ->
    ;matches a list whose elements match the successive specified patterns.
      ->
    ;matches any word in the list.
      ->
    ;matches any element.
      ->
    ;matches any subsequence.
      ->
    ;matches value of <form>.
      ->
    ;matches anything that has or inherits the class name as a feature.
<wordlist> -> {'<word> | '<word>, <wordlist>}
<act> -> (transmit <form>)
    ;transmit value of form as an output.
      -> (setr <registername> <form>)
    ;set register to value of form.
      -> (addr <registername> <form>)
    ;add the value of form to the end of the list in the indicated
    ;register (assumed initially NIL when the register has not been set).
      -> (require <proposition>)
    ;abort path if proposition is false.
      -> (dec <flaglist>)
    ;set indicated flags.
      -> (req <flagproposition>)
    ;abort path if proposition is false.
      -> (once <flag>)
    ;equivalent to (req (not <flag>)) (dec <flag>).
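<Paragraph position="3"> To make the notation concrete, the following is a minimal sketch of a toy noun-phrase machine transcribed into Python list structure (quoted symbols rendered as strings). The machine name, state names, register names, and the (class ...) pattern syntax are invented for illustration; they are not taken from the paper.

# A toy machine in the shape the BNF above prescribes:
# (<machinename> (accepts <phrasetype>*) <statespec>*)
NP_MACHINE = [
    "NP-MACHINE",                      # <machinename> (hypothetical)
    ["accepts", "NP"],                 # phrase types it will accept
    ["NP/", ["initial", "NP"],         # <statespec> with an <initialspec>
        # pattern arc (assumed class-feature pattern syntax):
        # consume a determiner and save it in register det
        [["class", "DET"], "NP/DET", ["setr", "det", "!c"]]],
    ["NP/DET",
        # consume a noun, save it, and transmit the grouped phrase
        [["class", "N"], "NP/N",
            ["setr", "head", "!c"],
            ["transmit", ["np", "!det", "!head"]]]],
    ["NP/N",
        # POP arc: final state for NP, returning the structure form
        ["POP", "NP", ["np", "!det", "!head"]]],
]
</Paragraph>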
<flagproposition> -> <boolean combination of flag registers>
<proposition> -> <form>
    ;the proposition is false if the value of the form is NIL.
<form> -> !<registername>
    ;returns contents of the register.
      -> '<liststructure>
    ;returns a copy of a list structure except that any expressions
    ;preceded by ! are replaced by their value and any preceded by @
    ;have their value inserted as a sublist.
      -> !c
    ;contents of the current constituent register.
      -> !<liststructure>
    ;returns value of list structure interpreted as a functional
    ;expression.
</Paragraph> <Paragraph position="4"> Rusty Bobrow's RUS parser (Bobrow, 1978) is the first parser to my knowledge to make a clean separation between syntactic and semantic specification while gaining the benefit of early and incremental semantic filtering and maintaining the factoring advantages of an ATN. It can be characterized as a cascade of two ATN's - one doing syntactic analysis and one doing semantic interpretation. Such a cascade of ATN's provides a way to avoid saying the same thing multiple times or in multiple places, while providing efficiency comparable to a "semantic" grammar and at the same time maintaining a clean separation between syntactic and semantic levels of description. It is essentially a mechanism for permitting decomposition of an ATN grammar into an assembly of cooperating ATN's, each with its own characteristic domain of responsibility.</Paragraph> <Paragraph position="5"> As mentioned previously, a CATN is a sequence of ordinary ATN's that include among the actions on their arcs an operation TRANSMIT, which transmits an element to the next machine in the sequence. The first machine in the cascade takes its input from the input sequence, and subsequent machines take their input from the TRANSMIT commands of the previous ones. The output of the final machine in the cascade is the output of the machine as a whole.</Paragraph>
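<Paragraph position="6"> The control structure just described can be sketched as follows. This is a minimal illustration, assuming each stage is supplied as a step function from (state, input element) to a list of (next state, transmitted elements) alternatives; the function and variable names are invented here, not taken from the paper. A path dies exactly when some stage's step function returns no alternative for an element transmitted to it, which implements the filtering feedback described in the next paragraph.

def run_cascade(stages, start_states, inputs):
    """Nondeterministic simulation of a cascade of transducers.

    stages[k]       -- step function of machine k:
                       stages[k](state, item) -> [(next_state, [transmitted, ...]), ...]
    start_states[k] -- initial state of machine k
    Returns the set of configuration vectors (one state per stage)
    reachable after stage 0 has consumed all of `inputs`.
    """
    configs = {tuple(start_states)}
    for element in inputs:
        next_configs = set()
        for config in configs:
            # Feed `element` to stage 0 and propagate TRANSMITs downstream.
            # `pending` holds (stage index, element) pairs still to consume.
            frontier = [(list(config), [(0, element)])]
            while frontier:
                states, pending = frontier.pop()
                if not pending:
                    next_configs.add(tuple(states))
                    continue
                (k, item), rest = pending[0], pending[1:]
                # If the stage cannot accept `item`, this loop body never
                # runs and the nondeterministic path silently dies.
                for new_state, transmitted in stages[k](states[k], item):
                    new_states = list(states)
                    new_states[k] = new_state
                    # TRANSMITs of the last stage are the cascade's overall
                    # output; this sketch simply discards them.
                    downstream = [(k + 1, t) for t in transmitted
                                  if k + 1 < len(stages)]
                    frontier.append((new_states, downstream + rest))
        configs = next_configs
    return configs
</Paragraph>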
<Paragraph position="7"> The only feedback from later stages to earlier ones is a filtering function that causes paths of the nondeterministic computation to die if a later stage cannot accept the output of an earlier one.</Paragraph> <Paragraph position="8"> The conception of cascaded ATN's arose from observing the interaction between the lexical retrieval component and the "pragmatic" grammar of the HWIM speech understanding system (Woods et al., 1976). The lexical retrieval component made use of a network that consumed successive phonemes from the output of an acoustic phonetic recognizer and grouped them into words. Because of phonological effects across word boundaries, this network could consume several phonemes that were part of the transition into the next word before determining that a given word was possibly present. At certain points, it would return a found word together with a node in the network at which matching should begin to find the next word (essentially a state remembering how much of the next word has already been consumed due to the phonological word boundary effect). This can be viewed as an ATN that consumes phonemes and transmits words as soon as it has enough evidence that the word is there.</Paragraph> <Paragraph position="9"> The lexical retrieval component of HWIM can thus be viewed as an ATN whose output drives another ATN. This led to the conception of a complete speech understanding system as a cascade of ATN's, one for acoustic phonetic recognition, one for lexical retrieval (word recognition), one for syntax, one for semantics, and one for subsequent discourse tracking. A predecessor of the RUS parser (Bobrow, 1978) was subsequently perceived to be an instance of a syntax/semantics cascade, since the semantic structures that it was obtaining from the lexicon to filter the paths through the grammar could be viewed as ATN's. Hence, practical solutions to problems of combinatorics in two different problem areas have independently motivated computation structures that can be viewed as cascaded ATN's. It remains to be seen how effectively cascades can be used to model acoustic phonetic recognition or to track discourse structure, but the possibilities are intriguing.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.1 Specification of a CATN Computation </SectionTitle> <Paragraph position="0"> As with ordinary ATN's and other formal automata, the specification of the computation of a CATN will consist of the specification of an instantaneous "configuration" of the automaton and the specification of a transition function that computes possible successor configurations for any given configuration. Since CATN's are nondeterministic, a given configuration can in general have more than one successor configuration and may occasionally have none. One way to implement a parser for CATN's would be to mimic this formal specification explicitly by implementing the configurations as data structures and writing a program to implement the transition function. Just as for ordinary ATN's, however, there are also many other ways to organize a parser, with various efficiency tradeoffs.</Paragraph> <Paragraph position="1"> A configuration of a CATN consists of a vector of state configurations of the successive machines, plus a pointer to the input string where the first machine is about to take input. The transition function (nondeterministic) operates as follows:
1. A distinguished register C is set (possibly nondeterministically) to the next input element to be consumed, and the pointer in the input string is advanced. Then a stage counter k is set to 1.
2. The state of the kth machine in the sequence is used to determine a set of arcs that may consume the current input (possibly following a sequence of JUMPs, PUSHes, and POPs to reach a consuming transition).
3. Whenever a transmission operation TRANSMIT is executed by the stage k machine, the stage k+1 configuration is activated to process that input, and the stage k+1 component of the configuration vector is updated accordingly. If the stage k+1 machine cannot accept the transmitted structure, the configuration is aborted.</Paragraph>
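<Paragraph position="2"> As a data-structure sketch of these configurations, compatible with the cascade simulation given earlier (the class and field names are illustrative assumptions, not the paper's):

from dataclasses import dataclass

@dataclass(frozen=True)
class StateConfig:
    """Configuration of one machine in the cascade: a state name, a
    set of registers with their contents, and a stack whose elements
    are (PUSH arc, saved register contents) pairs, as described in
    the next paragraph."""
    state: str
    registers: tuple = ()   # ((register_name, value), ...)
    stack: tuple = ()       # ((push_arc, saved_registers), ...)

@dataclass(frozen=True)
class CATNConfig:
    """An instantaneous CATN configuration: a vector of per-stage
    StateConfigs plus a pointer into the input string where the
    first machine is about to take input."""
    stages: tuple           # (StateConfig, ..., StateConfig)
    input_pos: int
</Paragraph>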
<Paragraph position="3"> As for a conventional ATN, the state configurations of the individual machines each consist of a state name, a set of registers and contents, and a stack pointer (or its equivalent).2 Each element of a stack is a pair consisting of a PUSH arc and a set of register contents. Transitions within a single stage are the same as for ordinary ATN's.</Paragraph> <Paragraph position="4"> 2. For example, Earley's algorithm for context free grammars (Earley, 1968) replaces the stack pointer with a pointer to a place where the configuration(s) that caused the push can be found. A similar technique can be used with ATN grammars.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.2 Uses of CATN's </SectionTitle> <Paragraph position="0"> A good illustrative example of the use of cascaded ATN's for natural language understanding would be a three-stage machine consisting of a first stage that performs lexical analysis, a second stage for syntactic analysis, and a third stage for semantic analysis. The lexical stage ATN would consume letters from an input sequence and perform word identification, including inflectional analysis, tense extraction (e.g., BEEN => PASTPART BE), decomposition of contractions, and aggregation of compound phrases, producing as its output a sequence of words with syntactic categories and feature values. This machine could also perform certain standard bottom-up, locally determined parsings such as constructing noun phrase structures for proper nouns and pronouns. Ambiguity in syntactic class, in word grouping, and in homographs within a syntactic class can all be taken care of by the nondeterminism of this first stage machine (e.g., "saw" as a past tense of "see" vs. a present tense of "saw" can be treated by two different alternative outputs of the lexical stage).</Paragraph> <Paragraph position="1"> This lexical stage machine is not likely to involve any recursion, unlike other stages of the cascade, but does use its registers to perform a certain amount of buffering before deciding what to transmit to the next stage. Because stages such as this one will reach states where they have essentially finished with a particular construction and are ready to begin a new one, a convenient action to have available on their arcs is one that resets all or a specified set of registers to their initial empty values. Such register clearing is similar to what happens on a push to a lower level, except that here the previous values need not be saved. The use of a register clearing action thus has the desired effect without the expense of a push.</Paragraph>
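<Paragraph position="2"> A toy lexical stage in this style, written as a step function compatible with the cascade sketch given earlier, might look as follows. The dictionary entries, category names, and the register-clearing reset are illustrative assumptions, not the paper's actual grammar; the state is simply the register buffering the letters seen so far.

# Each homograph yields several alternatives -- one nondeterministic
# path per analysis, as in the "saw" example above.
ANALYSES = {
    "saw":  [("V", "PAST", "see"), ("V", "PRESENT", "saw"), ("N", None, "saw")],
    "the":  [("DET", None, "the")],
    "logs": [("N", "PLURAL", "log"), ("V", "PRESENT", "log")],
}

def lexical_stage(state, letter):
    """Step function: state is the buffered word so far.  On a space,
    transmit every dictionary analysis of the buffered word and clear
    the buffer (the register-clearing action discussed above); on any
    other letter, just extend the buffer.  An unknown word yields no
    alternatives, so that path of the computation dies."""
    if letter != " ":
        return [(state + letter, [])]
    return [("", [analysis]) for analysis in ANALYSES.get(state, [])]
</Paragraph>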
<Paragraph position="3"> The second stage machine in our example will perform the normal phrase grouping functions of a syntactic grammar and produce TRANSMIT commands when it has identified constituents that are serving specific syntactic roles. The third stage machine will consume such constituents and incorporate them into an incremental interpretation of the utterance (and may also produce differential likelihoods for alternative interpretations depending on the semantic and pragmatic consistency and plausibility of the partial interpretation).</Paragraph> <Paragraph position="4"> The advantage of having a separate stage for the semantic interpretation, in addition to providing a clean separation between syntactic and semantic levels of description and a more domain-independent syntactic level, is that during the computation, different partial semantic interpretations that have the same initial syntactic structure share the same syntactic processing. In a single "semantic" ATN, such different semantic interpretation possibilities would have to make their own separate syntactic/semantic predictions with no sharing of the syntactic commonality between those predictions. Cascaded ATN's avoid this while retaining the benefit of strong semantic constraint.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.3 Benefits of CATN's </SectionTitle> <Paragraph position="0"> The decomposition of a natural language analyzer into a cascade of ATN's gains a "factoring" advantage similar to the one that ATN's themselves provide with respect to ordinary phrase structure grammars.</Paragraph> <Paragraph position="1"> Specifically, the cascading allows alternative configurations in the later stages of the cascade to share common processing in the earlier stages that would otherwise have to be done independently. That is, if several semantic hypotheses can use a certain kind of constituent at a given place, there need be only one syntactic process to recognize it.3 Cascades also provide a simpler overall description of the acceptable input sequences than a single monolithic ATN combining all of the information into a single network would give. That is, if any semantic level process can use a certain kind of constituent at a given place, then there need be only one place in the syntactic stage ATN that will recognize it. Conversely, if there are several syntactic contexts in which a constituent filling a given semantic role can be found, there need be only one place in the semantic ATN to receive that role. (A single network covering the same facts would be expected to have a number of states on the order of the product, rather than the sum, of the numbers of states in the individual stages of the cascade.)</Paragraph>
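<Paragraph position="2"> A back-of-the-envelope illustration of that last parenthetical claim, under the simplifying assumption that an equivalent single network must distinguish every reachable pairing of a syntactic state with a semantic state (the stage sizes here are hypothetical):

# Hypothetical stage sizes for a two-stage syntax/semantics cascade.
syntactic_states = 40
semantic_states = 60

cascade_size = syntactic_states + semantic_states   # sum: 100 states
combined_size = syntactic_states * semantic_states  # product: up to 2400 states
print(cascade_size, combined_size)
</Paragraph>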
<Paragraph position="3"> 3. One might ask at this point whether there are situations in which one cannot tell what is present locally without "top-down" guidance from later stages. In fact, any such later stage guidance can be implemented by semantic filtering of syntactic possibilities. For example, if there is a given semantic context that permits a constituent construction that is otherwise not legal, one can still put the recognition transitions for that construction into the syntactic ATN with an action on the first transition to check compatibility with later stage expectations (e.g., by transmitting a flag indicating that it is about to try to recognize this special construction).</Paragraph> </Section> </Section> </Paper>