<?xml version="1.0" standalone="yes"?>
<Paper uid="P85-1009">
  <Title>REVERSIBLE AUTOMATA AND INDUCTION OF THE ENGLISH AUXILIARY SYSTEM</Title>
  <Section position="4" start_page="0" end_page="70" type="metho">
    <SectionTitle>
LEARNING K-REVERSIBLE LANGUAGES
FROM EXAMPLES
</SectionTitle>
    <Paragraph position="0"> The question we address is, If a learner presumes that a natural language domain is systematic in some way, can the learner intelligently infer the complete system from only --- subset of sample sentences? Let us develop a:i exaauple to formally describe what we mean by &amp;quot;systematic in some way,&amp;quot; and how such a systematic domain allows the inference of a complete system front examples. If you were told that Mar~ bakes cakes, John bakes cakes, and Mar V eat~ pies are legal strings m some language, you might guess that John eats pies is also in that language. Strings in the language seem to follow a recognizable pattern, so you expect other strings that follow the same pattern to be in the language also.</Paragraph>
    <Paragraph position="1"> In this particular case, you axe presuming that the tobe-learned language is a zero-reversible regular language.</Paragraph>
    <Paragraph position="2"> Angluin (1982) has defined and explored the formal properties of reversible regular languages. We here translate some of her formal definitions into less technical terms.</Paragraph>
    <Paragraph position="3"> A regular language is any language that can be generated from a formula called a regular expression. For example the strings mentioned above might have come from the language that the following regular expression generates: (MarylJohu) (bakes6eats) livery* delicious\] (cakeslpies)\] A complete natural language is too complex to be generated by some concise regular expression, but some simple subsets of a natural language can fit this kind of pattern.</Paragraph>
    <Paragraph position="4"> To formally define when a regular language is reversible, let us first define a prefix as any substring (possibly zero-</Paragraph>
  </Section>
  <Section position="5" start_page="70" end_page="70" type="metho">
    <SectionTitle>
SEQUENCE OF NEW NEW STRINGS INFERRED:
</SectionTitle>
    <Paragraph position="0"> length) that can be found at the very beginning of some legal string in a language, and a suffix as any substring (again, possibly zero-length) that can be found at the very end of some legal string in a language. In our case the strings ~e sequences of words, and the langamge is the set of all legal sentences in our simplified subset of English. Also, in any legal string say that the surtax that immediately follows a prefix is a tail for that prefix. Then a regular language is zero-reverstble if whenever two prefixes in the language have a tail in common, then the two prefixes have all tails in common.</Paragraph>
    <Paragraph position="1"> In the above example prefixes Mary and John have the tail bakes cakes in common. If we presume that the language these two strings come from is zero-reversible, then Ma~ and John must have all tails in common. In particular, the third string shows that Mary has eats pies as a tail, so John must also have eats pies as a tail. Our current hypothesis after having seen these three strings is that they come not from the three-string language expressed by (Mar~tiJohn) bakes cakes i Mary eats p:es, which is not zero-reversible, but rather from the four-string language (MarytJohn) (bakes cakes ! eats pies), which is zero-reversible. Notice that we have enlarged the corpus just enough to make the language zero-reversible.</Paragraph>
    <Paragraph position="2"> A regular language is k-reversible, where k is a non-negative integer, if whenever two prefixes whose l~t k tuorda match have a tail in common, then the two prefixes have all tails in common. A higher value of k gives a more conservative condition for inference. For example, i/we presume that the aforementioned strings come from a l-reversible language, then instead of presuming that whatever Mary does John does, we would presume only that whatever Mary bakes, John bakes. In this case the third string fails to yield any inference, but if we were later told that Mary bakes pies is in the language, we could infer that John bakes pies is also in the language. Further adding the sentence Mary bakes would allow 1-reversible inference to also induce John bakes, resulting in the seven-string 1-reversible language expressed by ( Maryldohn) bakes Icakesipiesi l Mary eats pies.</Paragraph>
    <Paragraph position="3"> With these examples zero-reversible inference would have generated ( MarylJohn) ( bakesieats) ( cakesipies)* by now, which overgeneralizes an optional direct object into zero or more direct objects. On the other hand, tworeversible inference would have inferred no additional strings yet. For a particular language we hope to find a k that is small enough to yield some inference but not so small that we overgeneralize and start inferring strings that are in fact not in the true language we are trying to learn. Table 1 summarizes our examples of k-reversible inference.</Paragraph>
  </Section>
  <Section position="6" start_page="70" end_page="70" type="metho">
    <SectionTitle>
AN INFERENCE ALGORITHM
</SectionTitle>
    <Paragraph position="0"> In addition to formally characterizing k-reversible lan.</Paragraph>
    <Paragraph position="1"> guages, Angluin also developed an algorithm for inferring a k-reversible language from a finite set of positive exampies, an well an a method for discovering an appropriate k when negative examples (strings known not to be in the language) are also presented. She also presented an algorithm for determining, given some k-reversible regular language, a minimal set of examples from which the entire language  can be induced. We have implemented these procedures on a computer in MACLISP and have applied them to all of the artificial languages in Angluin's paper as well as to all of the natural language examples in this paper.</Paragraph>
    <Paragraph position="2"> To describe the inference algorithm, we make use of the fact that every regular language can be associated with a corresponding deterministic finite-state automaton (DFA) which accepts or generates exactly that language.</Paragraph>
    <Paragraph position="3"> Given a sample of strings taken from the full corpus, we first generate a prefix-tree automaton which accepts or generates exactly those strings and no others. We now want to infer additional strings so as to induce a/c-reversible language, for some chosen /C. Let us say that when accepting a string, the last k symbols encountered before arriving at a state is a ~c-leader of that state. Then to generalize the language, we recursively merge any two states where any of the following is true: *Another state arcs to both states on the same word.</Paragraph>
    <Paragraph position="4"> (This enforces determinism.) oBoth states have a common k-leader and either -both states are accepting states or -both states arc to a common state on the same word.</Paragraph>
    <Paragraph position="5"> When none of these conditions obtains any longer, the resuiting DFA accepts or generates the smallest k-reversible language that includes the original sample of strings. (The term ~reversible&amp;quot; is used because a ~c-reversible DFA is still deterministic with lookahead /C when its sets of initial and final states are swapped and Ml of its arcs are reversed.) This procedure works incrementally. Each new string may be added to the DFA in prefix-tree fashion and the state-merging algorithm repeated. The resulting language induced is independent of the order of presentation of sample strings.</Paragraph>
    <Paragraph position="6"> If an appropriate /C is not known a pr/o~', but some negative as well as positive examples are presented, then one can try increasing values of k until the induced language contains none of the negative examples.</Paragraph>
    <Paragraph position="7"> Though the inference algorithm takes a sample and induces a/c-reversible language, it is quite helpful to use Angluin's algorithm for going in the reverse direction: given a k- reversible language we can determine what minimal set of shortest possible examples (a &amp;quot;characteristic&amp;quot; or &amp;quot;covering n sample) will be sufficient for inducing the language. Though the minimal number of examples is of course unique, the set of particular strings in the covering sample is not necesm~rily Imique.</Paragraph>
  </Section>
  <Section position="7" start_page="70" end_page="73" type="metho">
    <SectionTitle>
INFERENCE OF THE ENGLISH AUXILIARY
SYSTEM
</SectionTitle>
    <Paragraph position="0"> We have chosen to test the English auxiliary system under /c-reversible inference because English verb sequences are highly regular, yet they have some degree of complexity and admit to some exceptions. We represent the English auxiliary system am a corpus of 92 variants of a declarative statement in third person singular. The variants cover all standard legal permutations of tense, aspect, and voice, including do support and nine models. We simply use the surface forms, which are strings of words with no additional information such as syntactic category or root-by-inflection breakdown. For instance, the present, simple, active example is Judy glvez bread. One modal, perfective, passive variant is Judy would have been given bread.</Paragraph>
    <Paragraph position="1"> We have explored the/c-reversible properties of this nat,iral language subsystem in two main steps. First we determined for what values of k the corpus is in fact k-reversible. (Given a finite corpus, we could be sure the language is /c-reversible for all /C at or above some value.) To do this we treated the full corpus as a set of sample strings and tried successively larger values of/C until finding one where /c-reversible inference applied to the corpus generates no additional strings. We could then be sure that any /C of that value or greater could be used to infer an accurate model of the English auxiliary system without overgeneralizing.</Paragraph>
    <Paragraph position="2"> After finding the range of values of/C to work with, we were interested in determining which, if any, of those values of/C would yield some power to infer the full corpus from a proper subset of examples. To do this we took the DFA which represents the full corpus and computed, for a trial k, a set of samp|e strings that would be minimally sufficient to induce the full corpus. If any such values of k exist, then we can say that, in a nontrivial way, the English auxiliary system is learnable as a k-reversible language from examples. null We found that the English auxiliary system can be faithfully modeled as a/c-reversible regular language for k &gt;_ I. Only zero-reversible inference overgeneralizes the full corpus as well as the active and passive corpora treated as separate languages. For the active corpus, zero-reversible inference groups the forms of do with the other modals. The DFAs for the passive and full corpora also contain loops and thereby generate infinite numbers of illegal variants.</Paragraph>
    <Paragraph position="3"> F:.gure I compares a correct DFA for the English auxiliary system with an overgeneralized DFA. Both are shown in a minimized, canonical form. The top, correct, automaton can be generated by either minimizing the prefix tree for the full corpus or by minhnizing the result of/c-reversible inference applied to any sufficiently characteristic set of sample sentences, for any /C _.&gt; 1. One can read off all 92 variants  in the language by taking different paths from initial state to final state. The bottom, overgeneralized, automaton is generated by subjecting the top one to zero-reversible infereuce, null Does treating the English auxiliary system as a I-ormore-reversible l,'mguage yield any inferential power? The English auxiliary system as a l-reversible language can in fact be inferred from a cover of only 48 examples out of the 92 variants in the corpus. The active corpus treated separately requires 38 examples out of 46 and the passive corpus requires 28 out of 46. Treating the full corpus as a 2-reversible language requires 76 examples, and a 3 &amp;quot;~reversible model cannot infer the corpus from any proper subset whatsoever.</Paragraph>
    <Paragraph position="4"> For l-reversible inference, 45 of the verb sequences of length three or shorter will yield the remaining nine such strings and nonc longer. Verb sequences of length four or five can be divided into two patterns, &lt;modal&gt; have been 9iv(ing,,en) ,'wad ... be, en} bern9 given. Adding any one (length-four) string from the first pattern will yield the remaining 17 strings of that pattern. Further adding two length-four strings from the awkward second pattern will yield the remaining 18 strings of that pattern, nine of which are of length five. This completes the corpus.</Paragraph>
  </Section>
class="xml-element"></Paper>
Download Original XML