<?xml version="1.0" standalone="yes"?> <Paper uid="W98-1233"> <Title>Introducing MegaHAL</Title> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 The Loebner Contest </SectionTitle> <Paragraph position="0"> Apart from a few limited tests performed by programmers of conversation simulators (Colby, 1981), the Turing test was not formally conducted until 1995. Although the inaugural Loebner contest, held in 1991, was touted as the first formal instantiation of the Turing test, it was not until 1995 that it truly satisfied Turing's original specifications (Hutchens, 1996).</Paragraph> <Paragraph position="1"> The first Loebner contest was held on the 8 th of November 1991 in Boston's Computer Museum.</Paragraph> <Paragraph position="2"> Because this was a contest rather than an experiment, six computer programs were accepted as subjects. Four human subjects and ten judges were selected from respondents to a newspaper advertisement; none of them had any special expertise in Computer Science (Epstein, 1992).</Paragraph> <Paragraph position="3"> The original Turing test involved a binary decision between two subjects by a single judge. With ten subjects and ten judges, the situation was somewhat more complex. After months of deliberation, PShe prize committee developed a suitable scoring mechanism. Each judge was required to rank the subjects from least human-like to most human-like, and to mark the point at which they believed the subjects switched from computer programs to human beings.</Paragraph> <Paragraph position="4"> If the median rank of a computer program exceeded the median rank of at least one of the human subjects, then that computer program would win the grand prize of $100,000.1 If there was no grand prize winner, the computer program with the highest median rank would win the contest with a prize of $2000.</Paragraph> <Paragraph position="5"> 1Today the program must also satisfy audio-visual requirements to win the grand prize.</Paragraph> <Paragraph position="6"> Hutchens and Alder 271 Introducing MegaHal Jason L. Hutchens and Michael D. Alder (1998) Introducing MegaHal. In D.M.W. Powers (ed.) NeMLaP3/CoNLL98 Workshop on Human Computer Conversation, ACL, pp 271-274.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 Conversation Simulators </SectionTitle> <Paragraph position="0"> Since its inception, the Loebner contest has primarily attracted hobbyist entries which simulate conversation using template matching; a method employed by Joseph Weizenbaum in his ELIZA conversation simulator, developed at MIT between 1964 and 1966.</Paragraph> <Paragraph position="1"> Put simply, these programs look for certain patterns of words in the user's input, and reply with a pre-determined output, which may contain blanks to be filled in with details such as the user's name.</Paragraph> <Paragraph position="2"> Such programs are effective because they exploit the fact that human beings tend to read much more meaning into what is said than is actually there; we are fooled into reading structure into chaos, and we interpret non-sequitur as whimsical conversation (Shieber, 1994).</Paragraph> <Paragraph position="3"> Weizenbaum was shocked at the reaction to ELIZA. He noticed three main phenomenon which disturbed him greatly (Weizenbaum, 1976): i. A number of practising psychiatrists believed that ELIZA could grow into an almost completely automatic form of psychotherapy.</Paragraph> <Paragraph position="4"> 2. 
<Paragraph position="3"> Weizenbaum was shocked at the reaction to ELIZA. He noticed three main phenomena which disturbed him greatly (Weizenbaum, 1976): 1. A number of practising psychiatrists believed that ELIZA could grow into an almost completely automatic form of psychotherapy.</Paragraph> <Paragraph position="4"> 2. Users very quickly became emotionally involved--Weizenbaum's secretary demanded to be left alone with the program, for example.</Paragraph> <Paragraph position="5"> 3. Some people believed that the program demonstrated a general solution to the problem of computer understanding of natural language.</Paragraph> <Paragraph position="6"> Over three decades have passed since ELIZA was created. Computers have become significantly more powerful, while storage space and memory size have increased exponentially. Yet, at least as far as the entrants of the Loebner contest go, the capabilities of conversation simulators have remained exactly where they were thirty years ago. Indeed, judges in the 1991 contest said that they felt let down after talking to the computer entrants, as they had had their expectations raised when using ELIZA during the selection process.</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 MegaHAL </SectionTitle> <Paragraph position="0"> In 1996 the primary author entered the Loebner contest with an ELIZA variant named HeX, which was written during his spare time in under a month.</Paragraph> <Paragraph position="1"> Apart from the lure of the prize money, a major motivation for the entry was a desire to illustrate the shortcomings of the contest (Hutchens, 1996).</Paragraph> <Paragraph position="2"> A considerably more powerful program, SEPO, was entered the following year, and placed second. We believe this to be indicative of a gradual improvement in the quality of the contestants.</Paragraph> <Paragraph position="3"> The program submitted to this year's contest, MegaHAL, uses a significantly different method of simulating conversation from either HeX or SEPO, and we dedicate the remainder of this paper to describing its workings.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.1 Markov Modelling </SectionTitle> <Paragraph position="0"> MegaHAL is able to construct a model of language based on the evidence it encounters while conversing with the user. To begin with, the input received from the user is parsed into an alternating sequence of words and non-words, where a word is a series of alphanumeric characters and a non-word is a series of other characters. This is done to ensure not only that new words are learned, but that the separators between them are learned as well. If the user has a habit of putting a double space after a full stop, for instance, MegaHAL will do just the same.</Paragraph> <Paragraph position="1"> The resulting string of symbols is used to train two 4th-order Markov models (Jelinek, 1986). One of these models can predict which symbol will follow any sequence of four symbols, while the other can predict which symbol will precede any such sequence. Markov models express their predictions as a probability distribution over all known symbols, and are therefore capable of choosing likely words over unlikely ones. Models of order 4 were chosen to ensure that the prediction is based on two words; this has been found necessary to produce output resembling natural language (Hutchens, 1994).</Paragraph>
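<Paragraph> A minimal sketch of the two steps just described: splitting text into an alternating sequence of words and non-words, and counting symbol transitions for forward and backward 4th-order models. The data structures and names are our own illustration, not MegaHAL's actual implementation.
    import re
    from collections import defaultdict

    ORDER = 4  # each prediction is conditioned on four adjacent symbols, i.e. two words

    def segment(text):
        """Split text into alternating words (alphanumeric runs) and non-words
        (runs of other characters), so that separators are learned too."""
        return re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9]+", text)

    def train(model, symbols):
        """Count which symbol follows each context of ORDER symbols."""
        for i in range(len(symbols) - ORDER):
            context = tuple(symbols[i:i + ORDER])
            model[context][symbols[i + ORDER]] += 1

    forward = defaultdict(lambda: defaultdict(int))   # predicts the next symbol
    backward = defaultdict(lambda: defaultdict(int))  # predicts the preceding symbol

    def observe(line):
        """Update both models with one line of the user's input."""
        symbols = segment(line)
        train(forward, symbols)
        train(backward, list(reversed(symbols)))

    observe("HAL, I feel fine.  How do you feel?")
    print(segment("HAL, I feel fine."))  # ['HAL', ', ', 'I', ' ', 'feel', ' ', 'fine', '.']
Training the backward model on the reversed symbol sequence is one way to obtain the "which symbol precedes this context" predictions that the text describes.</Paragraph>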
</Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.2 Generating Candidate Replies </SectionTitle> <Paragraph position="0"> Using a Markov model to generate replies is easy; Shannon was doing much the same thing by flipping through books back in 1949 (Shannon and Weaver, 1949). However, such replies will often be nonsensical, and will bear no relationship to the user's input. MegaHAL therefore attempts to generate suitable replies by basing them on one or more keywords from the user's input. This explains why two Markov models are necessary; the first model generates a sentence from the keyword on, while the second model generates the remainder of the sentence, from the keyword back to the beginning.</Paragraph> <Paragraph position="1"> Keywords are obtained from the user's input. Frequently occurring words, such as &quot;the&quot;, &quot;and&quot; and &quot;what&quot;, are discarded, as their presence in the input does not mean they need to be present in the output. The remaining words are transformed if necessary--&quot;my&quot; becomes &quot;your&quot; and &quot;why&quot; becomes &quot;because&quot;, for example. What remains is used to seed the output.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.3 Selecting a Reply </SectionTitle> <Paragraph position="0"> MegaHAL is able to generate many hundreds of candidate replies per second, each of which contains at least one keyword. Once a small time period has elapsed, the program must display a reply to the user. A method is needed for selecting a suitable reply out of the hundreds of candidates.</Paragraph> <Paragraph position="1"> I(w|s) = -log2 P(w|s) (1)</Paragraph> <Paragraph position="2"> MegaHAL chooses the reply which assigns the keywords the highest information. The information of a word is defined in Equation 1 as the surprise it causes the Markov model. Hence the most surprising reply is selected, which helps to guarantee its originality. Note that P(w|s) is the probability of word w following the symbol sequence s, according to the Markov model.</Paragraph> <Paragraph position="3"> The algorithm for MegaHAL proceeds as follows: 1. Read the user's input, and segment it into an alternating sequence of words and non-words.</Paragraph> <Paragraph position="4"> 2. From this sequence, find an array of keywords and use it to generate many candidate replies.</Paragraph> <Paragraph position="5"> 3. Display the reply with the highest information to the user.</Paragraph> <Paragraph position="6"> 4. Update the Markov models with the user's input.</Paragraph> <Paragraph position="7"> This sequence of steps is repeated indefinitely, which allows the program to learn new words, and sequences of words, as it converses with the user.</Paragraph>
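<Paragraph> A minimal sketch of the selection step, under the definition of information as surprise given in Equation 1: each keyword w appearing after a known context s contributes -log2 P(w|s), and the candidate with the highest total is shown to the user. The function and variable names are illustrative, not MegaHAL's actual code; the model is assumed to map a context tuple to a dictionary of follower counts, as in the earlier sketch.
    import math
    import re

    ORDER = 4  # must match the order of the trained Markov model

    def segment(text):
        """Same word/non-word segmentation used during training."""
        return re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9]+", text)

    def information(model, symbols, keywords):
        """Sum -log2 P(w|s) over keywords w whose ORDER-symbol context s is known."""
        total = 0.0
        for i in range(ORDER, len(symbols)):
            word = symbols[i]
            if word.lower() not in keywords:
                continue
            counts = model.get(tuple(symbols[i - ORDER:i]), {})
            if counts.get(word):
                total -= math.log2(counts[word] / sum(counts.values()))
        return total

    def select_reply(candidates, model, keywords):
        """Step 3 of the algorithm above: display the most surprising candidate."""
        return max(candidates, key=lambda reply: information(model, segment(reply), keywords))
In this formulation, candidates would be whatever replies the generator manages to produce within the small time budget mentioned above.</Paragraph>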
</Section> <Section position="4" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.4 Training MegaHAL </SectionTitle> <Paragraph position="0"> When MegaHAL is started it has no knowledge of language, and is unable to give a reply at all--the program needs to be trained using a source of text to ensure that it does not reveal its identity prematurely. A large corpus of training data was created for this purpose.</Paragraph> <Paragraph position="1"> The training data is made up of various texts: * Hand-crafted sentences designed to create a personality for MegaHAL, including sentences containing a false name, age and occupation. * Encyclopaedic information taken from the Web, on topics such as geography, music, sports, movies and history.</Paragraph> <Paragraph position="2"> * A selection of sentences picked from transcripts of previous Loebner contests.</Paragraph> <Paragraph position="3"> * Lines of dialogue taken from scripts for movies and television shows.</Paragraph> <Paragraph position="4"> * Lists of popular quotations.</Paragraph> <Paragraph position="5"> * A small amount of text in languages other than English.</Paragraph> <Paragraph position="6"> When MegaHAL is trained using this data, it is able to respond to questions on a variety of topics. It is hoped that the program will also learn new topics from the judges, although this remains to be seen.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.5 Online Experimentation </SectionTitle> <Paragraph position="0"> MegaHAL has been available on the Web since early in 1997, and hundreds of users converse with it every day. Interestingly, one never tires of reading transcripts of its conversations, due to MegaHAL's ability to respond with original replies.</Paragraph> <Paragraph position="1"> Many users are offended by the things MegaHAL says, and some believe that they have been personally insulted. A user named Forrest was quite incensed when the program began quoting parts of the Forrest Gump screenplay back at him. That a computer program can cause such an emotional response in a human being is interesting, although it may say more about the human being than it does about the program.</Paragraph> <Paragraph position="2"> Users are often impressed with MegaHAL's ability to learn. One user was annoyed that the program had learned more about his personal life than he would care for it to know, while another stated that MegaHAL would eventually grow into a person of average intellect (he attributed this bold claim to the law of averages). A person experienced in working with people in psychotic crises likened talking to MegaHAL to talking to a psychotic.</Paragraph> <Paragraph position="3"> Users have successfully taught the program to respond to sentences in French, Spanish, Greek, German, Italian, Latin, Japanese and Hebrew, amongst others. A clergyman spent hours teaching MegaHAL about the love of Jesus, only to constantly receive blasphemous responses.</Paragraph> <Paragraph position="4"> The reaction of Web users to the program has been surprising, and is much the same as what Weizenbaum experienced with ELIZA. MegaHAL mostly generates gibberish, but occasionally, by pure coincidence, it will reply appropriately, and in context. It is these occasions that stick in the mind, and give cause for over-zealous claims of computational intelligence.</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.6 Example Interaction </SectionTitle> <Paragraph position="0"> As an example of MegaHAL at its best, we reproduce a few extracts from a conversation which took place over a period of three hours in mid 1997 between MegaHAL and an anonymous Web user.</Paragraph> <Paragraph position="1"> To begin with, the user was able to successfully teach the program some basic facts: [transcript excerpt omitted].</Paragraph> <Paragraph position="2"> He then discovered that the program is an expert at being nonsensical.
Even so, MegaHAL was still able to give some appropriate responses, due to the keyword mechanism for generating replies: [transcript excerpt omitted].</Paragraph> <Paragraph position="3"> In general MegaHAL's conversations are not as successful as this. Most users are satisfied with typing in rude words to see how the program responds.</Paragraph> </Section> </Section> </Paper>