<?xml version="1.0" standalone="yes"?> <Paper uid="P84-1116"> <Title>Machine Translation: its History, Current Status, and Future Prospects</Title> <Section position="4" start_page="551" end_page="551" type="metho"> <SectionTitle> LOGOS </SectionTitle> <Paragraph position="0"> Development of the LOGOS system began in 1964.</Paragraph> <Paragraph position="1"> The first installation, in 1971, was used by the U.S. Air Force to translate English maintenance manuals for military equipment into Vietnamese.</Paragraph> <Paragraph position="2"> Due to the termination of U.S. involvement in that war, and perhaps partly to a poor evaluation of LOGOS' cost-effectiveness \[Sinaiko and Klare, 73\], its use ended after two years. As with SYSTRAN, the linguistic foundations of LOGOS are weak and inexplicit (they appear to involve dependency structures); and the analysis and synthesis rules, though separate, seem to be designed for particular source and target languages, limiting their extensibility.</Paragraph> <Paragraph position="3"> LOGOS continued to attract customers. In 1978, Siemens AG began funding the development of a LOGOS German-English system for telecommunications manuals. After three years LOGOS delivered a &quot;production&quot; system, but it was not found suitable for use (due in part to the poor quality of the translations, and in part to the economic situation within Siemens, which had resulted in a much-reduced demand for translation and hence no immediate need for an MT system). Eventually LOGOS forged an agreement with the Wang computer company which allowed LOGOS to implement the German-English system (formerly restricted to large IBM mainframes) on Wang office computers. This system is being marketed today, and has recently been purchased by the Commission of the European Communities. 
Development of other language pairs has been mentioned from time to time.</Paragraph> </Section> <Section position="5" start_page="551" end_page="552" type="metho"> <SectionTitle> METEO </SectionTitle> <Paragraph position="0"> TAUM-METEO is the world's only example of a truly fully-automatic MT system. Developed as a spin-off of the TAUM technology, as discussed earlier, it was fully integrated into the Canadian Meteorological Center's (CMC's) nation-wide weather communications network by 1977. METEO scans the network traffic for English weather reports, translates them &quot;directly&quot; into French, and sends the translations back out over the communications network automatically. Rather than relying on post-editors to discover and correct errors, METEO detects its own errors and passes the offending input to human editors; output deemed &quot;correct&quot; by METEO is dispatched without human intervention, or even overview.</Paragraph> <Paragraph position="1"> TAUM-METEO was probably also the first MT system where translators were involved in all phases of the design/development/refinement; indeed, a CMC translator instigated the entire project. Since the restrictions on input to METEO were already in place before the project started (i.e., METEO imposed no new restrictions on weather forecasters), METEO cannot quite be classed with the TITUS and Xerox SYSTRAN systems, which rely on restrictions geared to the characteristics of those MT systems. But METEO is not extensible.</Paragraph> <Paragraph position="2"> One of the more remarkable side-effects of the METEO installation is that the translator turn-over rate within the CMC went from 6 months, prior to METEO, to several years, once the CMC translators began to trust METEO's operational decisions and not review its output \[Brian Harris, personal communication\]. METEO's input constitutes over 11,000 words/day, or 3.5 million words/year. 
Of this, it correctly translates 80%, shuttling the other (&quot;more interesting&quot;) 20% to the human CMC translators; almost all of these &quot;analysis failures&quot; are attributable to violations of the CMC language restrictions, though some are due to the inability of the system to handle certain constructions. METEO's computational requirements total about 15 CPU minutes per day on a CDC 7600 \[Thouin, 82\]. By 1981, it appeared that the built-in limitations of METEO's theoretical basis had been reached, and further improvement was not possible.</Paragraph> <Section position="1" start_page="551" end_page="552" type="sub_section"> <SectionTitle> Weidner Communications Systems, Inc. </SectionTitle> <Paragraph position="0"> Weidner was established in 1977 by Bruce Weidner, who hired a group of MT workers (predominantly programmers) from the fading BYU project. Weidner delivered a production English-French system to Mitel in Canada in 1980, and a beta-test English-Spanish system to the Siemens Corporation (USA) in the same year. In 1981 Mitel took delivery of Weidner's English-Spanish and English-German systems, and Bravice (a translation service bureau in Japan) purchased the Weidner English-Spanish and Spanish-English systems. To date, there are about 22 installations of the Weidner MT system around the world. The Weidner system, though &quot;fully automatic&quot; during translation, is marketed as a &quot;machine aid&quot; to translation (perhaps to avoid the stigma usually attached to MT). It is highly interactive for other purposes (the lexical pre-analysis of texts, the construction of dictionaries, etc.), and integrates word-processing software with external devices (e.g., the Xerox 9700 laser printer at Mitel) for enhanced overall document production.</Paragraph> <Paragraph position="1"> Thus, the Weidner system accepts a formatted source document (actually, one containing formatting/typesetting codes) and produces a formatted translation. 
This is an important feature to users, since almost everyone is interested in producing formatted translations from formatted source texts.</Paragraph> <Paragraph position="2"> Given the way this system is tightly integrated with modern word-processing technology, it is difficult to assess the degree to which the translation component itself enhances translator productivity, vs. the degree to which simple automation of formerly manual (or poorly automated) processes accounts for the productivity gains. The &quot;direct&quot; translation component itself is not particularly sophisticated. For example, analysis is &quot;local,&quot; being restricted to the noun phrase or verb phrase level -- so that context available only at higher levels can never be taken into account.</Paragraph> <Paragraph position="3"> Translation is performed in four independent stages: idiom search, homograph disambiguation, structural analysis, and transfer. These stages do not interact with each other, which creates more problems; for example, an apparent idiom in a text is always treated idiomatically -- never literally, no matter what its context (since no other contextual information is available until later).</Paragraph> <Paragraph position="4"> Hundt \[82\] comments that &quot;idioms are an extremely important part of the translation procedure.&quot; It is particularly interesting that he continues: &quot;...machine assisted translation is for the most part word replacement...&quot; Then, &quot;It is not worthwhile discussing the various problems of the \[Weidner\] system in great depth because in the first place they are much too numerous...&quot; Yet even though the Weidner translations are of low quality, users nevertheless report economic satisfaction with the results. 
Hundt continues, &quot;...the Weidner system indeed works as an aid...&quot; and, &quot;800 words an hour as a final figure \[for translation throughput\] is not unrealistic.&quot; This level of performance was not attainable with previous \[human\] methods, and some users report the use of Weidner to be cost-effective, as well as faster, in their environments.</Paragraph> <Paragraph position="5"> In 1982, Weidner delivered English-German and German-English systems to ITT in Great Britain; but there were some financial problems (a third of the employees were laid off that year) until a controlling interest was purchased by a Japanese company: Bravice, one of Weidner's customers, owned by a group of wealthy Japanese investors. Weidner continues to market MT systems, and is presently working to develop Japanese MT systems. A prototype Japanese-English system has recently been installed at Bravice, and work continues on an English-Japanese system. In addition, Weidner has implemented its system on the IBM Personal Computer, in order to reduce its former dependence on the PDP-11.</Paragraph> </Section> </Section> <Section position="6" start_page="552" end_page="554" type="metho"> <SectionTitle> SPANAM </SectionTitle> <Paragraph position="0"> Following a promising feasibility study, the Pan American Health Organization in Washington, D.C.</Paragraph> <Paragraph position="1"> decided in 1975 to undertake work on a machine translation system, utilizing many of the same techniques developed for GAT; consultants were hired from nearby Georgetown University, the home of GAT. The official PAHO languages are English, French, Portuguese, and Spanish; Spanish-English was chosen as the initial language pair, due to the belief that &quot;This combination requires fewer parsing strategies in order to produce manageable output \[and other reasons relating to expending effort on software rather than linguistic rules\]&quot; \[Vasconcellos, 83\]. 
Actual work started in 1976, and the first prototype was running in 1979, using punched card input on an IBM mainframe. With the subsequent integration of a word processing system, production use could be seriously considered.</Paragraph> <Paragraph position="2"> After further upgrading, the system was offered in 1980 as a service to potential users. Later that year, in its first major test, SPANAM reduced manpower requirements for a certain translation effort by 45%, resulting in a monetary savings of 61% \[Vasconcellos, 83\]. Since then it has been used to translate well over a million words of text, averaging about 4,000 words per day per post-editor. (Significantly, SPANAM's in-house developers seem to be the only revisors of its output.) The post-editors have amassed &quot;a bag of tricks&quot; for speeding the revision work, and special string functions have also been built into the word processor for handling SPANAM's English output.</Paragraph> <Paragraph position="3"> Sketchy details imply that the linguistic technology underlying SPANAM is essentially that of GAT; the rules may even still be built into the programs. The software technology has been updated considerably in that the programs are modular (in the newest version). The total lack of sophistication by modern Computational Linguistics standards is evidenced by the offhand remark that &quot;The maximum length of an idiom \[allowed in the dictionary\] was increased from five words to twenty-five&quot; in 1980 \[Vasconcellos, 83\]. Also, the system adopts the &quot;direct&quot; translation strategy, and fails to attempt a &quot;global&quot; analysis of the sentence, settling for &quot;local&quot; analysis of limited phrases. The SPANAM dictionary currently numbers 55,000 entries. 
A follow-on project to develop ENGSPAN, underway since 1981, has produced some test translations.</Paragraph> <Paragraph position="4"> CULT - Chinese University Language Translator CULT is perhaps the most successful of the Machine-aided Translation systems. Development began at the Chinese University of Hong Kong around 1968. CULT translates Chinese mathematics and physics journals (published in Beijing) into English through a highly-interactive process \[or, at least, with a lot of human intervention\]. The goal was to eliminate post-editing of the results by allowing a large amount of pre-editing of the input, and a certain \[unknown\] degree of human intervention during translation. Although published details \[Loh, 76, 78, 79\] are not unambiguous, it is clear that humans intervene by marking sentence and phrase boundaries in the input, and by indicating word senses where necessary, among other things. (What is not clear is whether this is strictly a pre-editing task, or an interactive task.) CULT runs on the ICL 1904A computer.</Paragraph> <Paragraph position="5"> Beginning in 197~, the CULT system was applied to the task of translating the Acta Mathematica Sinica into English; in 1976, this was joined by the Acta Physica Sinica. This production translation practice continues to this day. Originally the Chinese character transcription problem was solved by use of the standard telegraph codes invented a century ago, and the input data was punched on cards. But in 1978 the system was updated by the addition of word-processing equipment for on-line data entry and pre/post-editing.</Paragraph> <Paragraph position="6"> It is not clear how general the techniques behind CULT are -- whether, for example, it could be applied to the translation of other texts -- nor how cost-effective it is in operation. Other factors may justify its continued use. 
It is also unclear whether R&amp;D is continuing, or whether CULT, like METEO, is unsuited to design modification beyond a certain point already reached. In the absence of answers to these questions, and perhaps despite them, CULT does appear to be an MAT success story: the amount of post-editing said to be required is trivial -- limited to the re-introduction of certain untranslatable formulas, figures, etc., into the translated output. At some point, other translator intervention is required, but it seems to be limited to the manual inflection of verbs and nouns for tense and number, and perhaps the introduction of a few function words such as English determiners.</Paragraph> <Paragraph position="7"> ALPS - Automated Language Processing Systems ALPS was incorporated by another group of Brigham Young University workers, around 1979; while the group forming Weidner was composed mostly of the programmers interested in producing a fully-automatic MT system, the group forming ALPS (reusing the old BYU acronym) was composed mostly of linguists interested in producing machine aids for human translators (dictionary look-up and substitution, etc.) \[Melby and Tenney, personal communication\]. Thus the ALPS system is interactive in all respects, and does not seriously pretend to perform translation at all; rather, ALPS provides the translator with a set of software tools to automate many of the tasks encountered in everyday translation experience. ALPS adopted the tools originally developed at BYU -- and hence, the language pairs the BYU system had supported: English into French, German, Portuguese, and Spanish. Since then, other languages (e.g., Arabic) have been announced, but their commercial status is unclear.</Paragraph> <Paragraph position="8"> The ALPS system is intended to work on any of three &quot;levels&quot; -- providing capabilities from simple dictionary lookup on demand to word-for-word (actually, term-for-term) translation and substitution into the target text. 
The central tool provided by ALPS is a menu-driven word-processing system coupled to the on-line dictionary. One of the first ALPS customers seems to have been Agnew TechTran -- a commercial translation bureau which acquired the ALPS system for in-house use. Recently, another change of ownership and consequent shake-up at Weidner Communications Systems, Inc., has allowed ALPS to hire a large group of former Weidner workers, leading to speculation that ALPS might itself be intending to enter the MT arena.</Paragraph> <Section position="1" start_page="553" end_page="554" type="sub_section"> <SectionTitle> Current Research and Development </SectionTitle> <Paragraph position="0"> In addition to the organizations marketing or using existing M(A)T systems, there are several groups engaged in on-going R&amp;D in this area. Operational (i.e., marketed or used) systems have not yet resulted from these efforts, but deliveries are foreseen at various times in the future. We will discuss the major Japanese MT efforts briefly (as if they were unified, in a sense, though for the most part they are actually separate), and then the major U.S. and European MT systems at greater length.</Paragraph> <Paragraph position="1"> MT R&amp;D in Japan In 1982 Japan electrified the technological world by widely publicizing their new Fifth Generation project and establishing the Institute for New Generation Computer Technology (ICOT) as its base. Its goal is to leapfrog Western technology and place Japan at the forefront of the digital electronics world in the 1990's. MITI (Japan's Ministry of International Trade and Industry) is the motivating force behind this project, and intends that the goal be achieved through the development and application of highly innovative techniques in both computer architecture and Artificial Intelligence.</Paragraph> <Paragraph position="2"> Of the research areas to be addressed by the ICOT scientists and engineers, Machine Translation plays a prominent role. 
Among the western Artificial Intelligentsia, the inclusion of MT seems out of place: AI researchers have been trying (successfully) to ignore all MT work in the two decades since the ALPAC debacle, and almost universally believe that success is impossible in the foreseeable future -- in ignorance of the successful, cost-effective applications already in place. To the Japanese leadership, however, the inclusion of MT is no accident. Foreign language training aside, translation into Japanese is still one of the primary means by which Japanese researchers acquire information about what their Western competitors are doing, and how they are doing it. Translation out of Japanese is necessary before Japan can export products to its foreign markets, because the customers demand that the manuals and other documentation not be written only in Japanese. The Japanese correctly view translation as necessary to their technological survival, but have found it extremely difficult to accomplish by human means. Accordingly, their government has sponsored MT research for several decades. There has been no rift between AI and MT researchers in Japan, as there has been in the West -- especially in the U.S. MT may even be seen as the key to Japan's acquisition of enough Western technology to train their scientists and engineers, and thus accomplish their Fifth Generation project goals.</Paragraph> <Paragraph position="3"> Nomura \[82\] numbers the MT R&amp;D groups in Japan at more than eighteen. (By contrast, there might be a dozen significant MT groups in all of the U.S. and Europe, including commercial vendors.) Several of the Japanese projects are quite large. (By contrast, only one MT project in the western world \[EUROTRA\] even appears as large, but most of the 80 individuals involved work on EUROTRA only a fraction of their time.) Most of the Japanese projects are engaged in research as much as development. (Most Western projects are engaged in development.) 
Japanese progress in MT has not come fast: until a few years ago, their hardware technology was inferior; so was their software competence, but this situation has been changing rapidly. Another obstacle has been the great differences between Japanese and Western languages -- especially English, which is of greatest interest to them -- and the relative paucity of knowledge about these differences. The Japanese are working to eliminate this ignorance: progress has been made, and production-quality systems already exist for some applications. None of the Japanese MT systems are &quot;direct,&quot; and all engage in &quot;global&quot; analysis; most are based on a transfer approach, but a few groups are pursuing the interlingua approach.</Paragraph> <Paragraph position="4"> MT research has been pursued at Kyoto University since 1968. There are now two MT projects at Kyoto (one for near-term application, one for long-term research). The former has developed a practical system for translating English titles of scientific and technical papers into Japanese \[Nagao, 80, 82\], and is working on other applications of English-Japanese \[Tsujii, 82\] as well as Japanese-English \[Nagao, 81\]. The other group at Kyoto is working on an English-Japanese translation system based on formal semantics (Cresswell's simplified version of Montague Grammar \[Nishida et al., 82, 83\]). Kyushu University has been the home of MT research since 1955, with projects by Tamachi and Shudo \[74\]. The University of Osaka Prefecture and Fukuoka University also host MT projects.</Paragraph> <Paragraph position="5"> However, most Japanese MT research (like other research) is performed in the industrial laboratories. Fujitsu \[Sawai et al., 82\], Hitachi, Toshiba \[Amano, 82\], and NEC \[Muraki &amp; Ichiyama, 82\], among others, support large projects generally concentrating on the translation of computer manuals. 
Nippon Telegraph and Telephone is working on a system to translate scientific and technical articles from Japanese into English and vice versa \[Nomura et al., 82\], and is looking into the future as far as simultaneous machine translation of telephone conversations \[Nomura, personal communication\].</Paragraph> <Paragraph position="6"> The Japanese industrialists are not confining their attention to work at home. Several AI/MT groups in the U.S. (e.g., SRI, U. Texas) have been approached by Japanese companies desiring to fund MT R&amp;D projects. More than that, some U.S. MT vendors (SYSTRAN and Weidner, at least) have recently sold partial interests to Japanese investors. Various Japanese corporations (e.g., NTT and Hitachi) and trade groups (e.g., JEIDA</Paragraph> </Section> <Section position="2" start_page="554" end_page="554" type="sub_section"> <SectionTitle> \[Japan Electronic Industry Development </SectionTitle> <Paragraph position="0"> Association\]) have sent teams to visit MT projects around the world and assess the state of the art.</Paragraph> <Paragraph position="1"> University researchers have been given sabbaticals to work at Western MT centers (e.g., Shudo at Texas, Tsujii at Grenoble). Other representatives have indicated Japan's desire to participate in the CEC's EUROTRA project \[Margaret King, personal communication\]. Japan evidences a long-term, growing commitment to acquire and develop MT technology. The Japanese leadership is convinced that success in MT is vital to their future.</Paragraph> </Section> </Section> <Section position="7" start_page="554" end_page="555" type="metho"> <SectionTitle> METAL </SectionTitle> <Paragraph position="0"> Of the major MT R&amp;D groups around the world, it would appear that the new METAL project at the Linguistics Research Center of the University of Texas is closest to delivering a product. 
The METAL German-English system passed tests in a production-style setting in late 1982, mid-1983, and early 1984, and the system has been installed at the sponsor's site in Germany for further testing and final development of a translator interface.</Paragraph> <Paragraph position="1"> The METAL dictionaries are being expanded for maximum possible coverage of selected technical areas in anticipation of production use in 1984.</Paragraph> <Paragraph position="2"> Commercial introduction is also a possibility.</Paragraph> <Paragraph position="3"> Work on other language pairs has begun: English-German is now underway, and Spanish and Chinese are in the target language design stage.</Paragraph> <Paragraph position="4"> One of the particular strengths of the METAL system is its accommodation of a variety of linguistic theories/strategies. The German analysis component is based on a context-free phrase-structure grammar, augmented by procedures with facilities for, among other things, arbitrary transformations.</Paragraph> <Paragraph position="5"> The English analysis component, on the other hand, employs a modified GPSG approach and makes no use of transformations. Analysis is completely separated from transfer, and the system is multi-lingual in that a given constituent structure analysis can be used for transfer and synthesis into multiple target languages. Experimental translation of English into Chinese (in addition to German) will soon be underway; translation from both English and German into Spanish is expected to begin in the immediate future.</Paragraph> <Paragraph position="6"> The transfer component of METAL includes two transformation packages, one used by transfer grammar rules and the other by transfer dictionary entries; these co-operate during transfer, which is effected during a top-down exploration of the \[highest-scoring\] tree produced in the analysis phase. 
The strategy for the top-down pass is controlled by the linguist who writes the transfer rules; these in turn are paired 1-1 with the grammar rules used to perform the original analysis, so that there is no need to search through a general transfer grammar to find applicable rules (potentially allowing application of the wrong ones). As implied above, structural and lexical transfer are performed in the same pass, so that each may influence the operation of the other; in particular, transfer dictionary entries may specify the syntactic and/or semantic contexts in which they are valid. If no analysis is achieved for a given input, the longest phrases which together span that input are selected for independent transfer and synthesis, so that every input (a sentence, or perhaps a phrase) results in some translation.</Paragraph> <Paragraph position="7"> In addition to producing a translation system per se, the Texas group has developed software packages for text processing (so as to format the output translations like the original input documents), data base management (of dictionary entries and grammar rules), rule validation (to eliminate most errors in dictionary entries and grammar rules), dictionary construction (to enhance human efficiency in coding lexical entries), etc. Aside from the word-processing front-end (being developed by Siemens, the project sponsor), the METAL group is developing a complete system, rather than a basic machine translation engine that leaves much drudgery for its human developers/users. Lehmann et al. \[81\], Bennett \[82\], and Slocum \[83, 84\] present more details about the METAL system.</Paragraph> </Section> </Paper>