<?xml version="1.0" standalone="yes"?> <Paper uid="C82-1034"> <Title>MULTI-LEVEL TRANSLATION AIDS IN A DISTRIBUTED SYSTEM</Title> <Section position="3" start_page="215" end_page="215" type="metho"> <SectionTitle> 216 A.K. MELBY </SectionTitle> <Paragraph position="0"> terminology aids to full machine translation. All three levels are fully integrated and the translator can quickly switch from one level to another even within the translation of a single sentence. This means that the translation process can continue smoothly regardless of how many sentences fail to receive a full analysis and a good machine translation. This in turn means that the actual machine translation component can be &quot;pure&quot; in the sense that no compromises need be made to ensure some kind of output even on sentences that are not analyzable with the current parser and model of language.</Paragraph> <Paragraph position="1"> It is hoped that the above design will solve the three problems under discussion.</Paragraph> <Paragraph position="2"> Placing the translators in control of the operation of the system should improve their attitude. Using multiple levels of aid should overcome the dangers of the &quot;all or nothing&quot; approach. And replacing conventional terminals with microcomputers should overcome some of the problems of centralized processing. Solving these user-oriented problems is important from a theoretical viewpoint because even a research translation system desperately needs user feedback from real translators. 
And real translators will not give the needed feedback unless the system is practical and user-friendly.</Paragraph> <Paragraph position="3"> The rest of the paper will elaborate on each of the three problems and their proposed solution in the new version of ITS.</Paragraph> </Section> <Section position="4" start_page="215" end_page="215" type="metho"> <SectionTitle> PROBLEM ONE: HUMAN FACTORS </SectionTitle> <Paragraph position="0"> Lacking FAHQT, human translators and revisors are still needed in a computerized translation system. In ITS version one, translating a text involved asking questions about each sentence of the text before the translation of the first sentence appeared.</Paragraph> <Paragraph position="1"> When the translated sentences finally did appear, the translator/revisor was expected to examine and then revise them as needed but not to retranslate them from the source text. After all, this was a human-assisted MACHINE translation system and we had already invested considerable interaction time and machine time in the translation of each sentence. The translator/revisor was to remove the errors from the machine's translation and no more. Understandably, the human translator/revisor often felt more like a &quot;garbage collector&quot; than a translator.</Paragraph> <Paragraph position="2"> Having an unhappy translator is a serious problem. It should be remedied, if possible, for two reasons: (1) We should be concerned for the translator as a person. (2) An unhappy translator will fight the system. Consider the following statement by a human translator: During my years with JPRS . . . I had occasion to do some post-editing of machine translations, in addition to my normal assignments .... Monetary considerations aside, the work was odious. 
To post-edit, a conscientious translator had to literally retranslate every sentence in the original, compare it word for word with the clumsy machine attempts, and then laboriously print in corrections between the lines of the printout. It would have been much faster--and less tedious--just to translate &quot;from scratch&quot; and dictate the translation on tape, as I normally do.</Paragraph> <Paragraph position="3"> And I am sure the product would have been better. It was thus my impression that post-editing of machine translations is translation work at coolie wages. I can't imagine anyone wanting to do it unless the alternative was starvation. (Silverstein, 1981) Seppanen (1979) claims that relatively little attention has been paid to the pragmatic aspects of man/machine dialogues. He claims that human factors in man/machine interfaces have not attracted the interest of either computer scientists or psychologists. Perhaps, then, human factors in computerized translation systems are an appropriate area of interest for computational linguists, and this view seems to be</Paragraph> </Section> <Section position="5" start_page="215" end_page="215" type="metho"> <SectionTitle> MULTI-LEVEL TRANSLATION AIDS IN A DISTRIBUTED SYSTEM 217 </SectionTitle> <Paragraph position="0"> gaining momentum from within the field. Researchers at the Grenoble project have concluded: The human and social aspects should not be neglected. To force a rigid system on revisors and translators is a guarantee of failure. It must be realized that AT (Automatized Translation) can only be introduced step by step into some preexisting organizational structure. 
The translators and revisors of the EC did not only reject Systran because of its poor quality but also because they felt themselves becoming &quot;slaves of the machine&quot;, and condemned to a repetitive and frustrating kind of work.</Paragraph> <Paragraph position="1"> (Boitet et al., 1980) Our answer to the problem of human factors is to place the translator in control. The translator uses human judgment to decide when to post-edit and when to translate. Nothing is forced upon the translator. This approach is strongly argued for by Kay (1980) when he states: &quot;The kind of translation device I am proposing will always be under the tight control of a human translator&quot;. And Lippman (1977) describes a successful terminology aids experiment in Mannheim and concludes: &quot;The fact that quality was improved, rather than degraded as in the case of MT, appears to support the soundness of an approach where the translator retains full control of the translation process.&quot; PROBLEM TWO: THE &quot;ALL OR NOTHING&quot; SYNDROME Originally, FAHQT was the only goal of research in machine translation. Until recently, there seemed to be a widely shared assumption that the only excuse for the inclusion of a human translator in a machine translation system was as a temporary, unwanted appendage to be eliminated as soon as research progressed a little further. This &quot;all or nothing&quot; syndrome drove early machine translation researchers to aim for FAHQT or nothing at all. It is now quite respectable in computational linguistics to develop a computer system which is a TOOL used by a human expert to access information helpful in arriving at a diagnosis or other conclusion. 
Perhaps, then, it is time to entertain the possibility that it is also respectable to develop a machine translation system which includes sophisticated linguistic processing yet is designed to be used as a tool for the human translator.</Paragraph> <Paragraph position="2"> If you expect each sentence of the final translation to be a straight machine translation or at worst a slight revision of a machine translated sentence, then you are setting yourself up for a fall. Remember Brinkmann's conclusion that &quot;the post-editing effort required to provide texts having a correctness rate of 75 or even 80 percent with the corrections necessary to reach an acceptable standard of quality is unjustifiable as far as expenditure of money and manpower is concerned&quot; (Brinkmann, 1980). Thus, a strict post-edit approach must be nearly perfect or it is almost useless. Many projects start out with high goals, assuming that post-editing can surely rescue them if their original goals are not achieved. Even post-editing may not make the system viable.</Paragraph> <Paragraph position="3"> The proposed solution to this problem is to anticipate from the beginning that not every sentence of every text will be translated by computer and find its way to the target text with little or no revision. Then an effort can be made from the beginning to provide for a smooth integration of human and machine translations. ITS version two will have three integrated levels of aid under the control of the translator. We will now describe the three levels of translator aids.</Paragraph> <Paragraph position="4"> Level one translator aids can be used immediately even without the source text being in machine-readable form. In other words, the translator can sit down with a source text on paper and begin translating much as if at a typewriter. Level one includes a text processor with integrated terminology aids. 
For familiar terms that recur there is a monolingual expansion code table which allows the user to insert user-defined abbreviations in the text and let the machine expand them. This feature is akin to the &quot;macro&quot; capability on some word processors. The key can be several characters long instead of a single control character, so the number of expansion codes available is limited principally by the desire of the translator. Level one also provides access to a bilingual terminology data bank. There is a term file in the microcomputer itself under the control of the individual translator. The translator also has access to a larger, shared term bank (through telecommunications or local network). Level one is similar to a translator aid being developed by Leland Wright, chairman of the Terminology Committee of the American Translators Association. Ideally, the translator would also have access to a data base of texts (both original and translated) which may be useful as research tools.</Paragraph> <Paragraph position="5"> Level two translator aids require the source text to be in machine-readable form.</Paragraph> <Paragraph position="6"> Included in level two are utilities to process the source text according to the desires of the translator. For example, the translator may run across an unusual term and request a list of all occurrences of that term in that text. Level two also includes a &quot;suggestion box&quot; option (Melby, 1981) which the translator can invoke. This feature causes each word of the current text segment to be automatically looked up in the term file and displays any matches in a field of the screen called the suggestion box. If the translator opts to use the suggested translation of a term, a keystroke or two will insert it into the text at the point specified by the translator. 
If the translator desires, a morphological routine can be activated to inflect the term according to evidence available in the source and target segments.</Paragraph> <Paragraph position="7"> Level three translator aids integrate the translator work station with a full-blown MT system. The MT component can be any machine translation system that includes a self-evaluation metric. The system uses that metric to assign to each of the translated sentences a quality rating (e.g. &quot;A&quot; means probable human quality, &quot;B&quot; means some uncertainty about parsing or semantic choices made, &quot;C&quot; means probable flaw, and &quot;D&quot; is severely deficient). On any segment, the translator may request to see the machine translation of that segment. If it looks good, the translator can pull it down into the work area, revise it as needed, and thus incorporate it into the translation being produced by the translator. Or the translator may request to see only those sentences that have a rating above a specified threshold (e.g. above &quot;C&quot;). Of course, the translator is NEVER obliged to use the machine translation unless the translator feels it is more efficient to use it than to translate manually. No pressure is needed other than the pressure to produce rapid, high-quality translations. If using the machine translations makes the translation process go faster and better, then the translator will naturally use them.</Paragraph> <Paragraph position="8"> The successful METEO system by TAUM (Montreal) expresses the essence of this approach. All sentences go into the MT system. The system evaluates its own output and accepts about 80 percent of the sentences. Those sentences are used without post-editing. The other 20 percent are translated by a human and integrated into the machine-translated sentences. This application differs from ours in that human translators do not see any machine translations at all--good or bad. 
But the basic level three approach is there.</Paragraph> <Paragraph position="9"> One positive aspect of this three level approach is that while level three is dramatically more complex linguistically and computationally than level two, level three appears to the translator to be very similar to level two. Level two presents key terms in the sentence; level three presents whole sentences. When good level three segments are available, they can speed up the translation considerably but their absence does not stop the translation process. Thus, a multi-level system can be put into production much sooner than a conventional post-edit system. And the sooner a system is put into production, the sooner useful feedback is obtained from the users.</Paragraph> </Section> <Section position="6" start_page="215" end_page="215" type="metho"> <SectionTitle> MULTI-LEVEL TRANSLATION AIDS IN A DISTRIBUTED SYSTEM 219 </SectionTitle> <Paragraph position="0"> The multi-level approach is designed to please (a) the sponsors (because the system is useful early in the project and becomes more useful with time), (b) the users (because they are in control and choose the level of aid), and (c) the linguists and programmers (because they are not pressured to make compromises just to get automatic translation on every sentence).</Paragraph> <Paragraph position="1"> PROBLEM THREE: TRADITIONAL CENTRALIZED PROCESSING Machine translation began in the 1950's when the cost of a CPU prohibited the thought of distributed processing in which each user has a personal CPU. Interactive time-shared computing (where each user has a dumb terminal connected to a shared CPU) can give the impression that each user has a personal computer--so long as the system is not loaded down. Unfortunately, systems tend to get loaded down. Highly interactive work such as word processing is not suited to an environment where keystroke response times vary. 
Also, centralized processing requires either physical proximity to the main CPU or telecommunications lines. High speed telecommunications can be very costly, and low speed telecommunications are not user-friendly. A costly solution is to obtain a dedicated mainframe and never load it down. A more cost-effective solution in terms of today's computer systems is a distributed system in which each translator has a microcomputer tied into a loose network to share resources such as large dictionaries.</Paragraph> <Paragraph position="2"> The individual translator work station would be a microcomputer with approximately 256K of main memory, dual diskette drives, CRT, keyboard, small printer, and communications port. Such systems are available at relatively low cost (under 5 000 U.S. dollars). Additional storage for term files and text files can be obtained at reasonable cost by adding a Winchester-type disk. If several translators are in the same building, a local network can be set up to share terminology and document data bases and even inter-translator messages. The capabilities of the work station would include rapid, responsive word processing and access to internal dictionaries and to shared translator data bases (i.e. level one and level two processing). The internal dictionaries would include an expansion file and a terminology file under the control of the translator. Of course, the translator could load internal files appropriate to the subject matter of the document by inserting the appropriate diskettes. Access to source texts, document-specific dictionaries, and level three machine translations could be granted through a local network, a telecommunications network, or through the mails on diskette. Ideally, part of the machine translation would be done on the translator work station in order to allow the translator to repair level three dictionary problems before they cause repeated errors throughout a text. 
A minimal capability in the work station would be a translator-defined replacement table to correct some improper word choices that cause repeated errors in the machine translated sentences. Ultimately, microcomputers will be powerful enough to allow source text to be presented to a work station which contains full level three software. In the meantime, the raw machine translation part of level three can be done remotely on any suitable mainframe and then transmitted to a microcomputer translator work station for integration into the translation process as level three aids.</Paragraph> </Section> </Paper>