<?xml version="1.0" standalone="yes"?> <Paper uid="P84-1051"> <Title>MACHINE TRANSLATION : WHAT TYPE OF POST-EDITING ON WHAT TYPE OF DOCUMENTS FOR WHAT TYPE OF USERS</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> MACHINE TRANSLATION : WHAT TYPE OF POST-EDITING ON WHAT TYPE OF DOCUMENTS FOR WHAT TYPE OF USERS Anne-Marie LAURIAN </SectionTitle> <Paragraph position="0"> Centre National de la Recherche Scientifique Université de la Sorbonne Nouvelle - Paris III 19 rue des Bernardins, 75005 Paris (France)</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> ABSTRACT </SectionTitle> <Paragraph position="0"> Various typologies of technical and scientific texts have already been proposed by authors involved in multilingual transfer problems. They were usually aimed at a better knowledge of the criteria for deciding whether a document has to be or can be machine translated. Such a typology could also lead to a better knowledge of the typical errors occurring, and so lead to more appropriate post-editing, as well as to improvements in the system.</Paragraph> <Paragraph position="1"> Since raw translations are usable, as they quite often are for rapid information needs, it is important to draw the line between a style adequate for rapid information and an elegant, high-quality style such as is required for the large-scale dissemination of information. Style could be given a new definition through a linguistic analysis based on machine translation, on communication situations and on the users' requirements and satisfaction.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> I. MACHINE TRANSLATION AND POST-EDITING, A EUROPEAN EXAMPLE </SectionTitle> <Paragraph position="0"> Machine translation is often considered as a project, an experimental process, if not an impossible dream. 
Translation theoreticians would say that no machine can understand the meaning of a text and re-express it in another language, so no machine can translate.</Paragraph> <Paragraph position="1"> The debate sets the necessity of a deep semantic understanding for translating against the view that knowledge of language structure is sufficient to produce a translation. The usual debate is thus about the ideal concept each person has of what a translation should be.</Paragraph> <Paragraph position="2"> Translation can only be defined in particular situations, with regard to particular documents. And machine translation is only to be used for certain types of documents, to be handled in a certain way.</Paragraph> <Paragraph position="3"> My observations are based on several studies I carried out on the SYSTRAN output produced in Luxembourg within the Commission of the European Communities.</Paragraph> <Paragraph position="4"> In Luxembourg the volume of documents to be translated is not only very large, it is also growing very fast. The European rule is that all official documents have to be translated into the seven official languages; technical documents needed for conferences or experts' meetings are sometimes translated into only three or four languages (English, French, German, Italian). The time available is often very short. That led the C.E.C.</Paragraph> <Paragraph position="5"> General Direction for Multilingual Transfers to promote machine translation. When they started, some six years ago, SYSTRAN was the only system ready to produce translations. This system, which originated in the U.S., has since been developed for the Commission's own use.</Paragraph> <Paragraph position="6"> The output was far from perfect, and far from usable as it was. Post-editing was required. Even with the huge progress in output quality, post-editing is still necessary. 
It will, in fact, always be necessary because, as people get used to having their translations done by a computer, their requirements become more precise. The errors one would accept at an experimental stage are no longer acceptable at a production stage.</Paragraph> <Paragraph position="7"> Post-editing is thus becoming a new specialization within the numerous fields related to translation.</Paragraph> <Paragraph position="8"> II - A TYPOLOGY OF DOCUMENTS</Paragraph> </Section> <Section position="4" start_page="0" end_page="236" type="metho"> <SectionTitle> BASED ON M.T. ERRORS </SectionTitle> <Paragraph position="0"> Not all documents are suitable for machine translation. Many negative reactions against M.T. have been provoked by inappropriate use of M.T. Aware of the need to differentiate between documents, people responsible for translation proposed several typologies. They were mainly based on the subject field of the text, on its function, on its structure, on sentence and paragraph length and complexity, and on the use of particular terminologies.</Paragraph> <Paragraph position="1"> The aim was to enable the head of a translation division to choose which texts were to be sent to a human translator, and which could be processed by M.T.</Paragraph> <Paragraph position="2"> My study of the errors remaining in the raw translations led me to propose a strictly linguistic typology. There are three major types of errors: 1. errors on isolated words, 2. errors on the expression of relations, 3. 
errors on the structure and on the information display.</Paragraph> <Paragraph position="3"> These errors are classified in three tables: 1.1 vocabulary, terminology, 1.2 proper names and abbreviations, 1.3 relators: - in nominal groups, - in verbal groups, 1.4 noun determinants, verbal modifiers; 2.5 verb forms (tense), 2.6 verb forms (passive/active) and personalization (passive/non-personal), 2.7 expression of modality or not, 2.8 negation; 3.9 logical relations, phrase introducers, 3.10 word order, 3.11 general problems of incidence.</Paragraph> <Paragraph position="4"> The relative frequency of these errors can be read in my tables.</Paragraph> <Paragraph position="5"> These tables can be used to evaluate the probable quantity and location of errors existing after M.T., i.e. the probable quantity, location and type of post-editing. With a short training in linguistics, anyone could learn to use these tables. By rapidly reading the documents to be translated on the basis of these features, and according to the relative frequency of one category of probable errors or another, one could then easily evaluate whether a document should be translated by a translator or is suitable for M.T.</Paragraph> </Section> </Paper>