<?xml version="1.0" standalone="yes"?> <Paper uid="C88-1067"> <Title>TRADITIONAL MEANS IN MACHINE TRANSLATION</Title> <Section position="2" start_page="0" end_page="329" type="ackno"> <SectionTitle> Abstract: </SectionTitle> <Paragraph position="0"> The chronic problems of machine translation cannot be solved in a fully automatic way. Human intervention is inevitable.</Paragraph> <Paragraph position="1"> The development of &quot;traditional&quot; means in connexion with advances of computer technology represents a most substantial contribution to further progress in the field of machine translation. Some of the problems are illustrated using the example of the APAC32 project. 1. The hopes for a successful solution of the chronic problems of machine translation (MT) have long been set on two fruitful and mutually dependent prospects: research in artificial intelligence (AI) and advances in computing technology. The importance of the latter contribution is beyond dispute. As regards the former domain, some reservations must be voiced.</Paragraph> <Paragraph position="2"> 1.1. It can be stated with some tolerance that the missing information required for automatic understanding (or disambiguation) of natural language (NL) is supposed to be supplied by a computer model of the knowledge corresponding to the universe of discourse. The context of the analysed message constitutes an important part of this universe. Therefore, an essential component of such a model must draw on the texts processed.</Paragraph> <Paragraph position="3"> Thus, irrespective of the contingent form, organisation, etc., of the whole, the model would at least partially depend on the results of the analysis for which it is supposed to provide necessary information. This means that circularity is imminent. 
Even if the almost inevitable occurrence of elements not covered by any device in the system is disregarded, it is obvious that the model can be neither complete nor consistent.</Paragraph> <Paragraph position="4"> Since there will always remain threats of failure caused not by accidental factors but by the intrinsic inadequacy of any system of MT, human intervention is inevitable, and the ideal of &quot;fully automatic high quality translation&quot; (FAHQT) (which, we suspect, is no longer believed to be attainable anyway) is impossible.</Paragraph> <Paragraph position="5"> 1.2. 2. 2.1.</Paragraph> <Paragraph position="6"> While not denying the potential merits of the contribution of AI, the above discussion should suggest that the development of means called &quot;traditional&quot; is equally important for MT. An example of an approach based on such means is our experimental system APAC32.</Paragraph> <Paragraph position="7"> If we refer to our system here, it is not to boast that we have achieved any extraordinary success or that we have long since duly appraised the above conclusions and reacted to them in an original way, etc.</Paragraph> <Paragraph position="8"> It is only to illustrate our conviction that there is still a fairly wide and long path open ahead of us within the confines of the traditional means. To tell the truth, it has been our material situation that forced us to rely exclusively on them and to dispense with anything more sophisticated. This had to be said to clear us of a suspicion that we are making a virtue of necessity.</Paragraph> <Paragraph position="9"> APAC32 is a descendant of the Montreal TAUM series. It has been implemented on computers of the IBM 370 type to translate English abstracts in microelectronics and, later, pumping machinery into Czech. 
Using Colmerauer's Q-systems, the main part of the program builds linearized rooted-tree-like structures, which stepwise identify and interpret elements or groups of elements of the input units, stating their character and function, dependency relations and position in the sentential context.</Paragraph> <Paragraph position="10"> Strings with multiple interpretations which had not been eliminated are represented by parallel structures giving multiple parses in the final stage of the analysis, but not necessarily multiple translations at the final output.</Paragraph> <Paragraph position="11"> Basic or fully accomplished structures, which resemble predicate calculus patterns, have a finite verb at their root and individual participants in dependent positions. The sense (direction to the left or right) of an oriented edge (an arrow) representing a dependency relation - information pertaining to the mutual projective position of the incident nodes - as well as the function of a dependent participant are indicated in a way that simulates the marking of edges in a graph. The synthesis starts by disintegrating the structures that result from the analysis. At the output of this stage, relatively simple trees representing individual words appear, with all the information necessary for generating forms of the target language. This proceeds in steps in which occasionally additional target-language-specific information has to be derived to render the synthesized structures complete and acceptable. 
Such adjustments are usually connected with the operations of transfer: while the action of its general rules mostly coincides with the opening phases of the synthesis, the information concerning the particular changes is contained in the dictionaries to be exploited in the concluding parts of the program.</Paragraph> <Paragraph position="12"> The absence of any accomplished model of the universe of discourse and the temporary abandonment (for technical reasons) of any device allowing the involvement of hypersentential context in the analysis have, of course, endowed the system with a typical probabilistic character. In this connexion, especially the tactics occasionally referred to as &quot;preferential&quot; must be mentioned: some rules are applied repeatedly in subsequent stages, each time with less rigid conditions. The combinatorial power of the Q-systems had to be reduced by introducing several stages - partial grammars - operating before the syntactic analysis proper. Thus, e.g., a (partial) analysis of nominal complexes precedes that of verbal structures. 
Therefore, a special device registers schematically the context of each element in the sentence.</Paragraph> <Paragraph position="13"> ing the source language and that pertaining to the target language. These structures can be separated; they have been put together whenever possible with regard to the efficiency of the system. The internal structure of both these parts is almost the same and can be briefly described as follows: categorial information, lexical value, paradigmatic information, pointers to parallel meanings, valency frame, combinatory frame (prepositional, phrasal, special-liaison, etc., patterns), terminological specifications, special syntactic information, semantic features. Extensive though this apparatus may be, it should be stated that there are still possibilities - and a need, of course - to add further data. For lack of space, let us confine ourselves to three points only. 2.3.1.1. The apparatus of semantic features consists of four classes of features: a) features concerning the text vs. metatext structure, b) general semantic features, c) domain specific features, and d) features concerning terminological status. The number of features is limited for reasons of which the most important is that excessive detailedness leads to unwanted rigidity.</Paragraph> <Paragraph position="14"> However, a number of potentially very useful candidate features can be added. Assigning weights to features might be a solution to this dilemma, especially in the framework of the &quot;preferential&quot; tactics.</Paragraph> <Paragraph position="15"> 2.3.1.2. Some classes of words have been further classified to highlight their intrinsic properties in the translation environment.</Paragraph> <Paragraph position="17"> E.g., a special classification of verbs makes it possible to solve, at least in part, the problems of aspect in Czech in relation to English verbal adjectives (-ED, -ING forms). 
Much more can be done in this direction. Unfortunately, this will imply extensive empirical work including excerption and, if possible, organization of a usage-panel-like inquiry.</Paragraph> <Paragraph position="18"> As concerns combinatory frames, more information will also be added on the possibilities of adverbial modification of nouns. Some changes and additions to the present organisation and contents of the dictionary entries are considered with a view to structures suggested in the Mel'chuk-Apresyan &quot;meaning - text&quot; model.</Paragraph> <Paragraph position="19"> A specific dictionary device has been introduced in the terminological section of the dictionary system. Special rules control, or rather, guide the analysis of terminological complexes, making it possible to de- In this way a partial quasi-model of the specific domain can be formed whose elements are capable of recursive application to new combinations. Another dictionary device deals with unrecognized elements - the so-called transducing dictionary (TD). TD relies on derivational morphology, assigning categorial information, and, in some cases, semantic status and other information to words hitherto &quot;unknown&quot; to the system, on the basis of their endings (e.g., -ING, -ED, -ESS, -ITY, -ION, -LY, -WISE, -PY, etc.); for some of them even successful adaptation to the target language is possible. The remaining unrecognized elements are regarded as nouns: first as proper, then, if this fails to be confirmed, common. 
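The ending-based assignment described for the TD can be pictured with a minimal sketch in Python. This is only an illustration under stated assumptions: the suffix-to-category table, the function name guess_category and the confirmed_proper flag are hypothetical, since the text gives neither the actual rules nor the category inventory of APAC32.

```python
# Hypothetical suffix table. The category labels are illustrative assumptions,
# not the categories actually used in APAC32.
SUFFIX_CATEGORIES = [
    ("WISE", "adverb"),
    ("ING", "verbal form"),
    ("ESS", "noun"),
    ("ITY", "noun"),
    ("ION", "noun"),
    ("ED", "verbal form"),
    ("LY", "adverb"),
]


def guess_category(word, confirmed_proper=False):
    """Guess a category for a word that is missing from the dictionary."""
    upper = word.upper()
    for suffix, category in SUFFIX_CATEGORIES:
        # Require at least two stem characters so that a bare ending
        # such as "ED" is not treated as a derived word.
        if upper.endswith(suffix) and len(upper) - len(suffix) >= 2:
            return category
    # Fallback taken from the text: proper noun first, common noun
    # if the proper-noun reading is not confirmed.
    return "proper noun" if confirmed_proper else "common noun"


if __name__ == "__main__":
    for token in ["clockwise", "resistivity", "doped", "wafer"]:
        print(token, guess_category(token))
```

An ordered list is used here rather than a mapping so that longer or more specific endings can be placed ahead of shorter ones; how the real system ordered its rules is not stated.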
A more versatile practice is planned, which will take into consideration other possible interpretations as well.</Paragraph> <Paragraph position="20"> TD, as well as some other devices and rules, can also be regarded as special fail-soft measures, though another component called &quot;emergency rules&quot; is included, which performs this function as a specialized set of rules designed to reconstruct, complete or integrate into a (would-be) meaningful whole those structures that failed to reach the stage of an accomplished parse.</Paragraph> <Paragraph position="21"> 2.4. 2.5.</Paragraph> <Paragraph position="22"> In some respects, the role of such measures is problematic in relation to human intervention. Our system offers possibilities to introduce a special diagnostic device to recognize and classify the symptoms of a failure, so that more than the present simple marking of &quot;suspicious&quot; or &quot;underdone&quot; outputs can be presented to aid the postedition.</Paragraph> <Paragraph position="23"> Ambiguities are treated in the usual way. It should be pointed out that in the translation between the languages in question, the principles of agreement so widely applied in Czech unmercifully reduce the chances of getting over some types of unsolved ambiguities in an &quot;imperceptible&quot;, i.e., accidental, way. These principles, as a rule, obstinately insist upon rendering implicit information explicit. That is why in some cases structures with ambiguous reference are translated by equivalents equally ambiguous or vague. 
E.g., with some classes of verbs, (clausal) participial modification with ambiguous dependence is replaced by prepositional or other constructions without any direct dependence: e.g., USING -&gt; WITH USING, CAUSING -&gt; WHICH (referring to the whole of the preceding or pertinent clause) CAUSES, etc.</Paragraph> <Paragraph position="24"> This also concerns contrastive ambiguities and other asymmetrical relations between the two languages. In this connexion, it should be pointed out that one of the criteria for the classification of English verbs is the classification of their Czech counterparts. Thus, e.g., the verb SUPPOSE must be assigned information that the construction SOMEONE IS SUPPOSED TO... must be transformed to IT IS SUPPOSED (ABOUT SOMEONE) THAT SOMEONE... to make it correspond to the structure acceptable in Czech.</Paragraph> <Paragraph position="25"> Similarly, constructions like SEAT SAT ON BY... must be transformed with the aid of corresponding relative clauses. Much remains to be done for the domain of conversion. Its productive aspects pose serious problems.</Paragraph> <Paragraph position="26"> 3. To come back to the opening paragraphs: the advances of computer technology, while not offering an ultimate solution of problems detrimental to the efforts to achieve the ideals of FAHQT, will undoubtedly liberate MT from the curse entailed by its usually more or less immediate subservience to various practical applications - the strict limitations of computer time and storage - which so often represented the only obstacles to introducing many a useful and, sometimes, even very necessary device, process or approach. Most of the prospective extensions, innovations and other changes require profound empirical examination and more linguistic field-work than, up to now, we were able to expend.</Paragraph> </Section> </Paper>