<?xml version="1.0" standalone="yes"?> <Paper uid="C92-3139"> <Title>A Statistical Approach to Machine Aided Translation of Terminology Banks</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1. Introduction </SectionTitle> <Paragraph position="0"> Existing machine translators work well for limited domains (Slocum, 1985). When an MT system is transported to another domain, among other things, the domain-specific terms have to be acquired and translated before the system can do any reasonable work again (Knowles, 1982). Current ways of handling this porting process are largely manual. Usually one either gleans domain-specific terms from a large number of documents at once and translates them one by one by hand, or translates each unknown term when it appears.</Paragraph> <Paragraph position="1"> These previous approaches all involve a large amount of effort by more than one person. The long and tedious process may often result in inconsistent translation.</Paragraph> <Paragraph position="2"> Furthermore, no dictionary is complete, but we still hope that the translation system produces some translation when encountering an unknown word.</Paragraph> <Paragraph position="3"> However, translation of terms on a one-for-one basis has no closure. When encountering an unknown term, however similar to a known one, the system will not be able to fail softly and produce some kind of reasonably acceptable translation as a human translator does.</Paragraph> <Paragraph position="4"> A similar consideration motivates text-to-speech research on producing pronunciations for unknown words through morphological decomposition (Black et al.</Paragraph> <Paragraph position="5"> 1991).</Paragraph> <Paragraph position="6"> This paper reports on a project experimenting with a new approach to this problem. 
The project involves statistical lexical acquisition from a large corpus of documents to build a terminology bank, and automatic extraction of roots from the terminology bank. The idea is to perform human translation of these roots and to translate a term by composing the translations of its constituent roots. This idea is similar to the root-oriented dictionary proposed in (Tufts and Popescu, 1991). A certain amount of post-editing is expected.</Paragraph> <Paragraph position="7"> However, overall, we expect this method to save a significant amount of human effort, produce more consistent translation, and result in better closure such that the system can fail gracefully when encountering an unknown word.</Paragraph> <Paragraph position="8"> The rest of the paper will focus on the acquisition of roots from a terminology bank. Section 2 formally states the problem. Section 3 describes our approach to root acquisition. Section 4 describes the setup of our experiments and reports some preliminary results.</Paragraph> <!-- ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992 921 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992 --> <Paragraph position="9"> Section 5 concludes the paper with some remarks and points out directions for future research.</Paragraph> </Section> </Paper>