<?xml version="1.0" standalone="yes"?> <Paper uid="C90-3006"> <Title>Towards Personal MT: general design, dialogue structure, potential role of speech</Title> <Section position="7" start_page="0" end_page="0" type="concl"> <SectionTitle> 2. Interactions concerning syntax, </SectionTitle> <Paragraph position="0"> semantics and pragmatics Until now, the system has worked directly with the text as written by the author. For the remaining types of interaction, it will work on a transcription contained in the shadow record, as well as with some intermediate forms of processing stored in associated records of the shadow file. This forces to lock the original textual field (unless the author decides to change it and accepts to start again from level two).</Paragraph> <Paragraph position="1"> Level five concerns the fixed forms. It is quite usual, especially in technical documentation, that some groups of words take a fixed meaning in certain contexts, with specific, non-compositional translations. For example, &quot;Save as&quot; as a menu item Saue as ... is translated in French as PSnregislrer sous.., and not ~ts &quot;Sauver comme&quot;, which would be correct for other uses. As a menu item, this group functions as a proper noun, not as a verbal phrase. The writer should be asked whether a given occurrence of each such group is to be treated as fixed or not. In the first case, an adequate transcription should be generated in the shadow record C&FXD_Save as&quot;, for example). Certain elements (such as menu items) should be automatically proposed for insertion in the list.</Paragraph> <Paragraph position="2"> Level six concerns lexical clarification. First, polysemies are to be solved by asking the writer. For example, the word &quot;dipl6me&quot; is not ambiguous in French. However, if translating from French into English, 2 possibilities should be given : &quot;diplfme non terminal&quot; (&quot;diploma&quot;) or &quot;dipltme terminal&quot; (&quot;degree&quot;). Some polysemies are source language specific, some depend on the target languages. We want to treat them in a uniform way, by maintaining in the lexical database the collection of all &quot;word senses&quot; Cacceptions&quot;, not really concepts of an ontology as in KBMT-89), linked by disambiguating questions/definitions to the words/terms of the languages supported by the system.</Paragraph> <Paragraph position="3"> Lexical ellipses can also be treated at that level. This problem is particularly annoying in MT. Suppose a text is about a space ship containing a &quot;centrale 61ectrique&quot; (&quot;electric plant&quot;) and a &quot;centrale inertielle&quot; (&quot;illcrtial guidance system&quot;). The complete form is often replaced by the elided one : &quot;centrale&quot;. Although it is vital to 32 3 disambiguate for translating correctly (by the corresponding elided forms: &quot;plant&quot;/&quot;system&quot;), no automatic solution is known. A given occurrence may be an elision or not. If yes, it is even more difficult to look for a candidate to the complete form in a hypertext than in a usual text.</Paragraph> <Paragraph position="4"> At level seven, the unit of translation (the content of the shadow field) has been submitted to a first step of automatic analysis, which returns a surface structure showing ambiguities of bracketing (PP attachment, scope of coordiuation...). The questions to tim writer should not be asked in linguistic terms. 
<Paragraph position="4"> At level seven, the unit of translation (the content of the shadow field) has been submitted to a first step of automatic analysis, which returns a surface structure showing ambiguities of bracketing (PP attachment, scope of coordination...). The questions to the writer should not be asked in linguistic terms. The idea is to rephrase the input text itself, that is, to present the alternatives in suggestive ways (on screen, or using speech synthesis -- see below).</Paragraph> <Paragraph position="5"> Some other ambiguities, for instance on reference (unresolved anaphora) or syntactic functions (&quot;Which firm manages this office ?&quot; -- where is the subject ?) might be detected at this stage. They may be left for the next step to solve (actually, this is a general strategy), or solved interactively at that point. In our view, that would best be done by producing paraphrases [Zajac 1988], or by &quot;template resolution&quot; [6].</Paragraph> <Paragraph position="6"> At level eight, the disambiguated surface structure has been submitted to the deep analysis phase, which returns a multilevel structure (a decorated tree encoding several levels of linguistic interpretation, universal as well as language specific). Some ambiguities may appear during this phase, and be coded in the structure, such as ambiguities on semantic relations (deep cases), deep actualisation (time, aspect...), discourse type (a French infinitive sentence may be an order or not, for example), or theme/rheme distinction. Template or paraphrase resolution will be used to disambiguate, as rephrasing of the text often cannot suffice (e.g.: &quot;the conquest of the Barbarians&quot;).</Paragraph> <Paragraph position="7"> A suggestion of [6] was to delay all interactions until transfer. The view taken here is rather to solve as soon as possible all the ambiguities which cannot be solved automatically later, or only with much difficulty.</Paragraph> <Paragraph position="8"> For example, word sense disambiguation takes place quite early in the above scheme, and that may give class disambiguation for free.</Paragraph> <Paragraph position="9"> A more flexible scheme would be to ask about word senses early only if each lemma of the considered word form has more than one acception; if not, the system could wait until after surface analysis, which reduces almost all morphosyntactic ambiguities. A variation would be to disambiguate word senses only after surface analysis has been done. A prototype should allow experimenting with various strategies.</Paragraph>
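As a rough illustration of this flexible scheme, the following Python sketch decides when each word-sense question is asked. The helpers morpho, acceptions_of, surface_analysis and ask_word_sense are assumed placeholders, not parts of the system described in the paper.

    # Sketch of the flexible ordering strategy: ask about word senses before
    # surface analysis only when every candidate lemma of the word form is
    # polysemous anyway; otherwise defer the question, since surface analysis
    # reduces almost all morphosyntactic ambiguities.  All helpers are assumed.

    def ask_early(word_form, morpho, acceptions_of):
        lemmas = morpho(word_form)                      # candidate lemmas of the word form
        return all(len(acceptions_of(l)) > 1 for l in lemmas)

    def disambiguate_unit(unit, morpho, acceptions_of, surface_analysis, ask_word_sense):
        deferred = []
        for word_form in unit:
            if ask_early(word_form, morpho, acceptions_of):
                ask_word_sense(word_form, morpho(word_form))
            else:
                deferred.append(word_form)
        tree = surface_analysis(unit)                   # resolves most lemma choices
        for word_form in deferred:
            lemma = tree.lemma_of(word_form)            # assumed accessor on the surface tree
            if len(acceptions_of(lemma)) > 1:
                ask_word_sense(word_form, [lemma])
        return tree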
<Paragraph position="10"> III. Place and quality of speech  Speech synthesis has a place not only in the translation of spoken dialogues, but also in the translation of written texts. We actually think its introduction in Personal MT could be very helpful in enhancing ergonomy and allowing for more natural disambiguation strategies.</Paragraph> <Paragraph position="11"> 1. Speech synthesis and Personal MT  Speech synthesis and MT in general  Speech synthesis of translations may be useful for all kinds of MAT. In MT for the watcher, people could access Japanese technical and scientific textual databases, for example, through rough English MT, not only over computer networks, as is currently done in Sweden [10], but also via the telephone. To produce spoken translations could be even more useful in the case of rapidly changing information (political events, weather bulletins, etc.) disseminated to a large public through computer or telephone networks.</Paragraph> <Paragraph position="12"> In the case of professional translation (MAT for the revisor or for the translator), the main area today is the translation of large technical documents. With the advent of widely available hypermedia techniques, these documents are starting to contain not only text and images, but also sound, used for instance to stress some important warning messages.</Paragraph> <Paragraph position="13"> Personal MT could be used for translating technical documents as well as all kinds of written material not relying on creative use of language (unlike poetry). It could also be used for communication within multilingual teams working together and linked by a network, or by phone. Finally, it could be used for the multilingual dissemination of information created on-line by a monolingual operator (sports events, fairs...) and made accessible in written form (electronic boards, Minitel) as well as in spoken form (loudspeakers, radio, telephone), whence the need for speech synthesis.</Paragraph> <Paragraph position="14"> Hence, spoken output does not imply spoken input, and should be considered for all kinds of machine aided translation. As complete linguistic structures of the translations are created during the MT process, speech synthesis should be of better quality than current text-to-speech techniques can provide. This does not apply to MAT for the translator, however (although the translator, being a specialist, could perhaps be asked to insert marks concerning prosody, rhythm and pauses, analogous to formatting markups).</Paragraph> <Paragraph position="15"> Speech synthesis of dialogue utterances  Dialogue utterances concern the communication between the system and the user, the translation process (reformulation, clarification), and the translation system (e.g. interrogation or modification of its lexical database).</Paragraph> <Paragraph position="16"> In Telephone Interpretation of dialogues, all dialogue utterances must obviously be in spoken form, the written form being made available only if the phone is coupled to a screen. In translation of written material, it could be attractive to incorporate speech synthesis in the dialogue itself, as an enhancement to its visual form, for the same ergonomic reasons as above, and because spoken alternatives might be intrinsically more suggestive than written ones in order to resolve ambiguities: pauses and melody may help to delimit groups and pinpoint their dependencies, while phrasal stress may give useful indications on the theme/rheme division.</Paragraph> <Paragraph position="17"> In the case of non-dialogue-based systems, there are only fixed messages, and on-line speech synthesis is not really necessary, because the acoustic codings can be precomputed. In the case of dialogue-based Machine Translation, however, an important part of the dialogue concerns variable elements, such as the translated texts or the dictionaries, where definitions or disambiguating questions could be inserted.</Paragraph> <Paragraph position="18"> Speech in PMT: synthesis of input texts or reverse translations  Speech synthesis of input seems to be required when producing a document in several languages, with some spoken parts. It would be strange if the source language documentation did not have the spoken parts, or if the author were forced to read them aloud. In the latter case, a space problem would also arise, because speech synthesis can produce an acoustic coding (later fed to a voice synthesis chip) much more compact than any representation of the acoustic signal itself.</Paragraph> <Paragraph position="19"> The concept of reverse translation could be very useful in PMT. The idea is to give to the author, who is presumed not to know the target language(s), some control over the translations. In human translation or interpretation, it often happens that the writer or speaker asks &quot;what has been translated&quot;. By analogy, a PMT system should be able to translate in reverse.</Paragraph> <Paragraph position="20"> Technically, it would do so by starting from the deep structure of the target text, and not from the target text itself, in order not to introduce spurious ambiguities (although having both possibilities could possibly help in detecting accidental ambiguities created in the target language).</Paragraph>
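A schematic rendering of this reverse-translation idea, assuming a transfer-based pipeline with hypothetical transfer() and generate() functions; nothing here is prescribed by the paper beyond the starting point, namely the deep structure of the target text.

    # Sketch of reverse translation in PMT: the check shown to the author is
    # generated from the deep structure already built for the target text,
    # not by re-analysing the target string, so no spurious ambiguities are
    # introduced.  transfer() and generate() are assumed, not real APIs.

    def translate(source_deep, transfer, generate):
        """Forward translation: keep the target deep structure for later checks."""
        target_deep = transfer(source_deep, direction="source->target")
        return target_deep, generate(target_deep, language="target")

    def reverse_translation(target_deep, transfer, generate):
        """Show the author, in the source language, what has been translated."""
        back_deep = transfer(target_deep, direction="target->source")
        return generate(back_deep, language="source")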
<Paragraph position="21"> Note that speech synthesis of reverse translations might be ergonomically attractive, even if no spoken form is required for the final results (translations or input texts), because screens tend to become cluttered with too much information, and because reading the screen in detail quickly becomes tiring.</Paragraph> <Paragraph position="22"> 2. The need for very high quality speech synthesis in DBMT  It has been surprisingly difficult for researchers in speech synthesis to argue convincingly about the need for very high quality. Current text-to-speech systems are quite cheap and seem acceptable to laymen. Of course, it is tiring to listen to them for long periods, but in common applications, such as telephone enquiry, interactions are short, or of a fixed nature (time-of-day service), in which case synthesis can proceed from prerecorded fragments.</Paragraph> <Paragraph position="23"> DBMT, as envisaged above, seems to offer a context in which very high quality could and should be demanded of speech synthesis.</Paragraph> <Paragraph position="24"> Ergonomy  First, the writer/speaker would be in frequent interaction with the system, even if each interaction is short. The overall quality of speech synthesis depends on three factors: voice synthesis (production of the signal from the acoustic coding); linguistic analysis (word class recognition, decomposition into groups), for correct pronunciation of individual words, or contextual treatment (liaisons in French); pragmatic analysis (communicative intent: speech act, theme/rheme division...), for pauses, rhythm and prosody.</Paragraph> <Paragraph position="25"> We will consider the first factor to be fixed, and work on the linguistic and pragmatic aspects.</Paragraph> <Paragraph position="26"> Of course, certain parts of the dialogue could be prerecorded, namely the messages concerning the interaction with the system itself. However, users might rather prefer a uniform quality of speech synthesis. In that case, these messages might be stored in the same acoustic coding format as the texts produced under linguistic control.</Paragraph> <Paragraph position="27"> Ambiguity resolution by rephrasing  We have seen two main ways of disambiguating structural ambiguities in DBMT, namely rephrasing and paraphrasing. Rephrasing means to present the original text in different ways. Suppose we want to disambiguate the famous sentence &quot;He saw a girl in the park with a telescope&quot; by presenting the alternatives on a screen. [Screen display, not reproduced here: the alternative groupings of &quot;He saw a girl in the park with a telescope&quot;, one bracketing per line.] If the disambiguation happens orally, the spoken forms should be presented in the same register as in the original (here, affirmative), but very clearly distinguished, so that a human could reconstruct the forms above. The availability of complete linguistic structures is necessary, but not sufficient, because understandability is not enough: distinguishability is a new requirement for speech synthesis.</Paragraph>
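A very small sketch of such a rephrasing dialogue follows. The list of readings is hard-coded and merely stands in for the output of surface analysis; the presentation function and its names are assumptions.

    # Sketch: present the bracketing alternatives of an ambiguous sentence as
    # rephrasings and let the writer pick the intended one.  The readings are
    # hard-coded stand-ins for the surface analyses of
    # "He saw a girl in the park with a telescope".

    READINGS = [
        "He saw [a girl] [in the park] [with a telescope]",
        "He saw [a girl [in the park]] [with a telescope]",
        "He saw [a girl [in the park [with a telescope]]]",
        "He saw [a girl [in the park] [with a telescope]]",
        "He saw [a girl] [in the park [with a telescope]]",
    ]

    def choose_reading(readings, speak=None):
        """Display (and optionally speak) the alternatives; return the chosen index."""
        for i, reading in enumerate(readings, 1):
            print(f"{i}. {reading}")
            if speak is not None:
                speak(reading)   # the spoken forms must remain clearly distinguishable
        return int(input("Intended reading? ")) - 1

    # choose_reading(READINGS) would ask the writer to pick one of the five readings.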
<Paragraph position="28"> Other types of linguistic interactions  In disambiguation by paraphrasing or template generation (generation of abbreviated paraphrases, as it were), questions should be generated, with their locus clearly indicated by stress and prosody. For instance: &quot;Is it the girl or the park with a telescope ?&quot; In the same manner, speech quality is very important if word sense disambiguation is done orally. Since some new words or new senses of existing words may be added by the user, the disambiguation processes should apply to their definitions in the same way as they do to the texts/utterances to be translated.</Paragraph> <Paragraph position="29"> All preceding remarks are of course even more valid in the case of oral input, where speech is the primary means of interaction, and the quality of the signal is reduced by the transmission channel.</Paragraph> <Paragraph position="30"> Conclusion  The concept of Personal MT crystallizes many ideas from previous systems and research (text-critiquing, interactive MT, dialogue-based MT, Machine Interpretation of spoken dialogues, controlled languages...). However, the perspective of interacting with the author, not required to have any knowledge of the target language(s), linguistics, or translation, puts things in an original framework.</Paragraph> <Paragraph position="31"> While the development of systems of this nature poses old problems in a new way, and offers interesting new possibilities to the developers, their acceptability and usefulness will perhaps result more from their ergonomy than from their intrinsic linguistic quality, however necessary it may be.</Paragraph> <Paragraph position="32"> Promotion of the national languages is becoming quite important nowadays, but, apart from efforts to teach a few foreign languages, no technical solution has yet been proposed to help people write in their own language and communicate with other people in their own languages. Personal MT could be such a solution.</Paragraph> <Paragraph position="33"> We strongly hope that many researchers will take interest in this new field of MT.</Paragraph> <Paragraph position="34"> Although speech synthesis of the input or output texts had been considered in the initial design of the project, and thought to be useful in some parts, it was J.I. Tsujii who pointed out to me how interesting it would be to use it in ambiguity resolution, provided we can reach the necessary quality. I am also grateful to J.Ph. Guilbaud, E. Blanc, and M. Embar for reviewing earlier drafts of this paper. While their help was very valuable for improving both content and form, the remaining deficiencies are of course mine.</Paragraph> </Section> </Paper>