<?xml version="1.0" standalone="yes"?> <Paper uid="C88-2160"> <Title>Interactive Translation : a new approach</Title> <Section position="3" start_page="0" end_page="785" type="intro"> <SectionTitle> A. THE PROBLEM Goals </SectionTitle> <Paragraph position="0"> The main goal here is to resolve correctly, in every case, the ambiguities arising in natural language analysis. To date, no existing automatic MT system can accomplish this. Choosing the sentence structure that most accurately reflects the author's intended message therefore remains an important unsolved problem.</Paragraph> <Paragraph position="1"> Classical machine translation systems use heuristics based on statistical regularities in the use of language. Interactive systems ask questions directed at a specialist of the system (like ITS of BYU [Melby &amp; alii 80]) and/or a specialist of the domain (like the TITUS system of Institut Textile de France [Ducrot 82]). There, the interaction takes place purely at the syntactic level, much as a syntax-directed editor for a programming language is used by a specialist of both the system and the language.</Paragraph> <Paragraph position="2"> Models or projects using extralinguistic knowledge will not be able to resolve ambiguities in every case: a document is generally supposed to provide some piece of new information that may not be coded in the knowledge base.</Paragraph> <Paragraph position="3"> The use of learning procedures is at present not effective.</Paragraph> <Paragraph position="4"> None of these approaches can resolve ambiguities correctly in every case. The problem is basically a matter of interpretation: only the author of the document himself can tell what he intended to say.
Nevertheless, the author is not supposed to have any knowledge of the target language and therefore should not be involved during the transfer phase.</Paragraph> <Paragraph position="5"> In the case of interaction with the author, two problems arise: 1. The author is supposed to write his document and not to solve weird linguistic problems.</Paragraph> <Paragraph position="6"> 2. In all interactive systems, the system asks a specialist questions based on knowledge of the underlying linguistic theory. For interacting with the author, this approach is to be rejected: see examples of interaction with ITS [Melby &amp; alii 80] or even Tomita's system [Tomita 84].</Paragraph> <Paragraph position="7"> A proposal To solve these problems, we propose: - to integrate the interactive system as one function of a word processor, the interaction being initiated by the author; - to explain an ambiguity by presenting a set of paraphrases generated from the set of parse trees of the ambiguous sentence; - to explain an error (of spelling or grammar) by presenting a &quot;reasonable&quot; correction and a comment on the error. This point will not be treated in this paper. See for example [Jensen &amp; Heidorn 83, Zajac 86b].</Paragraph> <Paragraph position="8"> The integration in a word processor allows the use of a &quot;controlled language&quot; where checking and correction are done during the creation or modification of a document. This can be viewed as an extension of the capabilities of a simple spellchecker, in the form of a toolbox of linguistic aids for the author, checking the spelling, the terminology, the grammar and the style.
For the translation of technical material, the use of a normative grammar, imposing precise limitations on terminology and syntax, will entail more clarity and concision in expression, as argued by [Elliston 79] and [Ruffino 82], and will offer a convenient tool for normalizing documentation.</Paragraph> <Paragraph position="9"> In the cases where a correct interpretation uses domain knowledge interactively, it will be possible to draw a clear line between purely linguistic knowledge, to be coded in the analyser, and extralinguistic knowledge (semantics of the domain). As a matter of fact, it is not always justified to integrate specific semantic categories in the grammar, as in the METEO system for example. This separation will allow us to enlarge the domain of applicability of a machine translation system, which could, for example, be extended to a personal translation system [Tomita 84]; this could be of interest when no translation service is available or when the quantity of translation does not justify using the services of a translator [Kay 82].</Paragraph> <Paragraph position="10"> GETA [Vauquois 78]. There are four main levels of linguistic interpretation: 1. categories: morphosyntactic categories (gender, number, class of verb, ...), semantic categories (abstract, concrete, ...), actualisation categories (perfective, imperfective, ...), syntactic categories (noun, verb, valencies, ...) and syntactic classes (sentence, verb phrase, ...). 2. syntactic functions: subject, object1, object2, attribute of the subject, attribute of the object, complement of noun or adjective, determiner, circumstantial complement, ...</Paragraph> <Paragraph position="11"> 3. logical relations: predicate-argument relations.</Paragraph> <Paragraph position="12"> 4. semantic relations: causality, consequence, qualifier, qualified, ...
The geometry of the tree corresponds to a phrase structure: the labels of inner nodes are syntactic classes, the labels of leaves are lexical units. Additional information is coded in the attributes of each node.</Paragraph> <Paragraph position="13"> The morphological, syntactic and semantic categories are computed by a morphological analyser written in ATEF. The output of the morphological analyser is the input of a structural analyser, which produces multiple outputs in ambiguous cases.</Paragraph> <Paragraph position="14"> Architecture of the interactive translation system A classical machine translation process in the ARIANE system [Boitet &amp; alii 82, 85] uses a morphological analysis phase (MA) and an automatic structural analysis phase (SA, on the left of the figure). The latter phase is replaced with an interactive phase (in the middle of the figure). Disambiguation and correction dialogues make calls to paraphrasing and correcting modules. The remainder of the process uses classical automatic transfer steps (LT and ST) and generation steps (SG and MG). In the figure, existing modules are drawn in bold outline, modules for which only a model exists are in normal outline, and specified modules are shaded grey.</Paragraph> </Section> </Paper>
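<!--
Illustrative sketch, not part of the original paper. The proposal above explains an ambiguity by presenting one paraphrase per parse tree and letting the author pick the intended reading. The Python below is a minimal sketch of that dialogue under stated assumptions: the names Node, paraphrase and disambiguate, the bracketed rendering that stands in for a real paraphrase, and the example phrase are all hypothetical and are not taken from ARIANE or from the GETA tools.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """One node of an analysis tree (hypothetical layout, not the GETA format)."""
    label: str  # syntactic class on inner nodes, lexical unit on leaves
    children: List["Node"] = field(default_factory=list)
    attributes: Dict[str, str] = field(default_factory=dict)  # extra information

    def is_leaf(self) -> bool:
        return not self.children


def paraphrase(tree: Node) -> str:
    """Toy stand-in for a paraphraser: render the grouping with brackets."""
    if tree.is_leaf():
        return tree.label
    return "[" + " ".join(paraphrase(child) for child in tree.children) + "]"


def disambiguate(trees: List[Node], choose: Callable[[List[str]], int]) -> Node:
    """Return the parse tree that matches the author's intended reading.

    `choose` stands in for the dialogue: it receives one paraphrase per
    parse tree and returns the index selected by the author.
    """
    if not trees:
        raise ValueError("the analyser produced no parse tree")
    if len(trees) == 1:
        return trees[0]  # unambiguous sentence: no question is asked
    options = [paraphrase(tree) for tree in trees]
    index = choose(options)
    if not 0 <= index < len(trees):
        raise IndexError("choice out of range")
    return trees[index]


# Assumed example: two readings of "old men and women".
narrow = Node("NP", [Node("NP", [Node("old"), Node("men")]),
                     Node("and"), Node("women")])
wide = Node("NP", [Node("old"),
                   Node("NP", [Node("men"), Node("and"), Node("women")])])

# A fixed answer replaces the interactive dialogue in this sketch.
chosen = disambiguate([narrow, wide], choose=lambda options: 1)
```

The chooser is passed in as a function so that the same code could be driven by a dialogue inside a word processor or, as here, by a fixed answer.
-->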