<?xml version="1.0" standalone="yes"?> <Paper uid="A88-1022"> <Title>A Preliminary Linguistic Framework for Eurotra in : Proceedings of the Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages</Title> <Section position="4" start_page="160" end_page="161" type="intro"> <SectionTitle> 2. BASIC DESIGN DECISIONS UNDERLYING THE EUROTRA FRAMEWORK </SectionTitle> <Paragraph position="0"> The research orientation of Eurotra is motivated by the intrinsic experimental nature of MMT. On the other hand, we are committed to producing a working prototype MMT system by the end of the programme, which should serve as a basis for the development by industry of the next generation of MT systems. These facts, together with the previously mentioned size and decentralization of Eurotra, have determined the design of the system, be it the linguistic meta-language or the underlying software.</Paragraph> <Paragraph position="1"> The Eurotra framework has been described in detail in previous papers ([1], [2], [12]). Here we will only briefly summarize its distinctive characteristics. The central problem in multilingual, transfer-based MT derives from the fact that the number of transfer components of such a system grows quadratically with the number of languages covered (n(n-1) components for n languages, hence 72 transfer components for 9 languages). In order to be feasible at all, such a system requires that the transfer components be kept small, in principle limited to the lexical component. Our problem can be formulated as follows: what is the nature of the representation of a text guaranteeing translational adequacy (&quot;good quality translation&quot;) while allowing simple transfer? 
The answer to this question is not known, and our central effort in defining the Eurotra framework was to create an experimental environment in which research could be carried out in order to find possible answers.</Paragraph> <Paragraph position="2"> This led us to the design of a system based on multiple, successive representations and mappings between them, leaving the exact definition and the number and nature of such levels to be determined experimentally.</Paragraph> <Paragraph position="3"> Generators. The set of legal representations for each level is defined intensionally by a generator (in the classical sense) consisting of a set of augmented context-free (CF) structure building rules, or B-rules, complemented with a set of feature percolation principles, feature rules or F-rules, and a set of filters, called killer rules or K-rules, to control overgeneration.</Paragraph> <Paragraph position="4"> These rule systems have a declarative operational semantics based on unification, not unlike other current grammar formalisms ([3], [4], [7], [10], [11]).</Paragraph> <Paragraph position="5"> But while some of these formalisms explicitly blur the distinction between structural information and feature information ([3], [7]), we decided to keep them strictly separated, like the formalisms described in [4] and [10], while allowing only one level of recursion in features (set-valued features) for reasons of simplicity, formally not unlike the SUBCAT feature of [10].</Paragraph> <Paragraph position="6"> This grammar formalism provides us with an easily understandable, powerful and uniform language and, as a consequence, the training of new members of the Eurotra groups poses no problems, an essential asset in a project with some 160 participants. 
Furthermore, changes of the virtual machine become acceptable as long as the rule type remains essentially the same, because CF-based grammars are normally easy to revise.</Paragraph> <Paragraph position="7"> The current formalism includes devices such as the Kleene star, optionality marker, alternation and negation.</Paragraph> <Paragraph position="8"> Figure 1 shows examples of the type of rules currently implemented, with a slightly simplified syntax.</Paragraph> <Paragraph position="9"> Translators. A representation at a given level is mapped onto a representation at the adjacent level by a translation rule (T-rule). [Figure 1: Examples of rules. The last rule expresses the fact that sentences cannot be modified by of-PP.]</Paragraph> <Paragraph position="11"> The translation rules must satisfy two criteria. First, they must be simple: it must be easier to relate two adjacent levels of representation than to relate text to the semantic representation input to transfer. Furthermore, a source level representation is mapped directly onto a (set of) target representations without intermediate steps or representations; we call this the one-shot translation principle. Second, they must be compositional: the translation of an object must be a (simple) function of the translation of its parts. This is in order to, on the one hand, make the writing of translators modular and, on the other hand, make the relation between two representational objects established by a set of translation rules understandable for the linguist.</Paragraph> <Paragraph position="12"> Figure 2 shows two examples of translation rules as currently implemented. The first one is extracted from the English to German transfer module, while the second one is part of the Italian analysis component.</Paragraph> <Paragraph position="14"/> </Section> </Paper>