<?xml version="1.0" standalone="yes"?>
<Paper uid="C86-1025">
  <Title>TRANSFER AND MT MODULARITY</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
TRANSFER AND MT MODULARITY
</SectionTitle>
    <Paragraph position="0"/>
  </Section>
  <Section position="2" start_page="0" end_page="115" type="abstr">
    <SectionTitle>
ABSTRACT
</SectionTitle>
    <Paragraph position="0"> The transfer components of typical second generation (G2) MT systems do not fully conform to the principles of G2 modularity, incorporating extensive target language information while failing to separate translation facts from linguistic theory.</Paragraph>
    <Paragraph position="1"> The exclusion from transfer of all non-contrastive information leads us to a system design in which the three major components operate in parallel, rather than in sequence. We also propose that MT systems be designed to allow translators to express their knowledge in natural metalanguage statements.</Paragraph>
    <Paragraph position="2"> I. Modularity: a Basic Principle of G2 Systems Modularity is a defining characteristic of second generation machine translation systems (hereafter G2 MT). G2 systems are claimed to be based on a model in which linguistic descriptions are clearly separated from the algorithms and programs that actually produce translations. Moreover, in this model, the linguistic facts that pertain solely to the source language (SL) are supposed to be clearly separated from the facts that pertain solely to the target language (TL), and from those facts that concern the lexical and structural contrasts between TL and SL. Such, at least, are the principles of G2 design, as set forth for example in Vauquois [1]. This conception of MT gives rise to systems composed of three distinct and successive phases: a monolingual analysis component, which produces a SL-dependent structural description (SD) of the input text; a transfer component, which maps that SD onto a TL-dependent SD; and a monolingual synthesis component, which transforms that SD into a TL output text.</Paragraph>
    <Paragraph position="3"> As pointed out by Kay [2], this classical G2 design offers a number of advantages. Ideally, it should allow the formal description of a given language to serve the needs of analysis and synthesis indifferently. As well, it should allow a given analysis or synthesis component to be coupled onto other MT systems of similar design to produce translations for other language pairs. And finally, because the algorithms are independent of the particular linguistic descriptions, they too should be reusable in other MT applications.</Paragraph>
    <Paragraph position="4"> II. Transfer in Typical G2 Systems Transfer in G2 systems like TAUM-AVIATION [3], ARIANE-78 [4] and METAL [5] is essentially a tree-transformation system, relating the SDs of two complete translation units (generally sentences).</Paragraph>
    <Paragraph position="5"> Lexical units (LUs) are not translated in isolation; rather, transfer rules typically test the structural environment of each SL LU and, after inserting the appropriate TL equivalent, may rearrange that structure to accord with contextual constraints imposed by the TL LU. Details of formalization aside, transfer rules in these systems will encode facts like the following:</Paragraph>
    <Paragraph position="7"> know connaître Such transfer rules are usually deterministic: each input tree (or subtree) is mapped onto one and only one output tree. Consequently, corresponding to each SL SD produced by the analysis component, the synthesis component will receive one and only one TL SD. In order for a correct TL sentence to be produced, this unique structure has to be correct in all respects.</Paragraph>
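    A minimal Python sketch of a deterministic transfer rule of the kind described above. The rule table is an illustrative assumption, not the original example: it supposes that know with a clausal complement maps to savoir and know with a noun-phrase complement maps to connaître. The names Node, TRANSFER_RULES and transfer, and the labels "S" and "NP", are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A toy structural description: one lexical unit and its complement category."""
    lexical_unit: str
    complement: str  # hypothetical labels: "S" (clause) or "NP" (noun phrase)


# Assumed rule table: (SL lexical unit, complement category) -> TL lexical unit.
# Each key has exactly one value, which is what makes transfer deterministic.
TRANSFER_RULES = {
    ("know", "S"): "savoir",
    ("know", "NP"): "connaître",
}


def transfer(node: Node) -> Node:
    """Map one SL node onto exactly one TL node, or fail if no rule applies."""
    key = (node.lexical_unit, node.complement)
    if key not in TRANSFER_RULES:
        raise KeyError(f"no transfer rule for {key}")
    return Node(TRANSFER_RULES[key], node.complement)
```

    In this sketch the choice between the two French verbs depends only on the complement category, which is a condition on the TL verbs themselves; the rule table therefore carries TL grammatical knowledge inside transfer.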
    <Paragraph position="8"> This conception makes it necessary for the transfer component to encode a lot of knowledge about TL grammar. For example, if the input tree required by synthesis is an Aspects-type deep structure, transfer will need to include the equivalent of the complete base component of the TL grammar, including lexical insertion mechanisms.</Paragraph>
    <Paragraph position="9"> A close examination of the transfer components of the three systems mentioned above shows that this is indeed the case. Rules such as (1a-b) clearly state facts that belong to a description of TL: namely, strict subcategorization conditions on the insertion of savoir and connaître. A sizable portion of the so-called transfer rules in these systems actually deals with TL strict subcategorization requirements.</Paragraph>
    <Paragraph position="10"> In addition to incorporating TL linguistic descriptions within transfer, these systems also make use of impoverished TL grammars for synthesis. For example, none of the systems mentioned accesses a full-fledged TL dictionary during synthesis. It has long been a truism in MT circles that synthesis is much easier than analysis. This is hardly surprising, when so much of the work of synthesis has been passed over to transfer. This move has at least two unfortunate consequences: (a) the burden placed on the translator/lexicographer is greatly increased; and (b) the translation relation is implemented in a highly directional fashion. Analysis and synthesis become totally different tasks, each standing in a different relationship to transfer.</Paragraph>
    <Paragraph position="11"> In order to remedy these defects, we favour a more rigorous adherence to the principles of G2 modularity: all information which is not strictly contrastive should be removed from the transfer component and returned to synthesis, where it belongs. This would allow G2 systems to fully benefit from the advantages of modularity that Kay has noted. It would also ease the burden on those who must write the transfer rules.</Paragraph>
    <Paragraph position="12"> On the other hand, such a move would make the transfer phase non-deterministic, and this may lead to severe efficiency problems. If, as in typical G2 systems, transfer and synthesis are applied in sequence, communication between the two components will be effected through tree structures representing complete translation units. The use of non-deterministic rules at transfer would mean that local translation ambiguities would then generalize to these complete units. For example, if transfer no longer selects between savoir and connaître for the translation of know, or between cheveux and poil for the translation of hair, then at least four different French SDs will be transmitted to the synthesis component for any SL SD containing both know and hair.</Paragraph>
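    The multiplication of local ambiguities described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' system: structural descriptions are reduced to flat lists of lexical units, and the candidate table and the names CANDIDATES, transfer_all and count_outputs are hypothetical.

```python
from itertools import product

# Assumed non-contrastive transfer lexicon: every TL candidate is kept,
# and the choice among them is left to synthesis.
CANDIDATES = {
    "know": ["savoir", "connaître"],
    "hair": ["cheveux", "poil"],
}


def transfer_all(sl_units):
    """Return every TL combination for a list of SL lexical units."""
    # Units without an entry are passed through unchanged (one candidate).
    options = [CANDIDATES.get(unit, [unit]) for unit in sl_units]
    return [list(combo) for combo in product(*options)]


def count_outputs(sl_units):
    """Number of complete TL structures handed to synthesis."""
    total = 1
    for unit in sl_units:
        total *= len(CANDIDATES.get(unit, [unit]))
    return total
```

    Under these assumptions, transfer_all(["know", "hair"]) yields four combinations, and each further two-way ambiguity in the same translation unit doubles the count. This is the efficiency concern raised for sequential transfer and synthesis.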
    <Paragraph position="13"> One possible solution to this problem is to have analysis, transfer and synthesis operate in parallel rather than in sequence, while providing the three components with the means to exchange partially specified tree structures at each stage of the processing. This is probably what Arnold et al. [6] have in mind when they suggest that transfer should be viewed as a relation between two generating devices.</Paragraph>
    <Paragraph position="14"> We have recently begun developing a system that seeks to implement such an approach. We have chosen</Paragraph>
  </Section>
</Paper>