<?xml version="1.0" standalone="yes"?>
<Paper uid="J04-2004">
  <Title>Machine Translation with Inferred Stochastic Finite-State Transducers (© 2004 Association for Computational Linguistics)</Title>
  <Section position="7" start_page="221" end_page="223" type="concl">
    <SectionTitle>
6. Conclusions
</SectionTitle>
    <Paragraph position="0"> A method has been proposed in this article for inferring stochastic finite-state transducers from stochastic regular grammars. This method, GIATI, allowed us to achieve good results in several language translation tasks with different levels of difficulty. It works better than other finite-state techniques when the training data are scarce and achieves similar results with sufficient training data.</Paragraph>
    <Paragraph position="1"> The GIATI approach produces transducers that generalize the information provided by the (aligned) training pairs. Thanks to the use of n-grams as the core learning procedure, a wide range of generalization degrees can be achieved. It is well known that a 1-gram entails maximum generalization, allowing any (extended) word to follow any other. On the other hand, for sufficiently large m, a (nonsmoothed) m-gram is just an exact representation of the training strings (of extended words, in our case). Such a representation can thus be considered a simple &quot;translation memory&quot; that just contains the (aligned) training pairs. For any new source sentence, this &quot;memory&quot; can be searched easily and quite efficiently through finite-state parsing. For intermediate values of n, 1 &lt; n &lt; m, GIATI obtains degrees of generalization that increase as n decreases. As in the case of language modeling, the generalization degree (n) has to be tuned so as to take maximum advantage of the available training data. As training pairs become scarce, more generalization is needed to allow GIATI to adequately accept new test sentences.</Paragraph>
    <Paragraph position="2"> This behavior can be clearly observed throughout the results presented in this article.</Paragraph>
    <Paragraph position="3"> Another feature of the GIATI approach is the use of smoothed n-grams of extended words as the basic mechanism for producing smoothed transducers. The combination of this feature with the intrinsic generalization provided by the n-gram modeling itself has proved well suited to dealing with the problem of unseen source (sub)strings.</Paragraph>
    <Paragraph position="4"> Obviously, the overall quality of the generalizations achieved by GIATI relies strongly on the quality of the statistical alignments used and on the way word order is preserved in the source-target strings of each training pair. Taking into account the finite-state nature of GIATI transducers, certain heuristics have been needed in order to avoid a direct use of too-long-distance alignments (L  in Section 4.2). This has proved adequate for language pairs whose (syntactic) structures are not too different, and more so if the domains are limited. As we relax these restrictions, we might have to relax the not-too-long-distance assumption correspondingly. In this respect, the bilingual word reordering ideas of Vilar, Vidal, and Amengual (1996), Vidal (1997), and Bangalore and Ricardi (2000a) may certainly prove useful in future developments.</Paragraph>
    <Paragraph position="5"> Computational Linguistics Volume 30, Number 2</Paragraph>
    <Paragraph position="6"> Table 9. Examples of typical errors produced by a 6-gram-based GIATI transducer in the EuTrans-I task. For each Spanish source sentence, the corresponding target reference and GIATI translations are shown in successive lines.</Paragraph>
    <Paragraph position="7"> 1. Source: ¿ les importaría bajarnos nuestras bolsas a recepción ?
      Reference: would you mind sending our bags down to reception ?
      GIATI: would you mind sending down our bags to reception ?</Paragraph>
    <Paragraph position="8"> 2. Source: explique la cuenta de la habitación cuatro dieciséis .
      Reference: explain the bill for room number four one six .
      GIATI: explain the bill for room number four sixteenth .</Paragraph>
    <Paragraph position="9"> 3. Source: ¿ cuánto vale una habitación doble para cinco días incluyendo desayuno ?
      Reference: how much is a double room including breakfast for five days ?
      GIATI: how much is a double room for five days including breakfast ?</Paragraph>
    <Paragraph position="10"> 4. Source: por favor , deseo una habitación individual para esta semana .
      Reference: I want a single room for this week , please .
      GIATI: I want a single room for this week .</Paragraph>
    <Paragraph position="11"> 5. Source: ¿ le importaría despertarnos a las cinco ?
      Reference: would you mind waking us up at five ?
      GIATI: would you mind waking us up at five , please ?</Paragraph>
    <Paragraph position="12"> 6. Source: ¿ hay televisión , aire acondicionado y caja fuerte en las habitaciones ?
      Reference: are there a tv , air conditioning and a safe in the rooms ?
      GIATI: is there a tv , air conditioning and a safe in the rooms ?</Paragraph>
    <Paragraph position="13"> 7. Source: ¿ tiene habitaciones libres con teléfono ?
      Reference: do you have any rooms with a telephone available ?
      GIATI: do you have any rooms with a telephone ?</Paragraph>
    <Paragraph position="14"> 8. Source: ¿ querría llamar a mi taxi ?
      Reference: would you call my taxi , please ?
      GIATI: would you call my taxi for me , please ?</Paragraph>
    <Paragraph position="15"> 9. Source: hemos de marcharnos el veintiséis de marzo por la tarde .
      Reference: we should leave on March the twenty-sixth in the afternoon .
      GIATI: we should leave on March the twenty-seventh in the afternoon</Paragraph>
    <Paragraph position="16"> 10. Source: por favor , ¿ nos podría dar usted la llave de la ochocientos ochenta y uno ?
      Reference: could you give us the key to room number eight eight one , please ?
      GIATI: could you give us the key to room number eight oh eight one , please ?</Paragraph>
    <Paragraph position="17"> 11. Source: quiero cambiarme de habitación .
      Reference: I want to change rooms .
      GIATI: I want to move .</Paragraph>
    <Paragraph position="18"> 12. Source: ¿ tiene televisión nuestra habitación ?
      Reference: does our room have a tv ?
      GIATI: does our room ?</Paragraph>
    <Paragraph position="19"> Casacuberta and Vidal Translation with Finite-State Transducers</Paragraph>
  </Section>
</Paper>