File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/96/c96-2178_intro.xml

Size: 6,987 bytes

Last Modified: 2025-10-06 14:06:03

<?xml version="1.0" standalone="yes"?>
<Paper uid="C96-2178">
  <Title>Machine Translation Method Using Inductive Learning with Genetic Algorithms</Title>
  <Section position="3" start_page="0" end_page="1021" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> A practical, high-quality method of machine translation is important for the internationalization of Japanese society. Many studies have been carried out on machine translation. Rule-based machine translation (John and Harold, 1992) could not deal adequately with various linguistic phenomena because it uses only a limited set of rules.</Paragraph>
    <Paragraph position="1"> To resolve this problem, example-based machine translation methods (Sato and Nagao, 1990; Akama and Ichikawa, 1979; Stanfill and Waltz, 1986; Sumita et al., 1993) have recently been proposed. However, these methods require many translation examples to achieve practical, high-quality translation.</Paragraph>
    <Paragraph position="2"> The goal of our research is to design a computer system with the same capability of language and knowledge acquisition as human beings (Araki and Momouchi, 1994; Araki et al., 1995).</Paragraph>
    <Paragraph position="3"> In this paper, we propose a method of machine translation using inductive learning with genetic algorithms. Genetic algorithms (Goldberg, 1989) imitate the evolutionary process, which repeats generational replacement to adapt to the environment. The aims are to acquire a variety of high-quality translation rules from only a small amount of data and to produce high-quality translation results. The system is expected to evolve continuously toward higher learning and translation capability.</Paragraph>
    <Paragraph position="4"> In this paper, we describe a method of machine translation using inductive learning with genetic algorithms, and show through the results of evaluation experiments that genetic algorithms are effective for example-based machine translation.  2 Processes in the New Method 2.1 Outline of the Translation Method  Figure 1 shows the outline of the new translation method. In this paper, we describe the process of English-Japanese translation as one possible application of this method. First, a user inputs a source sentence in English. Second, in the translation process, the system produces several candidate translations using translation rules extracted in the learning process. Third, the user proofreads the translated sentences. Fourth, in the feedback process, the system determines the fitness value and performs the selection process of translation rules. In the learning process, new translation examples are automatically produced by crossover and mutation, and various translation rules are extracted from the translation examples by inductive learning. A translation example includes the source sentence and a translated sentence. There are two kinds of translation rules: those for sentences and those for words. The former are called sentence translation rules and the latter word translation rules. Repetition of the above-mentioned processes corresponds to generational replacement of the whole system, and the system continuously evolves toward higher-quality translation.</Paragraph>
    <Section position="1" start_page="1020" end_page="1020" type="sub_section">
      <SectionTitle>
2.2 Chromosome and gene
</SectionTitle>
      <Paragraph position="0"> As shown in Figure 2, a chromosome corresponds to a translation example, which consists of an English sentence and a Japanese sentence, and a gene corresponds to a word. In this paper, Japanese words are written in italics.</Paragraph>
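The chromosome and gene mapping described above can be pictured as a small data structure. The following Python sketch is only an illustration: the class name `TranslationExample`, the helper `make_example`, whitespace tokenization, and the romanized sample sentence are assumptions made here, not details taken from the paper or from Figure 2.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TranslationExample:
    """A chromosome: one English sentence paired with one Japanese sentence."""

    english: Tuple[str, ...]   # genes of the English side, one word per gene
    japanese: Tuple[str, ...]  # genes of the Japanese side, one word per gene


def make_example(english: str, japanese: str) -> TranslationExample:
    # Assumption: words are separated by whitespace. Real Japanese text
    # would need a word segmenter; romanized text is used to keep this simple.
    return TranslationExample(tuple(english.split()), tuple(japanese.split()))


# Hypothetical sentence pair, invented for illustration.
example = make_example("He likes tennis", "kare wa tenisu ga suki desu")
```

Storing the genes as immutable tuples keeps each example hashable, so duplicates can be detected with a set when new examples are generated.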
    </Section>
    <Section position="2" start_page="1020" end_page="1020" type="sub_section">
      <SectionTitle>
2.3 Feedback Process
</SectionTitle>
      <Paragraph position="0"> First, the system evaluates the translation results using the translated sentences which have been proofread. The system adds one to the correct translation frequency when the translation results have the same character strings as the proofread translation results, and adds one to the erroneous translation frequency when the translation results have character strings different from the proofread translation results. Second, the system determines the fitness value of the rules used in translation using these correct and erroneous translation frequencies. The fitness value is calculated by the fitness function as follows:</Paragraph>
      <Paragraph position="2"> $$\text{Fitness value} = \frac{\text{correct translation frequency}}{\text{number of uses}} \times 100 \qquad (1)$$ Third, the system performs the selection process using the fitness value. The conditions of the selection process are that the number of uses is over 5 and the fitness value is under 25%. These thresholds were determined by a preliminary experiment.</Paragraph>
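The feedback step can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions rather than the authors' implementation: the names `RuleStats`, `record`, `fitness`, and `is_selected_out` are hypothetical, the number of uses is assumed to equal the sum of the correct and erroneous frequencies, and a rule that meets both selection conditions is assumed to be the one discarded.

```python
from dataclasses import dataclass

USE_THRESHOLD = 5         # the rule must have been used more than 5 times
FITNESS_THRESHOLD = 25.0  # percent; a fitness value under this fails


@dataclass
class RuleStats:
    correct: int = 0    # correct translation frequency
    erroneous: int = 0  # erroneous translation frequency

    @property
    def uses(self) -> int:
        # Assumption: every use is counted as either correct or erroneous.
        return self.correct + self.erroneous


def record(stats: RuleStats, produced: str, proofread: str) -> None:
    """Update the frequencies by exact character-string comparison."""
    if produced == proofread:
        stats.correct += 1
    else:
        stats.erroneous += 1


def fitness(stats: RuleStats) -> float:
    """Equation (1): correct translation frequency / number of uses x 100."""
    if stats.uses == 0:
        return 0.0  # assumption: an unused rule has no fitness yet
    return stats.correct / stats.uses * 100


def is_selected_out(stats: RuleStats) -> bool:
    """True when both selection conditions hold (assumed to mean removal)."""
    return stats.uses > USE_THRESHOLD and FITNESS_THRESHOLD > fitness(stats)
```

For example, a rule with 1 correct and 5 erroneous uses has a fitness value of about 16.7% over 6 uses and meets both conditions, whereas the same ratio over only 4 uses does not, because the rule has not yet been used often enough to be judged.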
    </Section>
    <Section position="3" start_page="1020" end_page="1021" type="sub_section">
      <SectionTitle>
2.4 Learning Process
</SectionTitle>
      <Paragraph position="0"> In this process, new translation examples are automatically produced by crossover and mutation.</Paragraph>
      <Paragraph position="1"> In crossover, two translation examples which have common parts are selected. Crossover positions are the common parts, and one-point crossovers for these two translation examples are performed.</Paragraph>
      <Paragraph position="2"> These one-point crossovers use each common part of the English and Japanese sentences. Figure 3 shows examples of crossover. In Figure 3, &quot;likes&quot; is the common part in the two English sentences, and &quot;wa&quot; and &quot;ga suki desu&quot; are the common parts in the two Japanese sentences. Therefore, &quot;likes&quot; and &quot;wa&quot; are the crossover positions. One-point crossovers are performed, producing two translation examples. Next, one-point crossovers are performed for &quot;likes&quot; and &quot;ga suki desu&quot;. However, the translation examples which are produced have the same character strings as the source sentences, and therefore these translation examples are not entered into the dictionary. Translation examples are randomly changed by mutation, at a rate of 2%. New translation examples are also produced by replacing the words of translation examples with those of translation rules.</Paragraph>
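A one-point crossover at a common part can be sketched for a single pair of sentences as follows. This is an illustrative sketch only: Figure 3 is not reproduced here, so the function names `one_point_crossover` and `novel_offspring`, the choice to swap the words that follow the common part, and the romanized sample sentences are all assumptions. How the English-side and Japanese-side crossovers are paired into a new translation example is not modeled.

```python
from typing import List, Optional, Tuple


def one_point_crossover(
    a: List[str], b: List[str], common: List[str]
) -> Optional[Tuple[List[str], List[str]]]:
    """Swap the tails of a and b that follow the first occurrence of common."""

    def end_of(seq: List[str], sub: List[str]) -> Optional[int]:
        # Index just past the first occurrence of sub in seq, or None.
        for i in range(len(seq) - len(sub) + 1):
            if seq[i:i + len(sub)] == sub:
                return i + len(sub)
        return None

    cut_a, cut_b = end_of(a, common), end_of(b, common)
    if cut_a is None or cut_b is None:
        return None  # the two sentences do not share this part
    return a[:cut_a] + b[cut_b:], b[:cut_b] + a[cut_a:]


def novel_offspring(a: List[str], b: List[str], common: List[str]) -> List[List[str]]:
    """Keep only offspring whose word strings differ from both parents."""
    result = one_point_crossover(a, b, common)
    if result is None:
        return []
    return [child for child in result if child != a and child != b]


# Hypothetical Japanese sides of two translation examples.
ja_a = "kare wa tenisu ga suki desu".split()
ja_b = "kanojo wa ocha ga suki desu".split()
at_wa = novel_offspring(ja_a, ja_b, ["wa"])
at_ga_suki_desu = novel_offspring(ja_a, ja_b, ["ga", "suki", "desu"])
```

With these sample sentences, crossing at "wa" yields two new word strings, while crossing at "ga suki desu" only reproduces the parents and is filtered out, which mirrors the case in the text where offspring identical to the source sentences are not entered into the dictionary.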
      <Paragraph position="3"> The system extracts the common and different parts from the character strings of all translation examples. These common and different parts are used as translation rules.</Paragraph>
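The idea of separating common and different parts can be illustrated for one pair of sentences. The sketch below uses Python's standard `difflib.SequenceMatcher` as a stand-in alignment technique; the paper does not say how the comparison is carried out, so this substitution, the function name `split_common_and_different`, word-level comparison, and the sample sentences are all assumptions.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def split_common_and_different(
    a: List[str], b: List[str]
) -> Tuple[List[List[str]], List[Tuple[List[str], List[str]]]]:
    """Return (common parts, different parts) of two word sequences."""
    common: List[List[str]] = []
    different: List[Tuple[List[str], List[str]]] = []
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            common.append(a[i1:i2])  # words shared by both sentences
        else:
            # Words that differ, kept as an (a-side, b-side) pair.
            different.append((a[i1:i2], b[j1:j2]))
    return common, different


# Hypothetical English sides of two translation examples.
common, different = split_common_and_different(
    "He likes tennis".split(), "She likes tea".split()
)
```

Here the shared word would serve as the skeleton of a sentence-level rule and the differing words as candidates for word-level rules, under the assumptions above.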
    </Section>
    <Section position="4" start_page="1021" end_page="1021" type="sub_section">
      <SectionTitle>
2.5 Translation Process
</SectionTitle>
      <Paragraph position="0"> In this process, the system produces several candidate translations for a source sentence using the extracted translation rules. This process also uses a genetic algorithm. The details of this process are as follows:</Paragraph>
    </Section>
  </Section>
</Paper>