File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/92/c92-2097_metho.xml
Size: 26,043 bytes
Last Modified: 2025-10-06 14:12:59
<?xml version="1.0" standalone="yes"?> <Paper uid="C92-2097"> <Title>Cooperation between Transfer and Analysis in Example-Based Framework</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Transfer-Driven Machine Translation </SectionTitle> <Paragraph position="0"> TDMT performs efficient and robust spoken-language translation using various kinds of strategies in order to treat diverse input. Its characteristics are explained in the following sub-sections.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.1 Transfer-centered cooperation mechanism </SectionTitle> <Paragraph position="0"> Translation is essentially converting a source language expression into a target language expression.</Paragraph> <Paragraph position="1"> In TDMT, transfer knowledge consists of various levels of bilingual information. It is the primary knowledge used to solve translation problems. The transfer module retrieves the necessary transfer knowledge, ranging from global units such as sentence structures to local units such as words. The retrieval and application of transfer knowledge are flexibly controlled depending on the knowledge necessary to translate the input. Basically, translation is performed by using transfer knowledge. The transfer module utilizes analysis knowledge (syntactic/semantic information), which helps to apply transfer knowledge to some part of the input, and generation and context knowledge are utilized for producing a correct translation result. In other words, TDMT produces translation results by utilizing these different kinds of knowledge cooperatively, centering on transfer, and achieves efficient translation according to the nature of the input.</Paragraph> </Section> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.2 Utilization of example-based framework </SectionTitle> <Paragraph position="0"> Transfer knowledge is the basic data which is used for totally controlling the translation process.</Paragraph> <Paragraph position="1"> Most of the transfer knowledge in TDMT is described in the example-based framework. An example-based framework is useful for consistently describing transfer knowledge. The essence of the example-based framework is the distance calculation. This framework achieves a best match based on the distance between the input and the provided examples, and selects the most plausible target expression from many candidates. The distance is calculated quickly because of its simple mechanism. By providing examples, various kinds and levels of knowledge can be described in the example-based framework.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.3 Multi-level knowledge </SectionTitle> <Paragraph position="0"> TDMT provides multi-level transfer knowledge, each level of which corresponds to a translation strategy.
In the transfer knowledge of the TDMT prototype system, there is string-, pattern-, and grammar-level knowledge.</Paragraph> <Paragraph position="1"> TDMT achieves efficient translation by utilizing multi-level knowledge effectively according to the nature of the input.</Paragraph> <Paragraph position="2"> Some conventional machine translation systems also provide multiple levels of transfer knowledge for idioms, syntax, semantics, and so on, and try to apply these levels of knowledge in a fixed order to cover diverse input \[Ikehara et al. 87\]. However, this method proceeds with the analysis for deciding which level of knowledge should be applied to any given input sentence in a fixed order, placing a heavy load on the analysis module. Also, the knowledge description is rather more complicated than that of the example-based framework. Therefore, the translation of a simple sentence is not always quick, because the system tries to cover all translation strategies.</Paragraph> </Section> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 Example-based Transfer </SectionTitle> <Paragraph position="0"> TDMT utilizes distance calculation to determine the most plausible target expression and structure in transfer.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.1 Word distance </SectionTitle> <Paragraph position="0"> We adopt the distance calculation method of Example-Based Machine Translation (EBMT) \[Sumita and Iida 91\]. The distance between words is defined as the closeness of semantic attributes in a thesaurus. Words have certain thesaurus codes, which correspond to particular semantic attributes. The distance between the semantic attributes is determined according to the relationship of their positions in the hierarchy of the thesaurus, and varies between 0 and 1 (Fig. 1). The distance between semantic attributes A and B is expressed as d(A, B). Provided that the words X and Y have the semantic attributes A and B, respectively, the distance between X and Y, d(X, Y), is equal to d(A, B).</Paragraph> <Paragraph position="1"> (Figure 1: Distance between thesaurus codes.) The hierarchy of the thesaurus that we use is in accordance with the thesaurus of everyday Japanese \[Ohno and Hamanishi 84\], and consists of four layers. When two codes can be abstracted in the k-th layer from the bottom, the distance k/3 (0 ≤ k ≤ 3) is assigned. The value 0 means that the two codes belong to exactly the same category, and 1 means that they are unrelated. The attributes &quot;writing&quot; and &quot;book&quot; are abstracted by the immediately higher attribute &quot;document&quot;, and the distance is given as 1/3. Thus, the word &quot;ronbun{technical paper}&quot;, which has the thesaurus code &quot;writing&quot;, and &quot;yokoushuu{proceedings}&quot;, which has the thesaurus code &quot;book&quot;, are assigned a distance of 1/3.</Paragraph> </Section>
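<Paragraph> The layer-based word distance described above can be illustrated with a minimal sketch (Python is used purely for illustration and is not part of the TDMT implementation; the thesaurus code paths below are hypothetical stand-ins for the actual four-layer codes of \[Ohno and Hamanishi 84\]):

# Minimal illustrative sketch of the layer-based word distance (Section 3.1).
# Thesaurus codes are modeled as hypothetical category paths in a four-layer
# thesaurus; they are stand-ins, not the actual Ohno and Hamanishi codes.
THESAURUS_PATH = {
    "writing": ("abstract-relation", "document", "writing"),
    "book":    ("abstract-relation", "document", "book"),
    "number":  ("abstract-relation", "quantity", "number"),
}
WORD_CODE = {
    "ronbun": "writing",       # technical paper
    "yokoushuu": "book",       # proceedings
    "bangou": "number",        # number
}

def code_distance(a, b):
    """d(A, B) = k/3, where k is the number of bottom layers that must be
    stripped before the two codes fall under the same abstracted category."""
    pa, pb = THESAURUS_PATH[a], THESAURUS_PATH[b]
    for k in range(4):                       # k = 0, 1, 2, 3
        if pa[: len(pa) - k] == pb[: len(pb) - k] and pa[: len(pa) - k]:
            return k / 3.0
    return 1.0                               # unrelated codes

def word_distance(x, y):
    """d(X, Y) equals d(A, B) for the semantic attributes A, B of the words."""
    return code_distance(WORD_CODE[x], WORD_CODE[y])

# "writing" and "book" are abstracted by "document" one layer up, so the
# distance between "ronbun" and "yokoushuu" is 1/3.
assert word_distance("ronbun", "yokoushuu") == 1 / 3
</Paragraph>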
<Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.2 Description of Transfer Knowledge </SectionTitle> <Paragraph position="0"> Transfer knowledge describes the correspondence between source language expressions (SE) and target language expressions (TE) in certain meaningful units, preserving the translational equivalence \[Tsujii and Fujita 91\]. The condition under which a TE is chosen as a translation result of an SE is associated with the TE. Transfer knowledge in an example-based framework is described as follows:</Paragraph> <Paragraph position="2"> SE => TE1 (E11, E12, ...), TE2 (E21, E22, ...), ..., TEn (En1, En2, ...) Each TE has several examples as conditions. Eij means the j-th example of TEi. The input is the SE's environment, and the most appropriate TE is selected according to the calculated distance between the input and the examples. The input and the examples each comprise a set of words.</Paragraph> <Paragraph position="3"> Let us suppose that an input I and each example Eij consist of t elements, as follows: I = (I1, ..., It), Eij = (Eij1, ..., Eijt).</Paragraph> <Paragraph position="5"> Then the distance between I and Eij is calculated as follows: d(I, Eij) = Σ_{k=1..t} ( d(Ik, Eijk) × Wk ).</Paragraph> <Paragraph position="7"> The attribute weight Wk expresses the importance of the k-th element in the translation. The distance from the input is calculated for all examples. Then the example whose distance to the input is smallest is detected, and the TE which has that example is selected. When Eij is closest to I, TEi is selected as the most plausible TE. The enrichment of examples increases the accuracy of determining the TE, because the conditions become more detailed. Further, even if there is only one TE, the application of the transfer knowledge is rejected when there is no example close to the input.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.3 Wide application of distance calculation </SectionTitle> <Paragraph position="0"> Distance calculation is used to determine which TE has the example that is closest to the input, and can be used for expressions at various levels of abstraction depending on how the input words are provided.</Paragraph> <Paragraph position="1"> Various levels of knowledge can be provided by the wide application of distance calculation. TDMT achieves efficient translation by utilizing multi-level knowledge effectively.</Paragraph> <Paragraph position="2"> In the transfer knowledge of the TDMT prototype system, string-, pattern-, and grammar-level knowledge, the latter two of which can be described easily in an example-based framework, are now adopted. String-level knowledge is the most concrete, while grammar-level knowledge is the most abstract.</Paragraph> <Paragraph position="3"> Since this kind of knowledge has conditions outside the SE, cooperation with modules such as the context module is sometimes necessary.</Paragraph> <Paragraph position="4"> In some cases the conditions can be described by examples of the most closely related words with which the SE is used, as follows: sochira => this ((desu {be})...), you ((okuru {send})...), it ((miru {see})...) Applying this knowledge, &quot;you&quot; is selected as the word corresponding to &quot;sochira&quot; in &quot;sochira ni{particle} tsutaeru&quot; because of the small distance between &quot;tsutaeru{convey}&quot; and &quot;okuru{send}&quot;. Pattern-level transfer knowledge has variables. The binding words of the variables are regarded as the input. For example, &quot;X o o-negaishimasu&quot; {X particle will-ask-for} has a variable. Suppose that it is translated into two kinds of English expressions, as in the examples below:</Paragraph> <Paragraph position="5"> X o o-negaishimasu => may I speak to X' ((jimukyoku{office}), ...), please give me X' ((bangou{number}), ...) (Here A' denotes the transferred expression of A.) In the translation of &quot;X o o-negaishimasu&quot;, the TE is determined by the calculation below: if Min(d((X), (jimukyoku)), ...) is smaller than Min(d((X), (bangou)), ...), then the TE is &quot;may I speak to X'&quot;, else the TE is &quot;please give me X'&quot;.</Paragraph>
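<Paragraph> This best-match selection can be sketched as follows (an illustrative sketch only, not the TDMT implementation: the pairwise word distances are hypothetical stand-ins for the thesaurus-based distance of Section 3.1, the attribute weights Wk are taken to be 1, and the two input bindings are those of the example sentences discussed next):

# Illustrative sketch of example-based TE selection for the pattern
# "X o o-negaishimasu" (Section 3.3).  The pairwise distances below are
# hypothetical stand-ins for the thesaurus-based distance; weights are 1.
TOY_DISTANCE = {
    ("jinjika", "jimukyoku"): 1 / 3,   # personnel section ~ office
    ("jinjika", "bangou"): 1.0,
    ("daimei", "jimukyoku"): 1.0,
    ("daimei", "bangou"): 1 / 3,       # title ~ number
}

def d(x, y):
    if x == y:
        return 0.0
    return TOY_DISTANCE.get((x, y), TOY_DISTANCE.get((y, x), 1.0))

# Transfer knowledge: each TE carries a list of examples (bindings of X).
TRANSFER_KNOWLEDGE = {
    "may I speak to X'": [("jimukyoku",)],
    "please give me X'": [("bangou",)],
}

def example_distance(input_words, example_words):
    """d(I, Eij): sum of element distances (uniform attribute weights)."""
    return sum(d(i, e) for i, e in zip(input_words, example_words))

def select_te(input_words):
    """Pick the TE whose closest example is nearest to the input."""
    return min(
        TRANSFER_KNOWLEDGE,
        key=lambda te: min(example_distance(input_words, ex)
                           for ex in TRANSFER_KNOWLEDGE[te]),
    )

print(select_te(("jinjika",)))   # -> may I speak to X'
print(select_te(("daimei",)))    # -> please give me X'
</Paragraph>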
<Paragraph position="6"> The following two sentences have the pattern &quot;X o o-negaishimasu&quot;: (1) &quot;jinjika{personnel section} o o-negaishimasu.&quot; (2) &quot;daimei {title} o o-negaishimasu.&quot; The first sentence selects &quot;may I speak to X'&quot; because (jinjika) is close to (jimukyoku). The second sentence selects &quot;please give me X'&quot; because (daimei) is close to (bangou). Thus, we get the following translations: (1') &quot;may I speak to the personnel section.&quot; (2') &quot;please give me the title.&quot; 3.3.3 Grammar-level transfer knowledge Grammar-level transfer knowledge is expressed in terms of grammatical categories. The examples consist of sets of words which are concrete instances of each category. The following transfer knowledge involves sets of three common nouns (CNs): CN1 CN2 CN3 => CN3' of CN1' ((&quot;kaigi, kaisai, kikan {conference, opening, time}&quot;), ...), CN2' CN3' for CN1' ((&quot;sanka, moushikomi, youshi {participation, application, form}&quot;), ...) This transfer knowledge allows the following translations.</Paragraph> <Paragraph position="7"> kenkyukai kaisai kikan {workshop, opening, time} -> the time of the workshop happyou moshikomi youshi {presentation, application, form} -> the application form for presentation The above translations select &quot;CN3' of CN1'&quot; and &quot;CN2' CN3' for CN1'&quot; as the most plausible TEs, as the result of distance calculations.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.4 Disambiguation by total distance </SectionTitle> <Paragraph position="0"> When there are several ways to apply transfer knowledge to the input sentence, structural ambiguity may occur. In such cases, the most appropriate structure is selected on the basis of total distance. The least total distance implies that the chosen structure is the most suitable input structure. For example, when the pattern &quot;X no Y&quot; is applied to the sentence &quot;kaigi no touroku hi no waribiki {conference, particle, registration, fee, particle, discount}&quot;, there are two possible structures: (1) kaigi no (touroku hi no waribiki) (2) (kaigi no touroku hi) no waribiki The pattern &quot;X no Y&quot; has various TEs, such as &quot;Y' of X'&quot; and &quot;Y' for X'&quot;.</Paragraph> <Paragraph position="2"> The respective TE tree representations constructed from structures (1) and (2) are shown in Figs. 2 and 3.</Paragraph> <Paragraph position="3"> The structure of (1) transfers to &quot;Y' of X'&quot; with the distance value of 0.50 and &quot;Y' of X'&quot; with the distance value of 0.17, and generates (1') with a total distance value of 0.67. In structure (2), &quot;Y' of X'&quot; with the distance value of 0.17 and &quot;Y' for X'&quot; with the distance value of 0.17 generate (2') with a total distance value of 0.34. The latter result is selected because it has the least total distance value.</Paragraph> <Paragraph position="4"> (1') &quot;discount of the registration fee of the conference&quot; (2') &quot;discount of registration fee for the conference&quot; (Figures 2 and 3: TE tree representations for structures (1) and (2).)</Paragraph> </Section> </Section>
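<Paragraph> The selection between the two structures above can be sketched as follows (illustrative only: the per-application distance values are those quoted in the text, and the way each value is obtained from the examples of &quot;X no Y&quot; is abstracted away):

# Illustrative sketch of disambiguation by total distance (Section 3.4) for
# "kaigi no touroku hi no waribiki".  The per-application distances are the
# values quoted in the text; their derivation from examples is omitted here.
CANDIDATE_STRUCTURES = {
    # structure (1): kaigi no (touroku hi no waribiki)
    "(1') discount of the registration fee of the conference": [0.50, 0.17],
    # structure (2): (kaigi no touroku hi) no waribiki
    "(2') discount of registration fee for the conference": [0.17, 0.17],
}

def best_structure(candidates):
    """Select the structure whose transfer-knowledge applications give the
    least total distance."""
    totals = {s: sum(ds) for s, ds in candidates.items()}
    best = min(totals, key=totals.get)
    return best, round(totals[best], 2)

print(best_structure(CANDIDATE_STRUCTURES))
# -> ("(2') discount of registration fee for the conference", 0.34)
</Paragraph>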
<Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 Example-based Analysis </SectionTitle> <Paragraph position="0"> For some structurally complex sentences, translations cannot be performed by applying only transfer knowledge. In such cases, analysis knowledge is also required. The analysis module applies analysis knowledge and supplies the resulting information to the transfer module, which then applies transfer knowledge on the basis of that information. When no analysis knowledge is necessary for translation, the application of transfer knowledge alone produces the translation result. The analysis described in this paper is not the understanding of structure and meaning on the basis of a parsing of the input sentence according to grammar rules, but rather the extraction of the information required to apply transfer knowledge and to produce the correct translation from the input sentence.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.1 Description of analysis knowledge </SectionTitle> <Paragraph position="0"> Analysis knowledge is described by examples in the same way as transfer knowledge, as follows:</Paragraph> <Paragraph position="2"> SE => SE1' (E11, E12, ...), SE2' (E21, E22, ...), ..., SEn' (En1, En2, ...) Although the form of knowledge description is virtually the same, transfer knowledge descriptions map onto TEs, whereas analysis knowledge descriptions map onto revised SEs.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.2 Cooperation mechanism </SectionTitle> <Paragraph position="0"> The transfer and analysis processes operate autonomously but cooperatively to produce the translation result, as shown in Figure 4.</Paragraph> <Paragraph position="1"> At present, we are providing analysis knowledge for normalization \[Nagao 84\] and for structuring with TDMT. In the following sections we will explain the cooperation mechanism between transfer and analysis based on these two kinds of analysis knowledge. Normalization is putting together minor colloquial expressions into standard expressions. It leads to robust translation and efficient knowledge storage. Analysis knowledge for normalization is utilized to recover the ellipsis of function words such as particles, and to normalize some variant forms, such as sentence-final forms, into normal forms. Such knowledge helps the application of transfer knowledge to the input sentence. The sentence &quot;Watakushi wa Suzuki desu {I, particle, Suzuki, complementizer}&quot; is translated into &quot;I am Suzuki&quot; by applying transfer knowledge such as the following:</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> X wa Y desu => X' be Y' </SectionTitle> <Paragraph position="0"> However, in spoken Japanese, particles are frequently omitted. The sentence &quot;Watakushi Suzuki desu&quot; is natural spoken Japanese.
It is normalized to &quot;Watakushi wa Suzuki desu&quot;, which has the omitted particle &quot;wa&quot; recovered, by applying the following analysis knowledge: Pronoun Proper-Noun => Pronoun wa Proper-Noun (a set of examples) The analysis module sends the information about the application of the analysis knowledge to the transfer module. The transfer module receives the information and applies the transfer knowledge to produce the English sentence &quot;I am Suzuki&quot;. By means of examples, this kind of analysis knowledge can also classify the particles to be recovered, as shown by example sets such as ((kaigi{conference}, sanka-suru{participate}), ...). This analysis knowledge allows the recovery of various particles, such as: &quot;hoteru yoyaku-suru&quot; -> &quot;hoteru o yoyaku-suru&quot;, &quot;kaigi sanka-suru&quot; -> &quot;kaigi ni sanka-suru&quot;. Analysis knowledge for normalization also has the advantage of making the scale of knowledge more economical and the translation processing more robust. Structuring is the recognition of structure components by the insertion of a marker in order to apply transfer knowledge to each structure component. Analysis knowledge for structuring is applied to detect special linguistic phenomena such as adnominal expressions, wh-expressions, and discontinuities, so as to assign a structure to the SE.</Paragraph> <Paragraph position="1"> Adnominal expressions appear with high frequency in Japanese, corresponding to various English expressions such as relative clauses, infinitives, pronouns, gerunds, and subordinate clauses. They can be detected by means of inflectional forms. Three components of adnominal expressions must be considered in the translation process: the modification relationship, the modifier, and the modified. Analysis information for structuring is used to insert a marker at the boundary between the modifier and the modified. The following analysis knowledge can be constructed.</Paragraph> </Section> </Section> <Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> Adnominal-inflection CN => Adnominal-inflection Adnominal-marker CN </SectionTitle> <Paragraph position="0"> (a set of examples) This knowledge identifies adnominal relationships and separates the modifier from the modified so that transfer knowledge can be applied. When the transfer module receives the information about the application of this analysis knowledge, it applies the transfer knowledge needed to translate each component of the expression: the adnominal relationship, the modifier, and the modified. The scope of the modifier and the modified is determined by the total distance of each structure in which transfer knowledge is applied.</Paragraph> <Paragraph position="1"> The following transfer knowledge about the adnominal relation determines the English expression by distance calculation with examples before and after the marker:</Paragraph> <Paragraph position="3"/> For example, this analysis knowledge is applied to &quot;Kyoto eki e iku basu{Kyoto station particle go bus}&quot;, and the revised SE &quot;Kyoto eki e iku Adnominal-marker basu&quot; is produced. Then, by the application of the above transfer knowledge about the adnominal relation and the following transfer knowledge about the modifier and the modified, the translation result &quot;the bus that goes to the Kyoto station&quot; is produced.</Paragraph> <Paragraph position="4"/> <Paragraph position="5"/> </Section>
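<Paragraph> Before turning to the prototype system, the cooperation between normalization analysis and transfer described in Section 4.2 (the recovery of the particle &quot;wa&quot; in &quot;Watakushi Suzuki desu&quot;) can be sketched as follows (illustrative only: the part-of-speech tags, the recovery rule, and the toy lexicon are simplified stand-ins for the actual TDMT knowledge):

# Illustrative sketch of cooperation between normalization analysis and
# transfer (Section 4.2) for "Watakushi Suzuki desu" -> "I am Suzuki".
# POS tags, the recovery rule, and the lexicon are simplified stand-ins.
POS = {"watakushi": "Pronoun", "suzuki": "Proper-Noun", "desu": "Copula"}
LEXICON = {"watakushi": "I", "suzuki": "Suzuki"}

def normalize(words):
    """Analysis knowledge for normalization: recover the omitted particle
    "wa" between a pronoun and a proper noun."""
    out = []
    following = list(words[1:]) + [None]
    for w, nxt in zip(words, following):
        out.append(w)
        if POS.get(w) == "Pronoun" and POS.get(nxt) == "Proper-Noun":
            out.append("wa")             # recovered particle
    return out

def transfer(words):
    """Pattern-level transfer knowledge: X wa Y desu => X' be Y'
    (with "be" crudely inflected for a first-person X)."""
    if len(words) == 4 and words[1] == "wa" and words[3] == "desu":
        x, y = LEXICON[words[0]], LEXICON[words[2]]
        be = "am" if x == "I" else "is"
        return " ".join([x, be, y])
    return None

source = ["watakushi", "suzuki", "desu"]     # particle omitted in speech
print(transfer(normalize(source)))           # -> I am Suzuki
</Paragraph>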
<Section position="8" start_page="0" end_page="0" type="metho"> <SectionTitle> 5 TDMT Prototype System </SectionTitle> <Paragraph position="0"> A prototype Japanese-to-English system constructed to confirm the feasibility and effectiveness of TDMT is running on a Genera 8.1 LISP machine \[Furuse and Iida 92\].</Paragraph> <Paragraph position="1"> Due to the restriction of the sequential mechanism, a method for driving the necessary process at the required time has not been completely achieved. However, the following control mechanism is used to obtain the most efficient processing possible.</Paragraph> <Paragraph position="2"> * As much as possible, translation is attempted by first applying only transfer knowledge; when this fails, the system tries to apply analysis knowledge.</Paragraph> <Paragraph position="3"> * Transfer knowledge is applied at as concrete a level as possible, that is, in the order of string, pattern, and grammar level.</Paragraph> <Paragraph position="4"> In order to achieve flexible processing which exchanges the necessary translation information, a parallel implementation is under study based on the results from the prototype system.</Paragraph> <Paragraph position="5"> The knowledge base has been built from a statistical investigation of the bilingual corpus, whose domain is inquiries concerning international conference registration. The corpus has syntactic correspondences between Japanese and English. We have established transfer and analysis knowledge as follows:</Paragraph> </Section> <Section position="9" start_page="0" end_page="0" type="metho"> <SectionTitle> 6 Evaluation </SectionTitle> <Paragraph position="0"> We have evaluated the TDMT prototype system with the model conversations about conference registration, consisting of 10 dialogs and 225 sentences. The model conversations cover basic expressions. Table 1 shows the kinds of knowledge that were required to translate the model conversations.</Paragraph> <Paragraph position="1"> Table 1 Knowledge Necessary to Translate the Model Conversations (total number of sentences = 225): string-level knowledge only: 73 sentences (32.4%); pattern- and string-level knowledge only: 90 sentences (40.0%); grammar-level transfer knowledge needed: 21 sentences (9.3%); analysis knowledge needed: 41 sentences (18.2%).</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <Paragraph position="0"> At present, the prototype system can produce output quickly by means of the example-based framework. 200 of the sentences are translated correctly, providing a success rate of 88.9%. The coverage by string- and pattern-level knowledge is wider than expected.</Paragraph> <Paragraph position="1"> Table 2 shows the main causes of incorrect sentences. Table 2 Causes of Incorrect Sentences (total number of incorrect sentences = 25): (1) inability to get such TEs as elided objects: 9 occurrences; (2) selection of incorrect TEs: 8; (3) error in adverb position: 4; (4) incorrect declension: 1; (5) incorrect tense: 1; (6) etc.: 2. The second factor shows that an elaboration of the distance calculation and an enrichment of the examples are needed.
The first, third, and fourth factors are caused by a shortage of generation knowledge. The fifth factor is caused by a shortage of analysis knowledge. These facts show that cooperative control which flexibly communicates various kinds of knowledge, including context and generation knowledge, and various kinds of frameworks, such as rule-based and statistical frameworks, is useful for improving the translation performance.</Paragraph> </Section> </Section> <Section position="10" start_page="0" end_page="0" type="metho"> <SectionTitle> 7 Related Research </SectionTitle> <Paragraph position="0"> The example-based approach was advocated by Nagao \[Nagao 84\]. The essence of this approach is (a) the retrieval of similar examples from a bilingual database and (b) the application of those examples to translate the input. Other research has emerged following this line, including EBMT \[Sumita and Iida 91\], MBT \[Sato and Nagao 90\], and ABMT \[Sadler 89\]. EBMT uses phrase examples and will be integrated with conventional rule-based machine translation. MBT and ABMT use dependency trees of examples and translate the whole sentence by matching expressions and by a left-to-right search for maximal matching. TDMT utilizes an example-based framework for various processes as the method of selecting the most suitable TE, and combines multi-level transfer knowledge. On the other hand, MBT and ABMT utilize uni-level knowledge only for transfer.</Paragraph> </Section> <Section position="11" start_page="0" end_page="0" type="metho"> <SectionTitle> 8 Concluding Remarks </SectionTitle> <Paragraph position="0"> TDMT (Transfer-Driven Machine Translation) has been proposed. The prototype TDMT system, which translates Japanese-to-English spoken dialogs, has been constructed with an example-based framework. The consistent description by examples smooths the cooperation between transfer and analysis, and has shown high feasibility. Important future work will include the achievement of flexible translation which effectively controls the translation process. Also important are the implementation of TDMT as distributed cooperative processing on a parallel computer and the incorporation of various kinds of processing, such as rule-based and statistical frameworks, into the cooperation mechanism.</Paragraph> </Section> </Paper>