File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/02/w02-1613_abstr.xml
Size: 8,469 bytes
Last Modified: 2025-10-06 13:42:41
<?xml version="1.0" standalone="yes"?> <Paper uid="W02-1613"> <Title>Automatic Information Transfer Between English And Chinese</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> This paper introduces the translation choice and transfer modules of an English-Chinese machine translation system. Translation choice is performed on the basis of a grammar tree and treats the context as a word bag, with lexical and POS tag information as context features. The Bayes minimum error probability is taken as the evaluation function for candidate translations. The rule-based transfer and generation module takes the parse tree as input and operates on POS tag, semantic, or even lexical information.</Paragraph> <Paragraph position="1"> Introduction Machine translation is urgently needed to overcome the language barrier between nations. The task of machine translation is to map one language to another. At present there are three main approaches to machine translation [Zhao 2000]: 1) Pattern/rule-based systems: production rules form the main body of the knowledge base. The rules or patterns are either written manually or acquired automatically from a training corpus. 2) Example-based method: the knowledge base is a bilingual corpus of source slices S' and their translations T'. Given an input source slice S, match S against the source slices and either take the translation of the most similar one or derive the translation from it. 3) Statistics-based method: a method based on a monolingual language model and a bilingual language model.</Paragraph> <Paragraph position="2"> The probabilities are acquired from large-scale (bilingual) corpora.</Paragraph> <Paragraph position="3"> Machine translation involves more than processing a single natural language (e.g.</Paragraph> <Paragraph position="4"> Chinese). 
The grammatical and semantic characteristics of both the source language and the target language must be considered. In short, the characteristics of bilingual translation are the essence of a machine translation system.</Paragraph> <Paragraph position="5"> A machine translation system usually includes three subsystems [Zhao 1999]: (1) Analysis: analyse the source language sentence and generate a syntactic tree with syntactic functional tags; (2) Transfer: map the source parse tree into a target language parse tree; (3) Generation: generate the target language sentence from the target language syntactic tree.</Paragraph> <Paragraph position="6"> The MTS2000 system developed at Harbin Institute of Technology is a bi-directional machine translation system that combines stochastic and rule-based methods. Figure 1 shows the flow of the system.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> Figure 1. Flowchart of the MTS2000 system: Input English Sentence → Morphology Analysis → Syntactic Analysis → Word Translation Choice → Transfer and Generation → Output Chinese Sentence </SectionTitle> <Paragraph position="0"> Analysis and transfer are separated in the architecture of the MTS2000 system. This modularisation helps integrate the stochastic method with the rule-based method.</Paragraph> <Paragraph position="1"> New techniques are also easier to integrate into a modularised system. Two modules implement the transfer and generation steps that follow analysis of the source sentence. The specific task of transfer and generation is to produce a target language sentence given the source language syntactic tree. In detail, given an English syntactic tree (e.g. 
S[PP[ In/IN</Paragraph> <Paragraph position="3"> using knowledge sources such as grammatical features and simple semantic features, the task is to construct a Chinese syntactic tree whose terminal nodes, in sequence, form the Chinese translation.</Paragraph> <Paragraph position="4"> The input sentence is analysed by the morphology analyser, the part-of-speech tagger, and the syntactic analyser. These steps yield a multi-level syntactic parse tree with functional tags [Meng 2000].</Paragraph> <Paragraph position="5"> The parser flow is as follows: Figure 2. Parser based on Hybrid Methods. At present, our English parser is able to generate syntactic tree i way. The English parsing tree, with t information about relations in the source sentence, information of the nodes, i of transfer and generation. The infor the nodes is the starting point of transfer and generation. After syntactic parsing, th transfer and generation includes word translation choice of ambiguous adjustment and insertion/deletion of functional words. Transfer and generation are implemented using two modules: one for word translation choice, the other for structure transfer and translation m 1 Parsing Based Transl First we will give a formal description for translation choice in m [Manning 1999]: Suppose the source sentence to be translated is ES, and the ambiguous word EW in that sentence has M target translations CW1, CW2, ..., CWM. The translations occur in a specific context C with probabilities P(CW1|C), P(CW2|C), ..., P(CWM|C).</Paragraph> <Paragraph position="7"> By the Bayes minimum error probability formula, the chosen translation is CW' = argmax_k P(CWk|C).</Paragraph> <Paragraph position="9"> In general, when P(CW1|C)>P(CW2|C)>...>P(CWM|C) holds, we choose CW1 as the translation for EW. 
From the Naive Bayes formula, P(CWk|C) is proportional to P(CWk) multiplied by the product of P(vj|CWk) over all context features vj in C,</Paragraph> <Paragraph position="11"> where P(CWk) denotes the probability that CWk occurs in the corpus, and P(vj|CWk) denotes the probability that the context feature vj co-occurs with translation CWk.</Paragraph> <Paragraph position="12"> A general algorithm for supervised word sense disambiguation is as follows:
comment: Training
for all senses sk of w do
    for all words vj in the vocabulary do
        P(vj|sk) = C(vj, sk)/C(sk)
    end
end
for all senses sk of w do
    P(sk) = C(sk)/C(w)
end
comment: Disambiguation
for all senses sk of w do
    score(sk) = log P(sk)
    for all words vj in the context window c do
        score(sk) = score(sk) + log P(vj|sk)
    end
end
choose s' = argmax_sk score(sk)
From the above formal description we can see that the key to stochastic word translation is selecting the proper context and context features vj. Existing methods often define a word window of some size, i.e. they assume that only words within the window contribute to the translation choice of the ambiguous word. For example, [Huang 1997] uses a word window of 6 words for word sense disambiguation; [Xun 1998] defines a moveable window of 4 words; [Ng 1997] uses a word window with offset ±2. This approach has two problems: (1) some words that are informative for sense disambiguation may not be covered by the window; (2) some words covered by the window contribute nothing to the sense choice and only introduce noise. After a broad investigation of a large number of ambiguous words, we choose the context according to the correlation of the context words with the ambiguous word, not merely their distance from it.</Paragraph> <Paragraph position="13"> Based on the above analysis, we adopt a translation choice method based on syntactic analysis. The translation choice module is placed between the parser and the generator, and it acquires a context set for the ambiguous word. 
When choosing the translation, we treat the context set as a word bag, i.e. the grammatical context serves as the word bag. Individual word positions are not considered; only lexical and part-of-speech information is taken as context features. The Bayes minimum error probability is taken as the evaluation function for word translation choice.</Paragraph> <Paragraph position="14"> In this paper, the grammatical context is used for word translation choice. Structure-related features of the ambiguous word are taken into account to make full use of the parsing result. The method has the following characteristics: (1) the window size is not set by hand but determined by the grammatical structure of the sentence, so useful context features are acquired more efficiently; (2) context features that are unrelated in the sentence structure are filtered out of the translation choice; (3) the features are based on structural relationships and do not require a 100% correct parsing result. These characteristics make the method practical.</Paragraph> </Section> </Section> </Paper>
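<!--
The translation choice step described above can be sketched as a small Naive Bayes classifier over a bag of context features. This is a minimal illustration, not the MTS2000 implementation. The feature strings ("lex=..." and "pos=..."), the labels CW1 and CW2, the toy training data, and the add-one smoothing are all assumptions introduced here. The paper specifies only that lexical and POS information from the grammatical context are used as features and that the translation with the highest Bayes score is chosen.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Collect counts from (translation, context_features) pairs.

    Each context feature is a string such as "lex=money" or "pos=NN".
    This naming scheme is hypothetical, chosen only for the sketch.
    """
    sense_count = Counter()            # training examples per translation
    feat_count = defaultdict(Counter)  # C(vj, sk): feature counts per translation
    vocab = set()                      # all features seen in training
    for translation, features in examples:
        sense_count[translation] += 1
        for feat in features:
            feat_count[translation][feat] += 1
            vocab.add(feat)
    return sense_count, feat_count, vocab


def choose_translation(model, features):
    """Return the translation with the highest log score.

    score(sk) = log P(sk) + sum over context features of log P(vj|sk).
    Add-one smoothing is an assumption of this sketch (not from the paper);
    features never seen in training are skipped.
    """
    sense_count, feat_count, vocab = model
    total = sum(sense_count.values())
    best, best_score = None, -math.inf
    for sense in sorted(sense_count):
        score = math.log(sense_count[sense] / total)
        denom = sum(feat_count[sense].values()) + len(vocab)
        for feat in features:
            if feat in vocab:
                score += math.log((feat_count[sense][feat] + 1) / denom)
        if score > best_score:
            best, best_score = sense, score
    return best


# Toy data: CW1 and CW2 stand for two candidate translations of one
# ambiguous English word; the contexts are invented for illustration.
EXAMPLES = [
    ("CW1", ["lex=money", "pos=NN", "lex=deposit", "pos=VB"]),
    ("CW1", ["lex=loan", "pos=NN", "lex=money", "pos=NN"]),
    ("CW2", ["lex=river", "pos=NN", "lex=walk", "pos=VB"]),
    ("CW2", ["lex=water", "pos=NN", "lex=river", "pos=NN"]),
]
MODEL = train(EXAMPLES)
print(choose_translation(MODEL, ["lex=money", "pos=NN"]))  # CW1
```

In a syntax-based setting, the feature list passed to choose_translation would come from the nodes structurally related to the ambiguous word in the parse tree rather than from a fixed-size word window.
-->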