<?xml version="1.0" standalone="yes"?>
<Paper uid="C82-1039">
  <Title>AN ENGLISH JAPANESE MACHINE TRANSLATION SYSTEM OF THE TITLES OF SCIENTIFIC AND ENGINEERING PAPERS Makoto Nagao, Jun-ichi Tsujii (Kyoto University)</Title>
  <Section position="1" start_page="0" end_page="245" type="abstr">
    <SectionTitle>
JAPAN
</SectionTitle>
    <Paragraph position="0"> The title sentences of scientific and engineering papers are analyzed by simple parsing strategies, and only eighteen fundamental sentential structures are obtained from ten thousand titles. Using these fundamental structures, English title sentences in physics and mathematics taken from several databases are translated into Japanese, together with their keywords, author names, journal names and so on. The translation accuracy for the specific areas of physics and mathematics from the INSPEC database was about 93%.</Paragraph>
    <Paragraph position="1"> 1. INTRODUCTION There has been much research on the syntactic analysis of natural language by computer, but no reliable grammatical rules have yet been established that are applicable to any utterance of a language. Universal grammatical rules for a language look almost hopeless. The grammatical rules to be prepared depend heavily on the text to be analyzed. Hence the concept of a subgrammar is introduced. A subgrammar does not necessarily cover all the different kinds of sentential structures of a language. A grammar which covers just the set of expressions to be treated is sufficient from the engineering point of view.</Paragraph>
    <Paragraph position="2"> We developed a machine translation system which translates the titles of scientific and engineering papers from English into Japanese. More than 98% of the titles of scientific and engineering papers are noun phrases, so the system is designed to translate only noun phrases. Verbs can be used in the forms to + infinitive, verb-ing, and verb-ed. The system cannot treat embedded sentences which are introduced by relative pronouns.</Paragraph>
    <Paragraph position="3"> The essential structures the system can treat are therefore composed of simple noun phrases, verbs in the forms to-infinitive, verb-ing, and verb-ed, and prepositions. Here a simple noun phrase means the juxtaposition (endocentric structure) of adjectives, nouns, and some other elements. The word order of a simple noun phrase can be the same in English and Japanese. The sentential structure obtained after parsing each simple noun phrase into a noun is called a skeleton pattern. We can expect the variety of such skeleton patterns to be very small for the restricted area of titles of scientific and engineering papers.</Paragraph>
    <Paragraph position="4"> When the variety is very small, we do not need further syntactic analysis for these skeleton patterns. For each skeleton pattern the corresponding Japanese skeleton pattern (word order change) can be given. Thus the subgrammar in this system is a very peculiar one: an accumulation of heuristics about title structures. We utilized this specific nature of titles in our machine translation system. The correct translation rate for the wide variety of scientific and engineering papers is about 80%, but for the specific areas of physics and mathematics from the INSPEC database the score was about 93%.</Paragraph>
    <Paragraph position="5"> The system is now used for the conversion of English databases into Japanese databases. This system has thus opened a way for Japanese people to access English databases in their own language.</Paragraph>
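    <Paragraph position="6"> As a rough illustration of the skeleton-pattern idea described above, the following Python sketch collapses each simple noun phrase of a title into a noun symbol N, looks up the resulting English skeleton pattern, and reorders the noun phrases according to a Japanese skeleton pattern. It is only a minimal sketch under stated assumptions: the preposition list, the three skeleton patterns, the small lexicon, and the function names collapse and translate are hypothetical examples introduced here, not the eighteen structures or the dictionary of the actual system, and verb forms are ignored.

```python
# Hypothetical illustration: the patterns and lexicon below are invented
# examples, not the eighteen structures reported in the paper.

PREPOSITIONS = {"of", "for", "in"}

# English skeleton -> Japanese skeleton. Integers index the collapsed noun
# phrases; strings are Japanese particles inserted between them.
SKELETONS = {
    ("N", "of", "N"): [1, "の", 0],
    ("N", "for", "N"): [1, "のための", 0],
    ("N", "of", "N", "in", "N"): [2, "における", 1, "の", 0],
}

# Toy word-for-word dictionary (assumed for this example only).
LEXICON = {
    "analysis": "解析",
    "crystal": "結晶",
    "structure": "構造",
    "method": "方法",
    "pattern": "パターン",
    "recognition": "認識",
}


def collapse(title):
    """Split a title into simple noun phrases and a skeleton pattern."""
    phrases, skeleton, current = [], [], []
    for word in title.lower().split():
        if word in PREPOSITIONS:
            # A preposition closes the noun phrase collected so far.
            phrases.append(current)
            skeleton.extend(["N", word])
            current = []
        else:
            current.append(word)
    phrases.append(current)
    skeleton.append("N")
    return phrases, tuple(skeleton)


def translate(title):
    """Translate a title, or return None if its skeleton is unknown."""
    phrases, skeleton = collapse(title)
    template = SKELETONS.get(skeleton)
    if template is None:
        return None
    parts = []
    for item in template:
        if isinstance(item, int):
            # Word order inside a simple noun phrase is kept as in English.
            parts.append("".join(LEXICON.get(w, w) for w in phrases[item]))
        else:
            parts.append(item)
    return "".join(parts)


print(translate("Analysis of crystal structure"))
print(translate("Method for pattern recognition"))
```

In this sketch a title whose skeleton is not in the table is simply rejected (translate returns None), which mirrors the engineering view taken above: the subgrammar only has to cover the expressions that actually occur in titles.</Paragraph>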
  </Section>
</Paper>