<?xml version="1.0" standalone="yes"?>
<Paper uid="C92-2081">
  <Title>The Automatic Creation of Lexical Entries for a Multilingual MT System</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
2. ULTRA
</SectionTitle>
    <Paragraph position="0"> ULTRA (Universal Language TRAnslator) is a multilingual, interlingual machine translation system which currently translates between five languages (Chinese, English, German, Japanese, Spanish) with vocabularies in each language based on about 10,000 word senses. It makes use of recent AI, linguistic and logic programming techniques, and the system's major design criteria are that it be robust and general in purpose, with simple-to-use utilities for customization. Its special features include:
* a multilingual system with a language-independent system of intermediate representations (interlingual representations) for representing expressions as elements of linguistic acts;
* bidirectional Prolog grammars for each language incorporating semantic and pragmatic constraints;
* use of relaxation techniques to provide robustness by giving preferable or &quot;near miss&quot; translations;
* access to large machine-readable dictionaries to give rapid scaling up of size and coverage;
* multilingual text editing within an X-windows interface for easy interaction and document preparation in specific domains (e.g., business letters, pro-forma memoranda, telexes, parts orders).
[Actes de COLING-92, Nantes, 23-28 août 1992 / 532 / Proc. of COLING-92, Nantes, Aug. 23-28, 1992]</Paragraph>
    <Paragraph position="1"> Below is a sample screen from the ULTRA system. Each of the Spanish sentences in the &quot;SOURCE TEXT&quot; window has been translated into Japanese. The system has &quot;cut and paste&quot; facilities which allow a sentence from the source text to be moved to the bottom left &quot;SOURCE SENT:&quot; window, where it can be translated by selecting a target language from the choices above the &quot;TRANSLATION&quot; window (bottom right) and choosing the &quot;TRANSLATE&quot; button at the bottom of the screen. The translation then appears in the bottom right &quot;TRANSLATION&quot; window. From there, the translation can be moved to the &quot;TARGET TEXT&quot; window.</Paragraph>
    <Paragraph position="3"> The System of Intermediate Representation
The interlingual representation (IR) has been designed to reflect our assumption that what is universal about language is that it is used to perform acts of communication: asking questions, describing the world, expressing one's thoughts, getting people to do things, warning them not to do things, promising that things will get done and so on. Translation, then, can be viewed as the use of the target language to perform the same act as that which was performed using the source language. The IR serves as the basis for analyzing or for generating expressions as elements of such acts in each of the languages in the translation system.</Paragraph>
    <Paragraph position="4"> The representation has been formulated on the basis of an on-going cross-linguistic comparative analysis of hand-generated translations with respect to the kinds of information necessary for selecting the appropriate forms of equivalent expressions in the different languages in the system. We have looked at a number of different types of communication including expository texts, business letters, and e-mail messages and dialogues. This, coupled with the fact that the languages selected for the initial development stage are of different historical and typological backgrounds, has led to a solid foundation for developing a flexible and complete descriptive framework.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
The Language Components
</SectionTitle>
      <Paragraph position="0"> Each individual language system is independent of all other language systems within ULTRA. Corresponding sentences in different languages must produce the same IR and any specific IR must generate corresponding sentences in the five languages. However, the particular approach to parsing or generation which is used in each of the languages may differ.</Paragraph>
      <Paragraph position="1"> Each language has its own procedures for associating the expressions of the language with the appropriate IRs. These independent systems communicate by handing each other IRs, and no actual transfer takes place.</Paragraph>
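      <Paragraph> The hand-off of IRs between independent language systems can be illustrated with a minimal Python sketch. This is a toy under stated assumptions, not ULTRA's Prolog implementation: the class LanguageSystem, the function translate, the lookup-table "grammar", and the tuple used as an IR are all hypothetical names and shapes chosen for illustration.

```python
class LanguageSystem:
    """Toy stand-in for one language module (hypothetical, not ULTRA's code)."""

    def __init__(self, name, sentence_to_ir):
        self.name = name
        # One table serves both directions, mimicking a bidirectional grammar.
        self._to_ir = dict(sentence_to_ir)
        self._from_ir = {ir: sentence for sentence, ir in self._to_ir.items()}

    def analyze(self, sentence):
        """Map an expression of this language to its IR."""
        return self._to_ir[sentence]

    def generate(self, ir):
        """Map an IR to an expression of this language."""
        return self._from_ir[ir]


def translate(sentence, source, target):
    """Pass the IR from the source module to the target module.

    No rule here mentions a specific language pair: the IR is the only
    thing the two modules share.
    """
    return target.generate(source.analyze(sentence))


# The IR below is an invented placeholder for a communicative act.
STATION_QUESTION = ("ask", "location", "station")

english = LanguageSystem("English", {"Where is the station?": STATION_QUESTION})
spanish = LanguageSystem("Spanish", {"¿Dónde está la estación?": STATION_QUESTION})

print(translate("Where is the station?", english, spanish))
```

Because neither module refers to the other, either one could be replaced by a different parsing or generation strategy as long as it maps the same sentences to the same IRs.</Paragraph>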
      <Paragraph position="2"> Independence of the language-particular systems is of both theoretical and practical interest. Given the required equivalence of the input-output behavior of each of the language systems, this paradigm is excellent for comparing various approaches to parsing or generation for their coverage and efficacy.</Paragraph>
      <Paragraph position="3"> A new language may be added to the translation system at any time without unpredictable or negative side effects on the previously developed language systems, or on the system's overall performance.</Paragraph>
      <Paragraph position="4"> Furthermore, the addition of any new language system will have the effect of increasing the number of language pairs in the translation system by the number of languages already in the system (having developed an English-Japanese system, we need only develop the Spanish module to have an English-Spanish system and a Japanese-Spanish system, and so forth).</Paragraph>
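      <Paragraph> The growth in language pairs can be checked with a short Python sketch. It assumes that a "language pair" is counted without regard to direction, as in the English-Spanish example above; the function names language_pairs and pairs_added_by are hypothetical helpers introduced only for this illustration.

```python
from itertools import combinations


def language_pairs(languages):
    """Every unordered pair of languages that can be linked through the IR."""
    return list(combinations(languages, 2))


def pairs_added_by(new_language, languages):
    """Pairs gained when new_language joins: one per language already present."""
    return [(existing, new_language) for existing in languages]


system = ["English", "Japanese"]
gained = pairs_added_by("Spanish", system)
system.append("Spanish")

print(gained)                       # the pairs contributed by the new module
print(len(language_pairs(system)))  # total pairs once Spanish is included
```

Counted this way, the five languages currently in ULTRA give ten pairs, and a sixth language would contribute five more while requiring only one new module.</Paragraph>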
      <Paragraph position="5"> At present, we have developed five prototype language systems for ULTRA. Each system has been implemented in PROLOG as a bidirectional parser/generator. That is to say, in a given language system, the same algorithm is used to do either the analysis or the generation of the expressions of the language.</Paragraph>
      <Paragraph position="6"> The system is capable of handling a wide range of phenomena, including compound and complex sentences, relative clauses, complex noun phrases, questions (yes-no and Wh types) and imperatives. There will always be certain classes of non-standard input (e.g. &quot;Where station?&quot;) which fall outside the system's normal capabilities. To deal with such irregular input, we are developing a number of techniques which together we call &quot;relaxation&quot;. Our assumption is that if a given string or IR cannot be successfully processed even though all the lexical items are available in the system, it should be reprocessed with the various constraints systematically weakened.</Paragraph>
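      <Paragraph> One way to picture reprocessing with weakened constraints is the Python sketch below. It is an assumption-laden illustration, not the relaxation method used in ULTRA: the text does not say in which order constraints are weakened, so dropping them one at a time from the end of a priority-ordered list is our assumption, and process_with_relaxation, toy_parse, and the two constraint functions are hypothetical.

```python
def process_with_relaxation(item, constraints, process):
    """Try the full constraint set first, then retry with fewer constraints.

    Assumption: constraints are ordered from most to least important, and
    weakening means dropping the last one. Returns (result, constraints_used),
    or (None, []) if processing fails even with no constraints.
    """
    active = list(constraints)
    while True:
        result = process(item, active)
        if result is not None:
            return result, active
        if not active:
            return None, []
        active = active[:-1]  # weaken: drop the least important constraint


def has_question_word(tokens):
    """Toy constraint: the input opens with a question word."""
    return bool(tokens) and tokens[0] in {"where", "what"}


def has_verb(tokens):
    """Toy constraint: the input contains a copular verb."""
    return any(token in {"is", "are"} for token in tokens)


def toy_parse(tokens, active_constraints):
    """Hypothetical parser: succeeds only if every active constraint holds."""
    if all(check(tokens) for check in active_constraints):
        return ("ask", "location", tokens[-1])
    return None


constraints = [has_question_word, has_verb]
result, used = process_with_relaxation(["where", "station"], constraints, toy_parse)
print(result, [check.__name__ for check in used])
```

The verbless input fails under the full constraint set and succeeds once the verb requirement is dropped, which corresponds to a "near miss" analysis; a well-formed input would be accepted on the first attempt with both constraints active.</Paragraph>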
    </Section>
  </Section>
</Paper>