<?xml version="1.0" standalone="yes"?>
<Paper uid="C86-1154">
  <Title>QUALIFIERS (:PG (:PREP &quot;mit&quot;) (:POBJ (:NG (:HEAD &quot;System&quot;) (:FEATURES (:DET INDEF) (:CAS DAT)) (:POSSATTR (:PG (:PREP &quot;zu&quot;) (:POBJ (:NG (:HEAD &quot;Unterstützung&quot;) (:FEATURES (:NUM SG) (:DET DEF)) (:POSSATTR (:NG (:HEAD &quot;Entwicklung&quot;) (:FEATURES (:DET DEF) (:NUM SG)) (:QUALIFIERS</Title>
  <Section position="1" start_page="0" end_page="653" type="metho">
    <SectionTitle>
West Germany
Abstract
</SectionTitle>
    <Paragraph position="0"> In this paper we will report on our experiences from a 2 1/2 year project that designed and implemented a prototypical Japanese to German translation system for titles of Japanese papers.</Paragraph>
    <Paragraph position="1"> Background: An American study published in Nature 308 (1984) evaluated circa 9000 Japanese scientific papers. 75 percent of them are published exclusively in Japanese, and only a fifth of Japanese papers are currently evaluated by Western refereeing and information services. The main conclusion of the study was that the general opinion that all important Japanese material is published in English is not true, at least for the applied sciences. This background and the Japanese success in many fields of modern technology have produced a wider interest in access to Japanese material and in help with overcoming the language barrier.</Paragraph>
    <Paragraph position="2"> Die Informationstechnologie und ihr Einfluß auf die Ausbildung in den USA. Die Graphgrammatik als Generierungs-Werkzeug beim Verstehen von Bildern. Ein Terminal mit hochwertigen Graphik-Funktionen, das mit einem mehrfachen Prozessor realisiert wird. Faktoren zur Beeinflussung von Wartungen und Verbesserungen von Programmen. Die Struktur des Dialogs zwischen System-Ingenieur und Software-Ingenieur. Die Entwicklung von Werkzeugen zur Einschätzung der Verarbeitungsleistung - Der Standpunkt des Managers. Ein Entwurf, der für eine Sprache zur Spezifikation von Computerhardware abgestimmt wird. Die Fallstudie bei der Simulation von Mikro-Prozessoren vom Bit-Slice-Typ auf der Ebene der Registerübertragung. Die Beeinflussung der Zuverlässigkeit und der Qualität von Software mit einem System zur Unterstützung der Entwicklung mit einem Computer.</Paragraph>
    <Paragraph position="3"> Ill.: From Japanese to German via ATLAS/II and SEMSYN</Paragraph>
    <Paragraph position="4"> 1. SEMSYN - a Japanese/German translation system  The project SEMSYN-83 - SEMSYN is an acronym for SEMantic SYNthesis - has produced a system for the generation of German from semantic representations. The combination of this generator with the ATLAS/II system of the Japanese cooperation partner FUJITSU may be seen as the first Japanese to German translation system.</Paragraph>
    <Paragraph position="5"> The analysis of the Japanese input - currently mostly titles of scientific papers from the field of information technology - and its transformation into the semantic representation is the task of ATLAS/II. SEMSYN's part is to produce a correct and understandable German text for these semantic representations.</Paragraph>
    <Paragraph position="6"> 2. The overall design of the SEMSYN system  SEMSYN's generation from FUJITSU's nets to German surface structures is done in three main steps.</Paragraph>
    <Paragraph position="7"> The first step is to transform the semantic net delivered by FUJITSU into an expression of our own frame representation language - the so-called IKBS descriptions. IKBS stands for Instantiated Knowledge Base Schemata. This transformation not only leads to a more structured representation, it also helps to keep the generation module somewhat independent of the special form of the FUJITSU interface. Ill.: SEMSYN's interface with ATLAS/II (TIT-gi): '((DEVELOP --INST-&gt; COMPUTER) (SUPPORT --OBJ-&gt; DEVELOP) (SUPPORT --INST-&gt; SYSTEM) (GIVE --INST-&gt; SYSTEM) (QUALITY --POSSESSOR-&gt; SOFTWARE) (RELIABILITY --ENUM-&gt; QUALITY) (GIVE --GOAL-&gt; RELIABILITY) (GIVE --OBJ-&gt; EFFECT) (*NIL --ST-&gt; EFFECT))</Paragraph>
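    <Paragraph> As a rough illustration of this first step, the arcs of the net shown in the illustration can be grouped by their source node, so that each node becomes a frame-like record with role slots. This is only a minimal sketch under our own assumptions: the names ARCS, group_arcs and incoming are hypothetical, Python stands in for the LISP of the original system, and the actual IKBS format is not reproduced here.

```python
from collections import defaultdict

# Arcs of the semantic net from the illustration, written as
# (source, role, target) triples. The tuple layout is an assumption
# made for this sketch, not the ATLAS/II interface format.
ARCS = [
    ("DEVELOP", "INST", "COMPUTER"),
    ("SUPPORT", "OBJ", "DEVELOP"),
    ("SUPPORT", "INST", "SYSTEM"),
    ("GIVE", "INST", "SYSTEM"),
    ("QUALITY", "POSSESSOR", "SOFTWARE"),
    ("RELIABILITY", "ENUM", "QUALITY"),
    ("GIVE", "GOAL", "RELIABILITY"),
    ("GIVE", "OBJ", "EFFECT"),
    ("*NIL", "ST", "EFFECT"),
]


def group_arcs(arcs):
    """Collect the outgoing arcs of every node into a table of role slots."""
    frames = defaultdict(dict)
    for source, role, target in arcs:
        frames[source].setdefault(role, []).append(target)
    return dict(frames)


def incoming(arcs, node):
    """List the (source, role) pairs of all arcs that point at a node."""
    return [(source, role) for source, role, target in arcs if target == node]


FRAMES = group_arcs(ARCS)
print(FRAMES["GIVE"])
print(incoming(ARCS, "SYSTEM"))
```

    In this sketch SYSTEM is the target of two arcs, one from SUPPORT and one from GIVE, which matches the generated title, where the system is both the instrument of the main action and the thing that supports the development.</Paragraph>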
    <Paragraph position="8">  The second - and probably most important - step is to decide in which way the content of the semantic representation should be uttered as German text. The output of this step is a functional description of the intended utterance in grammatical terms (IRS = Instantiated Realization Schema).</Paragraph>
    <Paragraph position="9"> The IRS description completely determines the German output. Its terminal elements are root forms of German words and their syntactic features.</Paragraph>
    <Paragraph position="10">  The third step - the generator front end SUTRA-S - takes the IRS description and produces a corresponding syntactically and morphologically correct German surface structure (Emele &amp; Momma, 1985). SUTRA-S is an extended reimplementation of the program SUTRA that was developed by Busemann in the HAM-ANS project (Busemann, 1982).</Paragraph>
    <Paragraph position="11">  3. Generation from frame descriptions  3.1 The frame description language  The formal definition of SEMSYN's frame representation is as follows:  Conceptually we distinguish the following three main classes of frames: 1. Case schemata for verb concepts or actions (among these are all those frames that have case roles as slots).</Paragraph>
    <Paragraph position="12"> 2. Concept schemata for noun concepts or &quot;picture producers&quot;.</Paragraph>
    <Paragraph position="13"> 3. Relation schemata - ENUMERATION, PURPOSE-, SCOPE-Relation etc.</Paragraph>
    <Paragraph position="15"> Ill.: Frame description for TIT-BZ. Within this scope the repertoire of the semantic representation includes: - &quot;classical&quot; case roles à la Fillmore (agent, object, method, instrument, source, goal, ...) - roles for the further specification of actions (manner, place, time, ...) - roles for the further specification of concepts (name, concern, specialize, ...) - ways to quantify and attribute concepts - modality (e.g. not, possible, ...).</Paragraph>
    <Paragraph position="16"> - conjunctive and disjunctive ENUMERATION.</Paragraph>
    <Section position="1" start_page="652" end_page="652" type="sub_section">
      <SectionTitle>
3.2 Knowledge bases during generation
</SectionTitle>
      <Paragraph position="0"> SEMSYN's main generation phase may be viewed as communication between two knowledge bases: general knowledge about the principal possibilities for realizing the semantic structures - the so-called realization schemata - and specific knowledge, mainly about the various possibilities for lexicalizing semantic symbols. The latter is stored in the semantic-to-German dictionary SLEX (Rösner, 1986).</Paragraph>
    </Section>
    <Section position="2" start_page="652" end_page="652" type="sub_section">
      <SectionTitle>
3.3 Object-oriented implementation
</SectionTitle>
      <Paragraph position="0"> The general knowledge about possible realizations has been implemented using the FLAVOR system of the LISP machine.</Paragraph>
      <Paragraph position="1"> The classes of the frame representation correspond to flavor classes. Realization schemata and the knowledge about the realization of roles are defined as flavor methods. This object-oriented architecture has proven to be very flexible. It supported experimenting with the system and its step-by-step improvement.</Paragraph>
    </Section>
    <Section position="3" start_page="652" end_page="653" type="sub_section">
      <SectionTitle>
3.4 Realization schemata
</SectionTitle>
      <Paragraph position="0"> Frame descriptions as used in SEMSYN are recursive structures and so is - in general - the control structure in SEMSYN's generation. In other words: the same decisions have to be made again on each level of embedding. In embedded frames, of course, some decisions are already restricted by the context.</Paragraph>
      <Paragraph position="1"> What will be the syntactic form of the text generated for such a frame? At least for case schemata the first alternative is the choice between the realization types :CLAUSE and :NG (noun group). For semantic structures from titles our default was to generate a noun group (a toplevel case schema was lexicalised as a noun). Only in a few cases did we have titles that had to be generated as questions like &quot;What is a model of ...?&quot;.</Paragraph>
      <Paragraph position="2">  If the general syntactic form has been decided upon, there are more choices: a clause for example could be realized as an active or a passive clause. Within a noun group the attribute could be realized as a relative clause or in the form of a prepositional group.</Paragraph>
      <Paragraph position="3">  These decisions are made with respect to several factors. One is the type of the roles that are actually filled. If a case schema, for example, has an :OBJECT but no :AGENT, we prefer the passive construction in a clause realization. Stylistic preferences can be another factor. In the above case a preference could be to avoid the passive, so we would take the realization schema &quot;ACTIVE with an anonymous agent 'man'&quot;.</Paragraph>
      <Paragraph position="4"> In titles these preferences come from global switches. In real text they could come from the context.</Paragraph>
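      <Paragraph> The decision logic described in this section can be sketched as a small function. This is a hedged illustration only: the function name, its parameters and the schema labels other than :CLAUSE and :NG are hypothetical, and the original system implements these choices as flavor methods on the LISP machine rather than in Python.

```python
def choose_realization(roles, toplevel_title=True, avoid_passive=False):
    """Pick a realization schema label for a case schema.

    roles is the set of filled role names, e.g. {":OBJECT"}.
    toplevel_title and avoid_passive stand in for the global switches
    mentioned in the text; both names are assumptions of this sketch.
    """
    if toplevel_title:
        # Default for titles: lexicalise the case schema as a noun group.
        return ":NG"
    if ":OBJECT" in roles and ":AGENT" not in roles:
        # No agent is available for the subject position.
        if avoid_passive:
            # Stylistic preference: active clause with the anonymous agent "man".
            return ":CLAUSE-ACTIVE-MAN"
        return ":CLAUSE-PASSIVE"
    return ":CLAUSE-ACTIVE"


print(choose_realization({":OBJECT"}))
print(choose_realization({":OBJECT"}, toplevel_title=False))
print(choose_realization({":OBJECT"}, toplevel_title=False, avoid_passive=True))
```

      Keeping the stylistic preference as a parameter mirrors the remark that in titles such preferences come from global switches, while in running text they could be derived from context.</Paragraph>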
    </Section>
    <Section position="4" start_page="653" end_page="653" type="sub_section">
      <SectionTitle>
3.5 Role realizations
</SectionTitle>
      <Paragraph position="0"> For frames without roles - the so-called terminal structures - the realization is more or less the lexicalisation of the semantic symbol. After this, process control and the produced IRS structure are handed back to the surrounding frame or to the toplevel.</Paragraph>
      <Paragraph position="1"> If there are roles, there is some more work to be done.</Paragraph>
      <Paragraph position="2"> Some fillers of roles are realized as distinct structures of their own (mostly noun groups). They could be uttered on their own.</Paragraph>
      <Paragraph position="3"> Other roles only lead to changes in the IRS structure of their frame: - decisions about syntactic features: fillers of a :NUMBER role may, for example, lead to the pluralization of the noun group of the modified frame.</Paragraph>
      <Paragraph position="4"> - creation of noun compounds as head of the actual nominal group: the filler of a :NAME role may become a prefix (&quot;das SEMSYN-Projekt&quot;). This holds as well for the terminal filler of a :SPECIALIZE role (variant: realization as an adjective). A negative :MODALITY could - in a noun group realization - lead to the prefix &quot;Nicht-&quot;.</Paragraph>
      <Paragraph position="5"> For those frames that have roles with realizations of their own, this procedure is repeated recursively for the frame descriptions of the fillers of those slots.</Paragraph>
      <Paragraph position="6"> For realized role fillers it has to be decided how their IRS structure shall be integrated into the overall structure (mostly as a prepositional group) and which syntactic features can additionally be inferred.</Paragraph>
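      <Paragraph> The roles that only modify the IRS structure of their own frame can be illustrated with a small sketch. The dictionary layout, the function name realize_noun_group and the filler values "PLURAL" and "NOT" are assumptions made for this example; they are not taken from the SEMSYN implementation.

```python
def realize_noun_group(head, roles):
    """Build a small IRS-like record for a noun group.

    head is the German root form chosen for the concept; roles maps
    role names to terminal fillers. Roles handled here change the
    record of their own frame instead of producing a structure of
    their own.
    """
    irs = {"head": head, "features": {"num": "SG"}}
    for role, filler in roles.items():
        if role == ":NUMBER" and filler == "PLURAL":
            # A :NUMBER filler sets a syntactic feature of the noun group.
            irs["features"]["num"] = "PL"
        elif role == ":NAME":
            # A :NAME filler becomes the first part of a noun compound.
            irs["head"] = filler + "-" + irs["head"]
        elif role == ":MODALITY" and filler == "NOT":
            # A negative modality is realized as the prefix "Nicht-".
            irs["head"] = "Nicht-" + irs["head"]
    return irs


print(realize_noun_group("Projekt", {":NAME": "SEMSYN"}))
print(realize_noun_group("Verfahren", {":NUMBER": "PLURAL"}))
```

      Roles whose fillers are frames with a realization of their own would instead trigger a recursive call and an integration step, which this sketch leaves out.</Paragraph>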
    </Section>
  </Section>
  <Section position="2" start_page="653" end_page="653" type="metho">
    <SectionTitle>
4. Inferring of missing information
</SectionTitle>
    <Paragraph position="0"> SEMSYN's generation module starts from a semantic representation that was designed to be language independent. For the primitives used - especially for the semantic relations expressed by the arcs in the semantic net - this may be true.</Paragraph>
    <Paragraph position="1"> On the other hand, the data delivered to us by FUJITSU are not really universal representations. The fact that the semantic nets are derived from Japanese is recognizable if one looks at the information that is not explicitly represented.</Paragraph>
    <Paragraph position="2"> In Japanese the number and definiteness of nouns and the tense of verbs are normally not expressed - correspondingly, our data do not have semantic correlates for these features (except in the rare case when they have been expressed in the Japanese original). The Japanese reader infers the missing information from the context. In titles there is no such context available.</Paragraph>
    <Paragraph position="3"> For correct and acceptable German on the other hand we need determiners and our nouns need a number. Therefore we had to develop heuristics to reconstruct this information.</Paragraph>
    <Paragraph position="4"> Some examples of such heuristics: - a nominalized case frame has to be realized with the definite article in the singular (&quot;Die Generierung natürlicher Sprache&quot;). - the :OBJECT role of a nominalized case frame should be realized as indefinite and plural (&quot;Die Generierung von Titeln&quot;), except in cases where SLEX holds exception information (&quot;Die Wartung von Software&quot;).</Paragraph>
    <Paragraph position="5"> - concepts that have a :NAME role will be realized as definite and singular (&quot;Die Fourier-Transformation&quot;).</Paragraph>
    <Paragraph position="6"> If no heuristic is applicable and no SLEX information is found, we use the title defaults 'indefinite' and 'singular' (&quot;Ein Verfahren&quot;).</Paragraph>
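    <Paragraph> The heuristics listed above can be written down as an ordered rule cascade. The following is a minimal sketch under stated assumptions: the function name, the frame layout and the SLEX lookup table are hypothetical, and the order in which the rules are tried (SLEX information first, then the three heuristics, then the title default) is our own choice, since the text does not specify how conflicts are resolved.

```python
def infer_det_and_num(frame, role_in_parent=None, parent_nominalized=False, slex=None):
    """Return a (determiner, number) pair for a noun group.

    frame is a dict with a "concept" key and optional "nominalized" and
    "roles" keys; slex maps concepts to explicit (determiner, number)
    information and stands in for the dictionary SLEX.
    """
    slex = slex or {}
    if frame["concept"] in slex:
        # Explicit lexical information overrides the heuristics.
        return slex[frame["concept"]]
    if frame.get("nominalized"):
        # Nominalized case frame: definite article, singular.
        return ("DEF", "SG")
    if role_in_parent == ":OBJECT" and parent_nominalized:
        # Object of a nominalized case frame: indefinite, plural.
        return ("INDEF", "PL")
    if ":NAME" in frame.get("roles", {}):
        # Named concepts: definite, singular.
        return ("DEF", "SG")
    # Title default when nothing else applies.
    return ("INDEF", "SG")


print(infer_det_and_num({"concept": "GENERATE", "nominalized": True}))
print(infer_det_and_num({"concept": "TITLE"}, ":OBJECT", True))
print(infer_det_and_num({"concept": "METHOD"}))
```

    With a SLEX entry such as SOFTWARE mapped to indefinite singular, the second rule is bypassed, which corresponds to the exception case &quot;Die Wartung von Software&quot;.</Paragraph>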
  </Section>
  <Section position="3" start_page="653" end_page="653" type="metho">
    <SectionTitle>
5. Concluding remarks
</SectionTitle>
    <Paragraph position="0"> Our current concern is to broaden the applicability of SEMSYN's generator for German: on the one hand we are experimenting with the generation of full texts (e.g. newspaper stories), on the other hand we are extending the repertoire of feasible semantic structures that may serve as input for the generator.</Paragraph>
  </Section>
  <Section position="4" start_page="653" end_page="653" type="metho">
    <SectionTitle>
Acknowledgement:
</SectionTitle>
    <Paragraph position="0"> SEMSYN-83 was funded by the West German Ministry for Research and Technology (BMFT) from July 1983 until February 1986. Special thanks to all the colleagues who collaborated - for shorter or longer periods - within this project: Kenji Hanakata, Joachim Laubsch, Arek Lesniewski and Shoichi Yokoyama.</Paragraph>
  </Section>
</Paper>