<?xml version="1.0" standalone="yes"?>
<Paper uid="C92-2078">
  <Title>A REUSABLE LEXICAL DATABASE TOOL FOR MACHINE TRANSLATION</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
A REUSABLE LEXICAL DATABASE TOOL FOR MACHINE TRANSLATION
BRIGITTE BLÄSER ULRIKE SCHWALL
IBM Germany IBM Germany
</SectionTitle>
    <Paragraph position="0"/>
  </Section>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
0. ABSTRACT
</SectionTitle>
    <Paragraph position="0"> This paper describes the lexical database tool LOLA (Linguistic-Oriented Lexical database Approach), which has been developed for the construction and maintenance of lexicons for the machine translation system LMT. First, the requirements such a tool should meet are discussed; then LMT and the lexical information it requires, and some issues concerning vocabulary acquisition, are presented.</Paragraph>
    <Paragraph position="1"> Afterwards the architecture and the components of the LOLA system are described, and it is shown how we tried to meet the requirements worked out earlier. Although LOLA was originally designed and implemented for the German-English LMT prototype, it aimed from the beginning at a representation of lexical data that can be reused for other LMT or MT prototypes or even other NLP applications. A special point of discussion will therefore be the adaptability of the tool and its components as well as the reusability of the lexical data stored in the database for lexicon development for LMT or for other applications.</Paragraph>
    <Paragraph position="2"> 1. Introduction The availability of large-scale lexical information has widely been recognized as a bottleneck in the construction of Natural Language Processing (NLP) systems. The lexical database LOLA, which has been developed in connection with the Logic-programming-based Machine Translation (LMT) system, is presented here. This work is part of the objectives of the project TransLexis, launched in 1991 at the Institute of Knowledge Based Systems of the IBM Germany Scientific Center. TransLexis aims at the theoretically and empirically well-motivated lexical description and the management of the lexical information of LMT in a database. It is conceived as a first step towards a reusable lexical knowledge base.</Paragraph>
    <Paragraph position="3"> 1.1. Requirements for convenient construction and maintenance of lexicons Based on our experience and existing literature, a tool for the construction and maintenance of large NLP lexicons with a complex entry structure should meet the following requirements: □ Adequate expressive power of the representation formalism: the expressive power must be sufficient to cover the facts of lexical description.</Paragraph>
  </Section>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
ANGELIKA STORRER
</SectionTitle>
    <Paragraph position="0"> storrer at arbuckle.sns.neuphilologie.uni-tuebingen.de</Paragraph>
    <Paragraph position="1"> □ Methodology for the description of lexical information: criteria and guidelines relevant for encoding should be developed and documented. □ Orientation towards lexicographic procedure: the design of the tool should take the logical course of the lexicographic work procedure into consideration and support it during all its steps and phases. The lexicographer should be enabled to concentrate on the lexicographic description of lexical units while the tool itself automatically takes care of the remaining tasks in lexicon development.</Paragraph>
    <Paragraph position="2"> □ Consistency and integrity checking of the lexical data: when entries are added or updated, the system should reject invalid values for particular features and check whether the input leads to inconsistency of the database.</Paragraph>
    <Paragraph position="3"> □ Data independence: an extreme dependency between the structuring of lexical data stored in the database and the structure of the lexical entries in a given application system should be avoided. In this way the lexical data will remain resistant to modifications in the NLP/MT-systems that make use of these data.</Paragraph>
    <Paragraph position="4"> □ Reusability/Reversibility of the data (cf.</Paragraph>
    <Paragraph position="5"> Calzolari 1989, Heid 1991): lexical data should be represented in such a way that it can -- apart from its transfer-specific components -- be reused for other MT-prototypes with the same source or target language, or with the reverse language pair (e.g. German-English and English-German). Ideally, the lexical data should be independent to such a degree that they are also reusable for other NLP-applications.</Paragraph>
    <Paragraph position="6"> □ Multi-user access: it should be possible for several users to work on the lexicon simultaneously. □ Help facilities: the criteria and guidelines for lexical description should be easily accessible. Monolingual and bilingual dictionaries should be available to support the lexicographer's linguistic competence.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
1.2 LMT
</SectionTitle>
      <Paragraph position="0"> LMT, developed by Michael McCord, is in basic design a source-based transfer system in which the source analysis is done with Slot Grammar (cf.</Paragraph>
      <Paragraph position="1"> McCord 1989, 1990, forthcoming). Two main characteristics of LMT should be emphasized: 1. the lexicalism, arising from Slot Grammar source analysis; 2. a large language-general X-to-Y translation shell. ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992 510 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992</Paragraph>
      <Paragraph position="2"> Both features facilitate the development of prototypes for new language pairs 1 . Versions of LMT (in various stages) exist currently for nine language pairs.</Paragraph>
      <Paragraph position="3"> LMT currently requires the following types of information to be specified for lexical units (LU):</Paragraph>
      <Paragraph position="5"> □ the valency, i.e. the frame of optional/obligatory complement slots; □ the specification of the fillers (NPs, subordinate clauses) for each slot; □ semantic compatibility constraints and collocations; □ characterization of multiword lexemes;</Paragraph>
      <Paragraph position="7"> □ lexical transformations.</Paragraph>
      <Paragraph position="8"> In McCord (forthcoming), an external lexical format (ELF) is presented which allows the representation of the above information. Until now, however, the lexical data has been kept in sequential files and updating has been done with a text editor. Thus most of the above-mentioned requirements could not be met.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
1.3 Vocabulary Acquisition
</SectionTitle>
      <Paragraph position="0"> The hand-coding of dictionaries is a laborious and time-consuming task. Therefore a number of attempts have been made to exploit corpora and/or machine readable dictionaries (MRDs) for the build-up of NLP-lexicons (cf. 2.5) 2 . In many cases, however, the lexical information in MRD's is neither complete nor sufficiently explicit for NLP/MT purposes and has to be revised by lexicographers.</Paragraph>
      <Paragraph position="1"> Ideally, the demands on a lexicographer should only be of a linguistic nature. For this reason a sophisticated tool is needed to guide and support the NLP/MT-lexicographer in revising entries automatically converted from machine readable sources as well as in building up new vocabulary.</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2. LOLA - architecture and components
</SectionTitle>
    <Paragraph position="0"> The lexical database tool LOLA aims at meeting the above-mentioned requirements. Its design and development are based on work achieved in the LEX-project and the COLEX-project 3 . LOLA makes use of automatic consistency and integrity checks as well as of the support of multi-user access provided as standard facilities by the relational DBMS SQL/DS. Updates are made with the help of a user interface that supports the lexicographer during the encoding process. The representation of the lexical data has been worked out to be as independent as possible of the format of a specific application lexicon, thus increasing the degree of reusability of the lexical data. In addition, a catalogue of criteria and guidelines for lexical description is being elaborated and will be integrated into the</Paragraph>
    <Paragraph position="2"> LOLA system are the following (cf. Figure 1): 1. LOLA-DB: the database itself.</Paragraph>
    <Paragraph position="3"> 2. COLOLA (COder's interface to LOLA): interface for hand-coding and modification of the lexical data stored in LOLA-DB.</Paragraph>
    <Paragraph position="4"> 3. DB_TO_LMT: program that generates LMT lexicon entries from the lexical data stored in LOLA-DB.</Paragraph>
    <Paragraph position="5"> 4. LMT_TO_DB: program that loads already existing LMT lexicons into LOLA-DB.</Paragraph>
    <Paragraph position="6"> 5. LDB_TO_DB: program that converts data from MRD's into LOLA-DB.</Paragraph>
    <Paragraph position="7"> In the following we give a brief description of these components.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1. The Database
</SectionTitle>
      <Paragraph position="0"> The database was designed in two steps: development of the conceptual scheme and development of the database scheme.</Paragraph>
      <Paragraph position="1"> In the conceptual design phase, the lexical objects, their properties, and their interrelations were represented in an entity-relationship diagram (cf. Chen 1976).</Paragraph>
      <Paragraph position="2"> 1 LMT is the technical basis of an international project at IBM with cooperation between IBM Research, the IBM Science Centers in Heidelberg, Madrid, Paris, Haifa, and Cairo, and IBM European Language Services in Copenhagen (cf. Rimon et al. 1991).</Paragraph>
      <Paragraph position="3"> 2 Cf. Byrd et al. 1987; for an overview of related activities within the LMT-project, cf. Rimon et al. 1991, pp. 14-15. 3 Cf. Barnett et al. 1986; Blumenthal et al. 1988; Storrer 1990.</Paragraph>
      <Paragraph position="4"> Although the ER-model does not have the expressive power to cover all aspects of lexical description, especially complex constraints, it has been chosen here as a compromise between a complete lexical representation and the realization in a traditional database system.</Paragraph>
      <Paragraph position="5"> The resulting ER-diagram for the German-English lexicon is shown in Figure 2 4 .</Paragraph>
      <Paragraph position="6"> The conceptual scheme is still independent of the choice of a specific DBMS and of other implementation aspects. The basic principles of the conceptual design of our database will be sketched out in the following.</Paragraph>
      <Paragraph position="7"> Orientation towards linguistic structure, not towards the structure of the application lexicon.</Paragraph>
      <Paragraph position="8"> The diagram reflects, in the first place, the structure of the linguistic objects, their properties and their interrelations, and it is influenced to a smaller degree by the structure of the application lexicon.</Paragraph>
      <Paragraph position="9"> As a consequence, the data is quite resistant to structural changes in the format of the application lexicon. The abstraction from the structures of the application lexicon has a positive side effect with regard to the exploitation of machine readable lexical resources: on the one hand, we can handle cases in which not all information required by LMT is provided in the entries of MRD's. The information acquired can be stored as entries to be completed and revised later. On the other hand, we are free to store types of lexical information that are of relevance for NLP applications and can be acquired from MRD's or other NLP lexicons but are not processed in a current LMT-version. We can save them in the database as coding aids for the lexicographers, for future prototype versions, or for other NLP applications.</Paragraph>
      <Paragraph position="10"> Analogous structure for source and target language wherever possible.</Paragraph>
      <Paragraph position="11"> The lower part of the ER-diagram represents the German source, the upper part the English target language. For both languages, an entity of the type entry can have one or more homonyms, each of which can have one or more senses. The senses themselves can open one or more sense-specific slots (one-to-many relations). A sense-specific slot can be filled by several types of fillers and the same type of filler can fill several sense-specific slots (many-to-many relation). The basic types of entities and relations, which are the same for all languages, are described by their characteristic features represented as attributes. The number of attributes as well as their values may differ according to language-specific peculiarities 5 .</Paragraph>
      <Paragraph position="12"> Many-to-many relations between the lexical objects of both languages.</Paragraph>
      <Paragraph position="13"> We represent the relation of lexical equivalence between source and target senses as a many-to-many relation (one source sense can have multiple target equivalents and vice versa). This breaks with the traditional hierarchical entry structure of bilingual dictionaries (Calzolari et al. 1990), but it avoids redundant description and storage of one target sense that is lexically equivalent to different source senses. Another relation holds for the sense-specific slots of two senses that are regarded as lexically equivalent. We decided to establish this relation between slots and not between slot frames. This way we can elegantly describe lexically equivalent senses with non-corresponding slot frames 6 . In this way the relations between the two languages may be used to a great extent bidirectionally for the XY- as well as for the YX-language pair.</Paragraph>
      <Paragraph position="14"> The conceptual scheme captured in the ER-diagram was then mapped into a database scheme and implemented in the relational DBMS SQL/DS. We chose a relational DBMS because -- for the maintenance of the large LMT-GE lexicon (about 50,000 entries) -- we needed a stable DBMS which supports multi-user access, has facilities for automatic checking of consistency and integrity of the lexical data, and allows for the specification of multiple user-specific views on the data. To avoid redundancy and update anomalies we tried to normalize our relations as far as it was useful with respect to our approach. In total, 32 tables are implemented ... "lexical knowledge", e.g. the admitted values for attributes such as semantic type, filler-type, and slot-type for both languages.</Paragraph>
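      <Paragraph> The entry, homonym, sense hierarchy and the many-to-many equivalence relation described above can be illustrated with a small relational sketch. It is only an illustration under stated assumptions: it uses Python's sqlite3 module rather than SQL/DS, and every table and column name (entry, homonym, sense, equivalence, lemma, and so on) is hypothetical and far simpler than the 32 tables of the actual LOLA-DB.

```python
import sqlite3

# Hypothetical, simplified schema; names are illustrative, not those of LOLA-DB.
SCHEMA = """
CREATE TABLE entry   (entry_id INTEGER PRIMARY KEY, lang TEXT NOT NULL, lemma TEXT NOT NULL);
CREATE TABLE homonym (homonym_id INTEGER PRIMARY KEY,
                      entry_id INTEGER NOT NULL REFERENCES entry(entry_id),
                      pos TEXT NOT NULL);
CREATE TABLE sense   (sense_id INTEGER PRIMARY KEY,
                      homonym_id INTEGER NOT NULL REFERENCES homonym(homonym_id));
-- many-to-many lexical equivalence between source and target senses
CREATE TABLE equivalence (source_sense INTEGER NOT NULL REFERENCES sense(sense_id),
                          target_sense INTEGER NOT NULL REFERENCES sense(sense_id),
                          PRIMARY KEY (source_sense, target_sense));
"""


def build_demo():
    """Create an in-memory database with two German and two English entries."""
    con = sqlite3.connect(":memory:")
    con.execute("PRAGMA foreign_keys = ON")  # let the DBMS enforce referential integrity
    con.executescript(SCHEMA)
    rows = [(1, "de", "gefallen"), (2, "de", "moegen"), (3, "en", "like"), (4, "en", "please")]
    con.executemany("INSERT INTO entry VALUES (?, ?, ?)", rows)
    # One homonym and one sense per entry keep the demo small, so the ids coincide.
    con.executemany("INSERT INTO homonym VALUES (?, ?, 'verb')", [(i, i) for i in range(1, 5)])
    con.executemany("INSERT INTO sense VALUES (?, ?)", [(i, i) for i in range(1, 5)])
    # The target sense of "like" is stored once and linked to two source senses.
    con.executemany("INSERT INTO equivalence VALUES (?, ?)", [(1, 3), (1, 4), (2, 3)])
    return con


def targets_of(con, lemma):
    """Return the target lemmas linked to any sense of the given source lemma."""
    sql = """SELECT te.lemma FROM entry se
             JOIN homonym sh ON sh.entry_id = se.entry_id
             JOIN sense ss ON ss.homonym_id = sh.homonym_id
             JOIN equivalence q ON q.source_sense = ss.sense_id
             JOIN sense ts ON ts.sense_id = q.target_sense
             JOIN homonym th ON th.homonym_id = ts.homonym_id
             JOIN entry te ON te.entry_id = th.entry_id
             WHERE se.lemma = ? ORDER BY te.lemma"""
    return [row[0] for row in con.execute(sql, (lemma,))]
```

Because equivalence is a separate link table, the same rows could in principle be read in the opposite direction for the reverse language pair, and a link to a non-existent sense is rejected by the foreign-key check rather than by application code.</Paragraph>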
      <Paragraph position="15"> 2.2. COLOLA: the user interface to LOLA COLOLA is the user interface to LOLA-DB that looks up the lexical data of a given search word and displays it on sequentially connected menus. The design of the menus as well as their sequential order was guided by the manner in which lexicographers describe lexical entries. The following operations can be performed:</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 The boxes represent types of entities, the diamonds represent types of relations between entities, the ellipses represent
</SectionTitle>
    <Paragraph position="0"> attributes which characterize types of entities or relations. The labels of the connection lines indicate whether the relation in question is a one-to-one, one-to-many, many-to-one or many-to-many relation.</Paragraph>
    <Paragraph position="1"> The ER-diagram is a simplified version of the actual conceptual model. For the purpose of this paper, several entity types, attributes, and relations have been left out.</Paragraph>
    <Paragraph position="2"> 5 E.g.: in German a preposition like "auf" can govern either an accusative NP ("warten auf") or a dative NP ("lasten auf") depending on the verb that takes the prepositional phrase with the respective preposition as a complement.</Paragraph>
    <Paragraph position="3"> Therefore "case" is a feature relevant for the description of German slot fillers filling a prepositional complement slot. 6 E.g. cases like "like" and "gefallen" where the subject of the English verb corresponds to the dative object of the German verb; or cases like "geigen" and "play the violin" where the English direct object filler "the violin" is incorporated in the semantics of the German verb "geigen".</Paragraph>
    <Paragraph position="4"> □ addition of new source or target entries □ deletion of existing entries □ change of existing or addition of new features to existing entries □ deletion of features from existing entries □ assignment of new or deletion of existing translation equivalents for a given source sense □ update, insertion or deletion of transfer information for each pair of translation equivalents.</Paragraph>
    <Paragraph position="5"> For each part of speech, a specific sequence of menus is defined. There are menus for homonyms and senses of source and target entries; the "linking" of the source senses and target senses regarded to be lexically equivalent is done via transfer menus.</Paragraph>
    <Paragraph position="6"> This allows lexicographers to specialize on specific parts of speech or on specific features which can be locally updated.</Paragraph>
    <Paragraph position="7"> COLOLA controls multi-user access to the LOLA-DB so that several lexicographers can update the lexical database simultaneously. The logical unit of work is the source or target homonym: when a lexicographer requests to update a homonym, this homonym, together with its senses, is locked for other users.</Paragraph>
    <Paragraph position="8"> If a new entry contains blanks, a multiword menu is called where the multiword is split up into its components. For each component, the following lexical information is gathered: the part of speech, whether the word may inflect within the multiword, and whether a phrase can be inserted between one multiword component and the previous one without doing away with idiomaticity.</Paragraph>
    <Paragraph position="9"> If new homonyms or senses are inserted on the multiword menu as well as on other menus, default values for features are displayed. They can either be accepted or rejected and overwritten by the lexicographer 7 . The assumptions on default values for attributes of lexical information may differ according to different grammars and systems. We therefore decided to store the complete lexical information and use default values as proposals in the user interface. This approach has two advantages: on the one hand, the data in the database can be used for different applications having distinct theory-specific assumptions on defaults. On the other hand, the user of COLOLA can benefit from the economic advantages of default assumptions.</Paragraph>
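    <Paragraph> The distinction between defaults as interface proposals and complete values as stored data can be sketched as follows. This is a minimal illustration, not the COLOLA implementation: the function names, the dictionary standing in for the database, and the filler labels np_acc and dass_clause are all hypothetical; only the German direct-object default (an accusative noun phrase) is taken from the text.

```python
# Hypothetical default table; only the German direct-object default comes from the paper.
DEFAULT_FILLERS = {("de", "direct_object"): ["np_acc"]}


def propose_fillers(lang, slot_type):
    """Return a copy of the default filler list that would be shown as a proposal."""
    return list(DEFAULT_FILLERS.get((lang, slot_type), []))


def store_slot(db, sense_id, lang, slot_type, chosen=None):
    """Store the complete filler list, whether the proposal was accepted or overwritten."""
    fillers = propose_fillers(lang, slot_type) if chosen is None else list(chosen)
    db[(sense_id, slot_type)] = fillers  # the explicit value is stored, never a marker for "default"
    return fillers
```

Since the stored value is always explicit, an application with different default assumptions can read the data without knowing which default table was in force when the entry was coded.</Paragraph>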
    <Paragraph position="10"> COLOLA does extensive consistency checking of the values entered by the lexicographers. Illegal values are rejected and warning messages are displayed in situations where errors might easily occur. Although much of the consistency checking is supported</Paragraph>
    <Paragraph position="11"> by the database management system, some extensions were necessary.</Paragraph>
    <Paragraph position="12"> Further support for the lexicographers is provided by an interface to the WordSmith on-line dictionary system (cf. Byrd/Neff 1987). Several machine readable dictionaries are available, e.g. Collins German-English, English-German, Longman's Dictionary of Contemporary English, and Webster 7th Collegiate dictionary. The lexicographer can look up entries in these dictionaries during the encoding process.</Paragraph>
    <Paragraph position="13"> Furthermore, help menus are provided in which the valid values for specific features can be looked up.</Paragraph>
    <Paragraph position="14"> 7 Default values are provided, for instance, for slot fillers. German direct object slots get an accusative noun phrase as the default filler. The lexicographers may accept this, add other fillers or overwrite it with another filler. 2.3. DB_TO_LMT A conversion program DB_TO_LMT has been developed which extracts lexical information stored in the relations of LOLA-DB and converts it into LMT-format. DB_TO_LMT consists of two components: □ a database extractor and □ a conversion program. The database extractor selects the source entries and the corresponding target entries and stores them in database format. This format can be regarded as an intermediate representation between database scheme and LMT-format. It consists of a set of Prolog predicates which correspond to the relations of the database scheme. There are, for instance, entry, homonym, sense, and slot predicates which correspond to the entry, homonym, sense, and slot relations in the database. The conversion program finally converts the database format into the LMT-format. It has to be adapted according to the changes or extensions of the LMT-format.</Paragraph>
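    <Paragraph> The intermediate database format, one Prolog predicate per database relation, can be pictured with the following sketch. It is an assumption-laden illustration: the paper does not give the argument structure of the predicates, so the row layouts, the helper names prolog_atom, to_fact and extract, and the quoting convention are all hypothetical.

```python
def prolog_atom(value):
    """Quote a value as a Prolog atom; integers are passed through unchanged."""
    if isinstance(value, int):
        return str(value)
    escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def to_fact(relation, row):
    """Render one database row as a fact whose functor is the relation name."""
    return relation + "(" + ", ".join(prolog_atom(v) for v in row) + ")."


def extract(tables):
    """tables maps a relation name to its rows; one fact is produced per row."""
    return [to_fact(name, row) for name, rows in tables.items() for row in rows]
```

For example, a hypothetical entry row (1, "gefallen") would be rendered as the fact entry(1, 'gefallen'). in this sketch. Keeping this step free of LMT-specific knowledge means that only the second component, the conversion program, has to follow changes in the LMT-format.</Paragraph>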
    <Paragraph position="15"> 2.4. LMT_TO_DB Before and during LOLA design and development, LMT lexicons in ELF were already created and updated in files. Since these lexicons still need updating and since this is much better supported by LOLA, a conversion program LMT_TO_DB was needed which converts ELF entries of lexicon files into the database format and loads them into LOLA-DB. LMT_TO_DB consists of three components: □ the lexicon compiler of LMT, □ a conversion component, and □ a database loader.</Paragraph>
    <Paragraph position="16"> The lexicon compiler is the component of the LMT system which converts the ELF into the internal LMT format 8 . In the internal format all abbreviation conventions and default assumptions are already interpreted and expanded accordingly so that the complete lexical information is represented explicitly. The conversion component then converts the internal LMT-format to database format. The database loader generates the SQL-statements and updates the database. It has to check first whether the homonym or sense to be inserted is identical with a homonym or sense stored in the database.</Paragraph>
    <Paragraph position="17"> If all the features of two homonyms or senses can be unified, they are regarded as identical and the already existing entry is merged with the converted entry. In all other cases the homonym or sense is inserted into the database and merging has to be done by the lexicographers with COLOLA.</Paragraph>
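    <Paragraph> The merge-or-insert decision of the database loader can be sketched as follows. This is a simplified illustration rather than the loader itself: it assumes that a sense is a flat dictionary of atomic feature values, that a feature missing on one side counts as unspecified, and that the candidates compared already belong to the same lemma; the function names unify and load_sense are hypothetical.

```python
def unify(a, b):
    """Return the merged feature dict if a and b agree on every shared feature, else None."""
    for key, value in b.items():
        if key in a and a[key] != value:
            return None  # conflicting values: the two descriptions cannot be unified
    return {**a, **b}


def load_sense(stored, incoming):
    """Merge incoming into the first unifiable stored sense, or append it as a new one."""
    for i, existing in enumerate(stored):
        merged = unify(existing, incoming)
        if merged is not None:
            stored[i] = merged
            return "merged"
    stored.append(dict(incoming))
    return "inserted"
```

In this sketch an entry that fails to unify is simply stored alongside the existing ones, which mirrors the paper's decision to leave any further merging to the lexicographers.</Paragraph>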
    <Paragraph position="18"> 2.5. LDB_TO_DB To supplement the lexical coverage of the LMT system, a dictionary access module has been developed which allows real-time access (cf.</Paragraph>
    <Paragraph position="19"> Neff/McCord 1990) to Collins bilingual dictionaries available as lexical data bases (LDBs) 9 . The module includes a language pair independent shell component COLLXY and language-specific components and converts the lexical data of the LDB into the LMT-format. LDB_TO_DB is based on these programs. It consists of □ a pattern matching component, □ a restructuring component, □ a conversion component, and □ the database loader of LMT_TO_DB.</Paragraph>
    <Paragraph position="20"> With the pattern matching component, those features (sub-trees) that are to be converted are selected from the dictionary entries. In printed dictionaries, features common to more than one sub-tree are often</Paragraph>
    <Paragraph position="21"> factored out in order to save space. With the restructuring component, those features can be moved to the sub-trees they logically belong to.</Paragraph>
    <Paragraph position="22"> The conversion component converts the restructured dictionary entry to database format. The database loader of LMT_TO_DB merges the entry with a possibly already existing one in LOLA-DB and generates the SQL-statements to update the database. The converted entries can be revised by the lexicographers with COLOLA.</Paragraph>
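    <Paragraph> The restructuring step, moving factored-out features back to the sub-trees they belong to, can be sketched on a toy tree. The representation is an assumption made for illustration: each node is a dictionary with a features mapping and a children list, features are pushed down to the leaves, and the function name push_down is hypothetical; the actual LDB trees and the restructuring rules are not specified in the paper.

```python
def push_down(node, inherited=None):
    """Return a copy of node in which each leaf carries the features factored out above it."""
    features = dict(inherited or {})
    features.update(node.get("features", {}))  # local values override inherited ones
    children = node.get("children", [])
    if not children:
        return {"features": features, "children": []}
    return {"features": {}, "children": [push_down(child, features) for child in children]}
```

For a hypothetical entry whose part of speech is stated once at the top and whose two translations sit in separate sub-trees, each translation sub-tree ends up carrying its own copy of the part of speech, so that it can be converted to database format independently of its siblings.</Paragraph>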
    <Paragraph position="23">  3. Reusability of the LOLA system</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Reusability of the tool components
</SectionTitle>
      <Paragraph position="0"> The first LOLA prototype was developed to support lexicon development for the language pair German-English. In the meantime, work has been started to make the tool usable for lexicon development of the English-Danish and English-Spanish LMT systems. As a positive result of the design principles described in section 2.1, the database scheme had to be modified only slightly with regard to prototype-specific differences 10 . The values for language-specific attributes such as types of slots and fillers will be defined for the "new" languages Spanish and Danish and will be stored in the database. They can then be used for consistency checking (only defined values can be updated in the database). In COLOLA we had to take into account the homonym level on the target side, where the features of Spanish and Danish morphology have to be specified. The programs that convert the database entries into the format of the application lexicons and vice versa (DB_TO_LMT and LMT_TO_DB) need generalization in order to achieve an abstraction from prototype-specific features of LMT.</Paragraph>
      <Paragraph position="1"> 8 In the morpho-lexical processing and compiling phase, ELF entries are converted into an internal format (cf. McCord (forthcoming): sect. 2) which represents the initial source and transfer analysis of an individual input word string. 9 An LDB provides a tree representation of the hierarchical structure of the dictionary entries. The nodes of the tree are labeled with attributes having specific values for each individual entry. The LDB can be queried with the specialized query language LQL (cf. Neff/Byrd/Rizk 1988).</Paragraph>
      <Paragraph position="2"> 10 English-Danish and English-Spanish use lexicon-driven morphology for the target languages Spanish (cf. Rimon et al. 1991) and Danish, whereas German-English uses a rule-based target morphology for English (cf. McCord/Wolff 1988).</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Reusability of the lexical data
</SectionTitle>
      <Paragraph position="0"> In order to meet the requirement of data independence, the representation of lexical entries in the database is highly independent of that in the application lexicon. In the database, the description of linguistic entities and their interrelations is given in a set of tables where specific values are stored for the characteristic attributes of each individual entity.</Paragraph>
      <Paragraph position="1"> On these tables, different views can be defined for different types of users. Different programs (like DB_TO_LMT) can extract exactly the attribute values needed for their respective application and convert them into each given format. This way, from one and the same data base several lexicons can be generated, in which the same "linguistic world" is structured differently or represented in a completely different way. The possibilities of reusability are naturally defined and limited by the number of the registered types of lexical information in the original data base. As far as the LOLA database is concerned, the very detailed description of slot frames as well as the information about multiwords and the properties of their components may be reused for other NLP applications with one of the languages involved. The reusability of the transfer information (specified in the transfer relations between the languages of a given language pair) for other MT systems depends highly on the respective MT approach. As to the question of reusability of the data in the LMT system "family", three different cases have to be distinguished: ... language Y is reused for another language pair having Y as source language.</Paragraph>
      <Paragraph position="2"> In the first two cases, reusability of the lexical data of language X is very high. In the third case, the description of Y as source language may have to be more detailed in order to achieve an adequate syntactic analysis 11 . New attributes or even new types of entities or relationships may be needed and the database scheme will have to be enhanced accordingly.</Paragraph>
    </Section>
  </Section>
</Paper>