<?xml version="1.0" standalone="yes"?> <Paper uid="J90-3002"> <Title>AN EDITOR FOR THE EXPLANATORY AND COMBINATORY DICTIONARY OF CONTEMPORARY FRENCH (DECFC)</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 INTRODUCTION (WHAT IS THE DECFC?) </SectionTitle> <Paragraph position="0"> The Dictionnaire Explicatif et Combinatoire du Français Contemporain (DECFC) is an attempt to provide a formally complete and adequate description of the French lexicon. It is based on the &quot;Meaning-Text&quot; theory (Mel'čuk 1973), which was the source of several projects in natural language processing and especially in automatic translation. One of the most important principles of the DECFC is that the greater part of the information needed to describe a natural language should be compiled within the dictionary. This is in contrast with the current practice of giving preference to grammars.</Paragraph> <Paragraph position="1"> Far from being a modest and secondary appendix to a good grammar, the dictionary becomes the main (in effect, only) basis of all grammars and, in general, of all linguistic descriptions (Mel'čuk 1973).</Paragraph> <Paragraph position="2"> As the dictionary is used as the basis of linguistic description, it becomes a very complex database for different types of information with many links and constraints. A well-defined methodology is needed to build such a dictionary; adequate computerized tools are also needed, otherwise the task becomes almost impossible. Our editor is an attempt to provide such a tool.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 1.1 OVERVIEW OF THE MEANING-TEXT THEORY </SectionTitle> <Paragraph position="0"> To really understand the goal of the DECFC, it is important to put it in the perspective of the Meaning-Text Theory (MTT) for which it is the foundation. We now briefly sketch the MTT and show its implications for the dictionary.
A comprehensive presentation can be found in Mel'čuk (1973); a more computer science-oriented view is given by Boyer and Lapalme (1984), where it is used as the basis of a system for generating paraphrases. Mel'čuk and Polguère (1987) describe the formal approach that underlies the construction of the DECFC.</Paragraph> <Paragraph position="1"> As stated by Mel'čuk, the purpose of the MTT &quot;consists in establishing correspondences between any given meaning and (ideally) all synonymous texts having this meaning.&quot; The MTT is essentially descriptive and is not concerned with procedures for moving from meanings to texts and vice versa. In MTT, an utterance u is represented at seven levels: * the Sem(antic)R(epresentation), which is a linguistic object, an utterance in a pictorial language. Its role is to represent a class of synonymous sentences, stripping them of all their syntactic information. A semantic graph is a connected directed graph with labeled nodes and arcs. The node labels are either predicates or names of objects. The arc labels are integers; the arc labeled i leads to the ith argument of the predicate.</Paragraph> <Paragraph position="2"> * the D(eep-)Synt(actic)R(epresentation) is a tree whose nodes are labeled with &quot;meaningful lexemes&quot; of u.
* the S(urface-)Synt(actic)R(epresentation) is also a tree, but its nodes are labeled with all actual lexemes of u.</Paragraph> <Paragraph position="3"> * MTT also introduces the Deep and Surface representations for morphology and phonology.</Paragraph> <Paragraph position="4"> Computational Linguistics Volume 16, Number 3, September 1990 145 Michel Décary and Guy Lapalme An Editor for the DECFC For example, consider the following simple network, [Figure: semantic network with the nodes many, change, sale, y, z, and 20; arc 1 of change leads to sale] which represents the sentence &quot;The sales have increased by 20 units.&quot; This network is incomplete because there is no indication about the time of the action and about the determination of the sales (do we know exactly what sales we are talking about?). To transform a SemR into a DSyntR, we have to cover all the nodes of the SemR with &quot;network schemata&quot; found in the DECFC and merge the corresponding trees also given by the DECFC.</Paragraph> <Paragraph position="5"> Suppose now that our dictionary contains only three definitions, each composed of a &quot;network schema,&quot; the corresponding tree, and the conditions under which the definition can apply to a network.</Paragraph> <Paragraph position="6"> The first rule indicates that if the y and z arguments are free variables and w is a positive integer, then the change(x,y,z,w) predicate can be transformed to the tree corresponding to x' increase_by w. Here x' corresponds to the x node obtained by using the dictionary definitions. In this case, the sale node gives the node (sale,name,(fem,n)). So applying rules 1), 2), and 3) we obtain the following three trees:</Paragraph> <Paragraph position="8"> Boyer and Lapalme (1984) describe a variant of the classical unification algorithm for merging these trees (where underlines indicate free variables); we get the following</Paragraph> <Paragraph position="10"> We could continue by applying similar rules to transform between the different levels of representation.
The transformation rules have to be very precise, and thus the dictionary becomes a complex database involving many relations between words. A formal approach to dictionary building is needed because the transformations are essentially automatic and data-driven (in our case dictionary-driven). Building the DECFC is an enormous task because it has to deal not only with the lexemes but also with their intricate relations. The appendix gives the full entry for the lexeme respect, from which it can be appreciated that a tool to help in writing and checking entries and relations would be very useful.</Paragraph> <Paragraph position="11"> However, the basic goal of the DECFC project is not to create a version of the dictionary that could be used by a computer program (for text analysis or generation, for instance) but rather to edit a printable version for human readers. This means that the way information is presented and edited is not always as formal as could be expected from the theory. For instance, definitions of lexemes, which are represented by semantic networks in the theory, are represented in the DECFC by French sentences derived from the network by following a set of principles (but no formal rules). Despite this, the DECFC is built by applying a systematic methodology that could in principle be programmed, the main difference being that the information is not always as explicit as it could be. A lexicologist could, for instance, retrieve the exact semantic network from a DECFC definition, but a computer could not. Furthermore, the DECFC includes some redundancies that would not be needed from a theoretical point of view. Even if there are a few differences between the DECFC and the formal lexicon of the MTT, there is a clear and direct correspondence between the two.</Paragraph> <Paragraph position="12"> Our discussion emphasizes the validations the system has to ensure and the way to implement them. We first give a description of the DECFC structure.
We then concentrate on specific problems of coherence and verification throughout the dictionary. We finally discuss the way lexicographers can interact with the system through a specialized interface to the editor. Mel'čuk and Polguère (1987) give more details about the structure of this explanatory and combinatorial dictionary. Here we give only what is relevant for our system.</Paragraph> </Section> </Section> </Paper>