<?xml version="1.0" standalone="yes"?> <Paper uid="C73-1027"> <Title>AN ENGLISH DICTIONARY FOR COMPUTERIZED SYNTACTIC AND SEMANTIC PROCESSING SYSTEMS</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2. LEXICAL INFORMATION </SectionTitle> <Paragraph position="0"> Until recently much of the interesting research in lexicology has been carried out in the Soviet Union. The Soviets have long been concerned with automated language processing and attribute the lack of success at this task to the lack of a sophisticated lexical theory. Dictionaries are quite inadequate in giving us insight into the nature of words. There is no way, for example, that one could learn a language using a dictionary. In addition, definitions in dictionaries are circular: every word is defined in terms of every other word (actually approximately 50% of the vocabulary appears in the definitions (JOHN OLNEY, personal communication)).</Paragraph> <Paragraph position="1"> Some of the most innovative research in Soviet lexicology has been carried out by Žolkovsky, Mel'čuk and Apresjan (cf. A. K. ŽOLKOVSKY and I. A. MEL'ČUK, 1970, Yu. D. APRESJAN, 1967, and Yu. D. APRESJAN, I. A. MEL'ČUK and A. D. ŽOLKOVSKY, 1969). Initially, they felt that the detailed syntactic properties of a word composed its meaning - in a structural rather than substantive sense. Their approach was first to classify words using grammatical criteria; for example, Apresjan classified verbs as being able to undergo the passive transformation or as being able or unable to take a complementary infinitive, accusative objects or locative adverbial phrases.</Paragraph> <Paragraph position="3"/> </Section> <Section position="4" start_page="0" end_page="20" type="metho"> <SectionTitle> AN ENGLISH DICTIONARY FOR COMPUTERIZED SYNTACTIC 305 </SectionTitle> <Paragraph position="0"> Their theory is essentially a structuralist one. 
In one of their studies they proposed a revision of the notion &quot;word field&quot;. That is, their lexico-structural analysis begins with an enumeration of the phrase types of a language, revealed by syntactic analysis; an indication of the frequency of each of the structural patterns; and finally an enumeration of the word meanings found in each pattern.</Paragraph> <Paragraph position="1"> In Yu. D. APRESJAN, I. A. MEL'ČUK and A. D. ŽOLKOVSKY (1969) and elsewhere they propose that a dictionary which displays &quot;the process of text generation as an integral succession of steps&quot; be constructed. They state that the dictionary should be based on the following principle: ... it must be fully sufficient for a smooth, idiomatic and flexible expression of a given meaning; that is to say, it must display in an explicit and logical form whatever information may be necessary for the correct choice and usage of words and phrases to convey a given idea in a speech context.</Paragraph> <Paragraph position="2"> The proposed dictionary is &quot;combinatory&quot; because &quot;it is primarily intended to display the combinatorial properties of words.&quot; It is &quot;explanatory&quot; because the syntactic government patterns are semantically interpreted with the goal of providing idiomatic expression of any given meaning.</Paragraph> <Paragraph position="3"> The typical entry in their dictionary would have the following format: a) Entry word b) Morphological information c) Definition d) Syntactic potential of word e) Regular lexical functions f) Non-regular lexical functions g) The &quot;lexical universe&quot; of entry h) Examples i) Phraseology (idiomatic expressions) j) Discrimination of synonyms and near-synonyms.</Paragraph> <Paragraph position="4"> Concerning the definition (c), they specify that it not be circular. 
They state that &quot;if this requirement is met, all the definitions will in the long run be reduced to a small number of indefinable units of meaning (elementary meanings).&quot; (This is the same goal as the UCLA lexicography project.) An example of this can be found in McCawley's work with lexical atoms (J. MCCAWLEY, 1968). That is, redden has the semantic atoms &quot;cause to come to be red.&quot; It is important to note that the definition of a word should be an exact paraphrase of the word using these semantic atoms.</Paragraph> <Paragraph position="5"> The notion of lexical functions is the principal innovation of their dictionary. Lexical functions involve establishing relationships between words. Examples from Yu. D. APRESJAN, I. A. MEL'ČUK and A. D. ŽOLKOVSKY (1969) are the following: ... The &quot;lexical universe&quot; of a word is &quot;an informal description of a sufficiently broad piece of reality including the given situation as a constituent element.&quot; For example, the lexical universe of student would include such lexical items as books, classes, college, instructor, study, exam and so on.</Paragraph> <Paragraph position="6"> Finally, description of near synonyms would involve a listing of all words connected to a lexical item by connotations. Connotations involve, of course, literary and emotional overtones of words. A terrorist is, for example, a guerrilla whose cause we have emotional disagreement with.</Paragraph> <Paragraph position="7"> Their notion of syntactic potential corresponds somewhat to Fillmore's case frames. That is, for Fillmore, a dictionary must specify the case potential of words. 
For example, in the sentences (1) a) John hit the ball with a bat.</Paragraph> <Paragraph position="8"> b) The bat hit the ball.</Paragraph> <Paragraph position="9"> c) John hit the window with the ball.</Paragraph> <Paragraph position="10"> d) *John hit the window with the ball with the bat.</Paragraph> <Paragraph position="11"> e) *The window hit.</Paragraph> <Paragraph position="12"> (2) a) John broke the window with the ball.</Paragraph> <Paragraph position="13"> b) The ball broke the window.</Paragraph> <Paragraph position="14"> c) The window broke.</Paragraph> <Paragraph position="15"> d) The ball broke the window.</Paragraph> <Paragraph position="16"> both hit and broke can have agents as subject. Notice also that in the case of hit the object always remains after the verb, but broke allows the object to be the subject. Both verbs allow the instrument to be the subject. And all of this information comes under &quot;syntactic potential.&quot; Fillmore's current (C. J. FILLMORE, 1970) &quot;cases&quot; are agent, experiencer, instrument, object, source, goal, place, time, and extent. The syntactic potential of a word (in the sense of APRESJAN et al., 1969) determines the case of a lexical item (and the case frame of a verb). For example, in the sentences (3) a) *Personally, I'm sixty-five.</Paragraph> <Paragraph position="17"> b) Personally, I'm happy.</Paragraph> <Paragraph position="18"> the reason for the non-bizarreness of (b) is that the subject of be happy must be an &quot;experiencer&quot;. On the other hand the verbal be warm can have an experiencer, object, instrument, place, or time as its subject: (4) a) Algernon is warm.</Paragraph> <Paragraph position="19"> b) The rock is warm.</Paragraph> <Paragraph position="20"> c) The coat is warm.</Paragraph> <Paragraph position="21"> d) Texas is warm.</Paragraph> <Paragraph position="22"> e) Summers are warm.</Paragraph> <Paragraph position="23"> In particular, C. J. 
FILLMORE (1969, p. 109) feels that the lexicon must make accessible to the user (i) the nature of the deep-structure syntactic environments into which the item may be inserted; (ii) the properties of the item to which the rules of grammar are sensitive; (iii) for an item that can be used as a &quot;predicate&quot;, the number of &quot;arguments&quot; that it conceptually requires; (iv) the role(s) which each argument plays in the situation which the item, as predicate, can be used to indicate; (v) the presuppositions or &quot;happiness conditions&quot; for the use of the item, the conditions which must be satisfied in order for the item to be used &quot;aptly&quot;; (vi) the nature of the conceptual or morphological relatedness of the item to other items in the lexicon; (vii) its meaning; and (viii) the phonological or orthographic shapes which the item assumes under given grammatical conditions.</Paragraph> <Paragraph position="24"> Although dictionaries are the most popular way to define words, there are other ways than dictionaries for specifying the meanings of a word within a certain lexical system. (It should be clear that we are not using the terms 'meaning' and 'definition' synonymously.) For example, U. WEINREICH (1963), in his review of Soviet semantic research, speaks of three ways of specifying word-meanings:</Paragraph> </Section> <Section position="5" start_page="20" end_page="20" type="metho"> <SectionTitle> AN ENGLISH DICTIONARY FOR COMPUTERIZED SYNTACTIC 309 </SectionTitle> <Paragraph position="0"> 1) by lexicographic definition (like the dictionary); 2) by locating the lexical item in a synonym system; 3) by establishing the syntactic properties of the lexical items. Point (1) has been discussed above. As for point (2) M. 
MINSKY (1968) has a few interesting comments on the possibility of constructing a thesaurus-like dictionary (which would be, in effect, a synonym dictionary): My thesis is simply that we must not try to evade the 'thesaurus problem' just because we (rightly) can never be satisfied with any particular thesaurus. We must still learn how to build them, and find ways to make machines first to use them, then to modify them, and eventually to build for themselves new and better ones (p. 27).</Paragraph> <Paragraph position="1"> There has been much recent research in current linguistic theory with respect to Weinreich's third way of analyzing a terminological system, by syntactic characterization of words. The J. FRIEDMAN (1971) computerized lexicon included information about the types of transformations that a word can undergo as well as some rudimentary semantic information (in the form of features). Other information that has not been included in computerized systems to any great extent includes such notions as &quot;factivity&quot; (as defined by P. KIPARSKY and C. KIPARSKY, 1970) and notions of &quot;genericity&quot; and &quot;specificity&quot; (as discussed in R. JACKENDOFF, 1973). Another important syntactic development that has found its way into lexical systems is &quot;case structure&quot; as mentioned earlier and as elaborated in C. J. FILLMORE (1968, 1969, 1971), R. P.</Paragraph> <Paragraph position="2"> STOCKWELL et al. (1973). Most of these interesting and important facts of language have not been incorporated into computerized or standard dictionaries.</Paragraph> <Paragraph position="3"> An additional type of information to be included in a lexicon should be the non-discrete syntactic and semantic features proposed by Ross and by Lakoff. Both linguists, working in syntax and semantics, respectively, have discovered variable acceptability of syntactic and semantic features within a given structure. 
Lakoff proposes to account for this variable strength probabilistically, basing his research on results from the theory of fuzzy sets. In our work on interactive lexicon construction, we have found a variation in responses due, we presume, to regional, social, psychological and perhaps chronological differences. This probabilistic information, measured in response time, should also be included in a lexicon as information pertinent to utterance understanding and production.</Paragraph> <Paragraph position="4"/> </Section> <Section position="6" start_page="20" end_page="20" type="metho"> <SectionTitle> 3. CONTENTS OF THE LEXICON </SectionTitle> <Paragraph position="0"> The purpose of this section is to describe in specific detail what our dictionary will look like and how we plan to incorporate the data discussed in the previous section.</Paragraph> <Paragraph position="1"> First, we propose to tag the following syntactico-semantic information, which we assume to be crucial, on nouns, verbs, adjectives, and adverbs. For every lexical entry in each part of speech we will record: 1) Entry word.</Paragraph> <Paragraph position="2"> 2) Part of speech.</Paragraph> <Paragraph position="3"> 3) Semantic field.</Paragraph> <Paragraph position="4"> 4) Dictionary definition.</Paragraph> <Paragraph position="5"> 5) Irregular inflectional morphology.</Paragraph> <Paragraph position="6"> 6) Derivational morphology. (Prefixed forms are relatively easily retrievable from the hyphenated form of the word in the dictionary with a table of prefixes. Suffixed forms can be retrieved for productive suffixes by checking the ending against a list of suffixes including the combining forms recorded in Webster's. The purpose of this will in part be to be able to relate lexical entries from the same root.) 
7) Synonyms including synonymous cross-references (available from NIH research group) plus annotations from synonym paragraphs in the Webster's Dictionary. Suffixed forms are retrievable in part from run-on entries with notation as to source and target parts of speech.</Paragraph> <Paragraph position="7"> 8) Antonyms when available.</Paragraph> <Paragraph position="8"> (1, 2, 3, 5 and 7 are available directly from Webster's; 4 and 6 are available in part from the derived data sets from the Lexicography Project users group.) 9) Example of use for each definition under a traditional main entry available from the Brown English Corpus.</Paragraph> <Paragraph position="9"> 10) Response time for sentences by informant and averaged by sentence over all informants.</Paragraph> <Paragraph position="10"> 11) Informant data, available from informant, including region, class, sex, age, race and economic status.</Paragraph> <Paragraph position="11"> In addition we will record information peculiar to each part of speech: For nouns (to be derived from the defining formula whenever possible, otherwise interactively and by hand):</Paragraph> </Section> <Section position="7" start_page="20" end_page="20" type="metho"> <SectionTitle> 1) The following syntactico-semantic features: </SectionTitle> <Paragraph position="0"> ± human, ± animate, ± count, ± concrete, ± male, ± female.</Paragraph> <Paragraph position="1"> Also, the following non-binary features which could be treated as a property list or as a set of functions in the sense of S. MARX (1972): used as an instrument, indication of quantity or degree, movable, prolonged, separable, color, and shape. These were posited on the basis of the defining formulae in Webster's.</Paragraph> <Paragraph position="2"> 2) Case markings.</Paragraph> <Paragraph position="3"> 3) Metaphorical extension. 
(We may find that this category, as well as others, is derivable from other information, but at the moment it isn't clear and so this information is being listed separately.) 4) Sociolinguistic restrictions on use of the entry.</Paragraph> <Paragraph position="4"> For verbs: 1) Complementizers.</Paragraph> <Paragraph position="5"> 2) Subcategorization.</Paragraph> <Paragraph position="6"> 3) Defining verb, that is, the verb, if present, used in defining the entry, e.g. be, become, come, have, make, etc. These may be relatable to McCawley's interpretation of kill as &quot;to cause to become not alive,&quot; and to our notion of semantic field discussed below.</Paragraph> <Paragraph position="7"> 4) Selectional features related to noun features such as animate subject. 5) Presuppositions and their differences from synonyms of the entry.</Paragraph> <Paragraph position="8"> 6) Case structure (number and type of arguments). For adverbs: 1) Type: time, manner, location, direction, degree.</Paragraph> <Paragraph position="9"> (Much of this can be gotten from the defining formulae.) 2) Position sensitivity: subject-oriented, speaker-oriented, verb-oriented, or sentence-oriented.</Paragraph> <Paragraph position="10"> For adjectives: 1) The kind of noun it can or must modify, e.g. animate, concrete, count; and the manner in which it modifies (e.g. warm stove, warm coat) and whether it is a relative term (hot/cold) or absolute (black/white).</Paragraph> <Paragraph position="11"> 2) Semantic properties/functions: color, time, location, size, and quality. (These are disjunct sets.) An example entry: 1) Entry Word: feel 2) Part of Speech: verb 3) Dictionary Definition: to touch in order to have a tactile sensation. 4) Irregular Inflectional Morphology: felt 5) Derivational Morphology: feeler 6) Synonyms: touch 7) Antonyms: to be numb 8) Example of Use: John felt the surface of the table. 
9) Response Time for Acceptability of the Sentence: I am feeling the table: 3 seconds, negative response. 10) Informant Data: male student, age 24, northeast. 11) Complementizers: none of the regular complementizers can be used with the verb to feel under the above definition. Notice that if the that complementizer is used this indicates a change of definition: John felt that the treatments were too painful.</Paragraph> <Paragraph position="12"> 12) Subcategorization: + Transitive, ± Stative (- stative when aware of texture) 13) Defining Verb: none (implication: word defines semantic field).</Paragraph> <Paragraph position="13"> 14) Selectional Restrictions: + Human Subject.</Paragraph> <Paragraph position="14"> 15) Presuppositions: Instrument is part of Agent's body. 16) Case Structure: [A, O, (I)] v [E, O, LOC]</Paragraph> </Section> <Section position="8" start_page="20" end_page="20" type="metho"> <SectionTitle> 4. METHODOLOGY </SectionTitle> <Paragraph position="0"> The plan for the dictionary is to produce a core English lexicon consisting of the 20,000 most frequent words listed in H. KUČERA and W. N. FRANCIS (1967). The reason for choosing these is that in theory they account for 98% of the words in running text.</Paragraph> <Paragraph position="1"> As described in section 3 we have a very good idea of what to include in the lexicon, although this must obviously be left open-ended. There are problems of division of labor, however: that is, how can we most efficiently capture the information that we want to include? We have narrowed the various possible ways down to three: 1) by hand (including a real time text editing scheme); 2) interactively; 3) by automated processing of a standard dictionary.</Paragraph> <Paragraph position="2"> Method (1) is obvious. As for method (2) Olney (J. OLNEY, D. RAMSEY, 1973, p. 
16) says, &quot;what better source than the disambiguated parsed [= formatted] transcripts of W7 and MPD [The Merriam Pocket Dictionary, which is also on tape] is there likely to be in the near future for obtaining semantic data pertaining to the English vocabulary as a whole?&quot; We feel that there is a better source, at least for the kinds of information that we are interested in, and that is the native speaker of English. R. N. SMITH (1972) describes a way of obtaining this data interactively (in a system which has been described by R. L. WIDMANN (1972, p. 9) as &quot;one of the most successful projects currently under way&quot;) and the reader should consult that work for details.</Paragraph> <Paragraph position="3"> As to method (3) we have been influenced by the work of one of the largest and potentially most successful groups involved in automating the process of lexicon construction from standard dictionaries, viz., the user's group emanating from the Lexicographic Project headed by JOHN OLNEY of the Institute of Library Research at the University of California, Los Angeles, in collaboration with System Development Corporation. This project began in July 1966 with the initiation of transcribing Merriam-Webster's Seventh Collegiate Dictionary in computer processable form. Since then collaboration with over 30 researchers at various institutions has led to the creation of approximately 50 data sets derived from the dictionary transcript. A few of the data sets have been used in disambiguating the entries in the dictionary - the principal first goal of this philosophically, rather than linguistically, oriented project. Some of this has been relatively successful, but based on the scope and the methods used, it is clear that a great deal more time and effort will still have to be expended.</Paragraph> <Paragraph position="4"> Some of the already existent derived data sets are useful. 
The group at SDC has formatted the original transcript of Webster's Seventh so that the main entry, the etymology, the pronunciation, etc. are all put into a fixed format of card image records where the first character of each record specifies the type of information recorded, e.g. whether the record is one of the words used in the definition of a main entry. All of the subsequent data sets have been derived from this formatted version. One of these is an alphabetized list of the first 86 characters of all definitions separately and by part of speech. In addition all synonymous cross-references have been extracted, alphabetized on the main entry form and on the word referred to. Also, there are various suffixal data sets used to help correlate suffixes with definitions.</Paragraph> <Paragraph position="5"> Samples of print-out for sorted definitions within part of speech and end-alphabetized within part of speech are appended. The former has been especially productive, giving us a good deal of insight into so-called defining formulae, and these defining formulae have in turn allowed us to posit certain features which can be extracted directly from the definitions. These defining formulae will be used in extracting some of the features from Webster's. (Some features such as &quot;+ human&quot; cannot be extracted automatically, except by listing, by the interactive scheme described above, or by some inferential scheme.) We have also constructed a KWIC concordance for a portion of the data on non-function words in the definitions which will lead to short-cuts for syntactic-semantic tagging.</Paragraph> </Section> <Section position="9" start_page="20" end_page="20" type="metho"> <SectionTitle> 5. 
STRUCTURING THE DATA </SectionTitle> <Paragraph position="0"> The innovation that we propose to implement in this computerized dictionary that will allow us to structure and store all of the information discussed above efficiently and accurately is that of the &quot;semantic field.&quot; The theory of semantic fields is not new; what is new is the use of this concept to structure semantic information. Its most appealing characteristic is that it eliminates the need for redundant information (the problem with the feature approach which is widely used) and it makes retrieval much more efficient. First we will discuss the motivation for such a system as a model for semantic structure.</Paragraph> <Paragraph position="1"> Some of the most interesting empirical evidence for semantic fields has been in work done by Marshall and Newcombe in psycholinguistics and by Whitaker, Kehoe, Schnitzer and others in neuro-linguistics.</Paragraph> <Paragraph position="2"> H. A. WHITAKER (1971) has described the remarkable correspondence of the distinct cellular arrays in the cortex of the brain to the classical divisions of the language system: the semantic/syntactic component, the lexicon, and the phonological component.</Paragraph> <Paragraph position="3"> For example, it has been found that the lexicon has an existence apart from the syntactic-semantic (or logical) aspects of language. A case study reported by H. A. WHITAKER (1971) described a woman who was unable &quot;to initiate conversation or to demonstrate general cognitive skills - in brief, the semantic and syntactic aspects of language were totally lost. She was however, able to repeat verbal material well, ...&quot; (p. 190). Whitaker has postulated that the lexicon is a separate neural component, perhaps biochemically coded in nerve cells. That the lexicon, a separate component, is organized in some sort of semantic field arrangement was pointed out again and again by Whitaker. In work done by E. WEIGL and M. 
BIERWISCH (1970), they described errors which were the results of substitutions of words for other words from the same semantic fields; e.g., trousers for blouse, tie for cuff, bodice for cardigan, sandals for socks, peaches for oranges, bananas for figs, potatoes for vegetables. Of particular note is that the substitutions usually occur at the same taxonomic level, that is, the substitution is rarely an item for the name of the field containing the item (e.g., peaches for fruit).</Paragraph> <Paragraph position="4"> In another study, by H. GOODGLASS, B. KLEIN, P. CAREY and K.</Paragraph> <Paragraph position="5"> JONES (1966), the investigators chose words which came within the categories of objects, forms, letters, actions, numbers, colors, and body parts. They found that the patients had an easier time understanding object names than producing them, but producing letters was easier for them than understanding them.</Paragraph> <Paragraph position="6"> J. C. MARSHALL and F. NEWCOMBE (1966) reported errors such as the following: their patient read liberty as freedom, canary as parrot, abroad as overseas, entertain as entertainment, political as politician and beg as beggar. Later studies of the same patient showed that the patient had twice as much difficulty with verbs as with nouns and that adjectives were harder than nouns but easier than verbs. One of the problems encountered was the patient's tendency to read verbs as the corresponding derived nominal and to read nominals derived from adjectives as the original base form of the adjective. Words like uncle, priest and poet were harder than horse, lion, and insect. Large was read as long, short as small, tall as long, little as short.</Paragraph> <Paragraph position="7"> H. A. 
WHITAKER (1971) reports patients who read verbs as their corresponding derived nominal form: decide is read as decision, conceal as concealment, nominate as nomination, portray as portrait, bathe as bath, speak as discussion, remember as memory. Whitaker also reports that the opposite phenomenon has been found where derived forms are read as their base forms: refusal was read as refuse, darkness as dark, whiteness as white, amazement as amaze.</Paragraph> <Paragraph position="8"> Psycholinguistic and anthropological data therefore point to the reality of organization into semantic fields and success of information retrieval schemes has often been tied into a division of the semantic universe into fields. It would seem not only an obvious desideratum but a sine qua non in a dictionary to include information of semantic field.</Paragraph> <Paragraph position="9"> Once the data has been recorded so that all words are completely defined we will eliminate redundant information so that storing of the lexicon can be accomplished most economically. The elimination of redundancy will be done by means of structuring the data in a specific way. This method has been discussed in E. MAXWELL (1973).</Paragraph> <Paragraph position="10"> In effect what happens is this: the head of a semantic field (call it L) is defined in a certain way; the members of that semantic field (x1, x2, ..., xn) are defined in relation to L. All the information that need be specified to define x1, etc. is that information that is unique to them.</Paragraph> <Paragraph position="11"> For example, there is the semantic field (described in C. J. FILLMORE, 1971) made up of the verbs: judge, accuse, blame, scold, forgive, etc. All of the verbs are verbs of judging (which is the name of the semantic field.) 
They are uniquely defined in terms of their presuppositions (i.e.</Paragraph> <Paragraph position="12"> accuse presupposes that the action done is bad). Therefore, by defining judge and by saying that accuse, etc. are kinds of judging except for their presuppositions all redundant information can be deleted and the specific definitions can be derived with inferential schemata.</Paragraph> <Paragraph position="13"> An example of how the information would be stored is the following (using the word boil):</Paragraph> <Paragraph position="15"/> </Section> <Section position="10" start_page="20" end_page="20" type="metho"> <SectionTitle> SF ('COOK'), AGENT (HUMAN), OBJECT (EDIBLE/POTABLE), PLACE (HEATED), INSTR. (WATER) </SectionTitle> <Paragraph position="0"> The partial description of the word boil gives the following information: that it is a member of the semantic field &quot;cook&quot;; that the agent must be a member of the semantic field &quot;human&quot;; that the thing boiled must be edible or potable; that the place the boiling is done must be heated (actually this information is redundant since the place for cooking must also be heated); and the instrument in which the boiling is done must be water. The symbol xx means that the object can be subject if no agent is stated: Alice boiled the eggs.</Paragraph> <Paragraph position="1"> The eggs boiled quickly.</Paragraph> <Paragraph position="2"> The parentheses around the operators mean that the choice of place and instrument is optional.</Paragraph> <Paragraph position="3"> Using this model we can state relationships between derivational morphemes and nominalizations that have not as yet been stated in computerized lexicons. (Reliable is passively related to rely: &quot;able to be relied on&quot;; while comfortable is actively related to comfort: &quot;able to comfort&quot;).</Paragraph> <Paragraph position="4"> SUMMARY. 
Our purpose is to construct a 20,000-word core dictionary of English to be used in computerized systems that use natural language. It is to include as much syntactico-semantic information as necessary to be used in most current theoretical frameworks both in sentence recognition and production as well as for linguistic studies of English syntax and semantics.</Paragraph> <Paragraph position="5"> We eventually would like to parse the definitions so that this information can be put in some formal notation and used for further dictionary organization, but we feel at the moment that our core-English dictionary must be a prerequisite to any such definition parsing (cf. O. WERNER, 1972 for a model to account for taxonomic relations derivable from definitions).</Paragraph> </Section> <Section position="11" start_page="20" end_page="20" type="metho"> <SectionTitle> FROM ONE MORE OF A KIND RELATED TO OR. RESEMBLING ANOTHER KIND THAT IS USU </SectionTitle> <Paragraph position="0"/> </Section> </Paper>