<?xml version="1.0" standalone="yes"?>
<Paper uid="C88-2153">
  <Title>Machine Tractable Dictionaries as Tools and Resources for Natural Language Processing</Title>
  <Section position="3" start_page="0" end_page="750" type="metho">
    <SectionTitle>
2 Background: The Value of Machine Readable Dictionaries
</SectionTitle>
    <Paragraph position="0"> Dictionaries are language texts whose subject matter is language.</Paragraph>
    <Paragraph position="1"> The purpose of dictionaries is to provide definitions of senses of words and, in so doing, they supply knowledge about not just language, but the world. For years, researchers in computational linguistics (CL) and AI have viewed dictionaries (a) with theoretical interest as a means of investigating the semantic structure of natural language, and (b) with practical interest as a resource for overcoming the knowledge acquisition bottleneck in AI. The knowledge acquisition bottleneck has been viewed by most researchers as too hard a problem to tackle at present.</Paragraph>
    <Paragraph position="2"> However, more optimistic researchers have recently begun to seek methods to overcome it, and have had some success. This difference in attitudes regarding the knowledge acquisition bottleneck is reflected in a long-standing difference between two alternative methods of lexicon building: the demo approach and the book approach (Miller 1985; a similar distinction appears in Amsler 1982).</Paragraph>
    <Paragraph position="3"> The demo approach, which has been the dominant paradigm in natural language processing (and AI in general) for the last two decades, does not address the knowledge acquisition bottleneck. This approach is to hand-code a small but rich lexicon for a system that analyses a small number of linguistic phenomena. This is an expensive method as each entry in the lexicon is prepared individually. Every entry is constructed with foreknowledge of its intended use and hence of the knowledge it should contain. Being designed with only a specific purpose in mind, the knowledge representation runs into problems when scaled up to cover additional linguistic phenomena.</Paragraph>
    <Paragraph position="4"> The alternative, the book approach, confronts the problem of the knowledge acquisition bottleneck. This approach attempts to develop methods for transforming the knowledge within dictionaries or encyclopaedias into some format usable for CL and AI tasks, usually with the aim of covering as large a portion of the language as possible. The problem with dictionary and encyclopaedia entries is that, although they are constructed in a principled manner over many years by professional lexicographers and encyclopaedists, they are designed for human use.</Paragraph>
    <Paragraph position="5"> Recently, interest in the book approach has greatly expanded because a number of MRDs have become available, each causing a flurry of research interest, e.g., The Merriam-Webster New Pocket Dictionary (Amsler and White 1979; Amsler 1980, 1981), Webster's Seventh New Collegiate Dictionary (Evens and Smith 1983; Chodorow, Byrd, and Heidorn 1985; Markowitz, Ahlswede, and Evens 1986; Binot and Jensen 1987), and The Longman Dictionary of Contemporary English (Michiels, Mullenders, and Noel 1980; Michiels and Noel 1982; Walker and Amsler 1986; Boguraev, Briscoe, Carroll, Carter, and Grover 1987; Boguraev and Briscoe 1987; Wilks, Fass, Guo, McDonald, Plate, and Slator 1987).</Paragraph>
    <Paragraph position="6"> The big advantage of MRDs is that now both theoretical and practical concerns are investigable by large-scale computational methods.</Paragraph>
    <Paragraph position="7"> Some of the above research has been into the underlying semantic structure of dictionaries (e.g., Amsler and White 1979; Amsler 1980, 1981; Chodorow, Byrd, and Heidorn 1985). The remainder of the research has been seeking to develop practical large-scale methods to extract syntactic information from MRD entries (e.g., Boguraev and Briscoe 1987) and transform that information into a format suitable for other users. This latter research has the effect of transforming a MRD into a limited MTD. We use the word &amp;quot;limited&amp;quot; because such a MTD has only syntactic information presented in a format usable by others; semantic information remains buried in the MRD, though this semantic information is the knowledge about language and the world that is needed as a resource for many CL and AI tasks. The next step is therefore to develop large-scale methods to extract both the syntactic and semantic information from MRD entries and present that information as a data base of acceptable format for potential users of that information.</Paragraph>
    <Paragraph position="8"> Within the book approach there are a number of ways one can construct such a MTD. One way is to automatically extract the semantic information and build the MTD. We firmly advocate automatic extraction. A second way is to extract the semantic information manually and hand-code the entire MTD, as is being attempted in the CYC Project (Lenat, Prakash, and Shepherd 1986; Lenat and Feigenbaum 1987). The main problem with this approach is the volume of effort required: the CYC Project aims to hand-code one million entries from an encyclopaedia, which will take an estimated two person-centuries of work. We believe this is a mistaken approach because it wastes precious human resources and makes dubious theoretical assumptions, despite Lenat's claims that their work is theory free (see section 5).</Paragraph>
    <Paragraph position="9"> Whichever form of the book approach is taken, there are two sets of issues that must be faced by those developing methods for the transformation of MRDs into MTDs. One set of issues concerns the nature of the knowledge in MRDs. The second set of issues concerns the design of the database format of a MTD. Both sets of issues rest on understanding the structure and content of the knowledge that is both explicitly and implicitly encoded in dictionaries, but such understanding rests on certain key issues in semantics. We examine some of these issues in the next section.</Paragraph>
  </Section>
  <Section position="4" start_page="750" end_page="750" type="metho">
    <SectionTitle>
3 Background: The State of Semantic Theory
</SectionTitle>
    <Paragraph position="0"> There are obstacles to the development of methods (whether manual or automatic) for the transformation of the semantic information from MRDs into a MTD that have not been present for those developing methods for syntactic analysis only. The main obstacle is that, compared to syntactic theory, understanding of semantic theory is much less advanced, as shown by the lack of consensus about even some of the general underlying principles of semantics. Nevertheless there is some understanding and some local consensus on semantics that can allow work to proceed.</Paragraph>
    <Paragraph position="1"> Positions on certain basic issues in semantics affect one's stance concerning what semantic information should be extracted from a MRD and represented in a MTD. In developing our own methods for the transformation of MRDs into MTDs, we have adopted a certain approach from computational semantics. Examples of this approach are Preference Semantics (Wilks 1973, 1975a, 1975b) and Collative Semantics (Fass 1986, 1987, 1988). The main assumptions of this approach are the inescapable problem of the word sense and the inseparability of knowledge and language.</Paragraph>
    <Paragraph position="2"> Lexical ambiguity is pervasive in language: the lexical ambiguity of words has been a problem since before the advent of dictionaries and is particularly apparent when translating between languages; tasks such as translation cannot be modelled by computer without some representation of lexical ambiguity. Furthermore, lexical ambiguity is pervasive in most forms of language text, including dictionary definitions: the words used in dictionary definitions of words and their senses are themselves lexically ambiguous and must be disambiguated.</Paragraph>
    <Paragraph position="3"> We also believe that it is acceptable for a semantics to be based on the approach to lexical ambiguity taken by traditional lexicography that constructs dictionaries. The major problem with the approach comes from its arbitrariness in the selection of senses for a word. This arbitrariness appears between dictionaries in different sense segmentations of the same word. It is also observable within a single dictionary when the sense-distinctions made for the definition of a word do not match with the uses of that word in the definitions of other words in the dictionary. These problems can be overcome by methods that reconcile different sense selections of a word within and across dictionaries by extending (or reducing) the coverage of individual word senses.</Paragraph>
    <Paragraph position="4"> Our position on the inseparability of knowledge and language is that common principles underlie the semantic structure of language text and of knowledge representations, and that some kinds of language text structures are a model for knowledge structures (Wilks 1978). Examples of such knowledge structures include the planes of Quillian's Memory Model (1967, 1968), pseudo-texts from Preference Semantics, sense-frames from Collative Semantics, and integrated semantic units or ISUs (Guo 1987). Supporting evidence comes from comparisons between the semantic structure of dictionaries and the underlying organisation of knowledge representations, which have observed similarities between them (Amsler 1980; Chodorow, Byrd, and Heidorn 1985).</Paragraph>
    <Paragraph position="5"> These positions on semantics suggest the following for those engaged in transforming MRDs into MTDs. First, the problem of lexical ambiguity must be faced by any approach seeking to extract semantic information from a MRD to build a MTD. Because lexical ambiguity exists in the language of dictionary definitions and in language generally, it follows that the language in MRD definitions needs to be analysed to the word sense level and must be represented in the format of the MTD. Second, the format of the MTD, while being of principled construction, should be as language-like as possible.</Paragraph>
    <Paragraph position="6"> Next, we focus attention on some basic issues in transforming MRDs concerning the nature and accessibility of the knowledge in dictionaries.</Paragraph>
  </Section>
  <Section position="5" start_page="750" end_page="750" type="metho">
    <SectionTitle>
4 The Analysis of MRDs
</SectionTitle>
    <Paragraph position="0"> We hold that those who advocate the extraction (both manual and automatic) of semantic information from dictionaries (and even encyclopaedias) have made certain assumptions about the extent of knowledge in a dictionary, about where that knowledge is located, and how that knowledge can be extracted from the language of dictionary definitions.</Paragraph>
    <Paragraph position="1"> These are not assumptions about semantics, but rather, are assumptions about the extraction of semantic information from text. These assumptions are methodological assumptions because they underlie the decisions made in choosing one method for semantic analysis rather than another. These assumptions are about (a) sufficiency, (b) extricability, and (c) bootstrapping.</Paragraph>
    <Paragraph position="2"> Sufficiency addresses whether a dictionary is a strong enough knowledge base for English, specifically as regards the linguistic knowledge and, above all, the knowledge of the real world needed for subsequent text analysis. Sufficiency is of general concern, including hand-coding projects like CYC, where they attempt to make explicit (a) the facts and heuristics which one would need in order to understand sentences, (b) generalisations of those facts and heuristics, and (c) inferences that fill inter-sentential gaps (Lenat and Feigenbaum 1987, p.1180).</Paragraph>
    <Paragraph position="3"> Extricability is concerned with whether it is possible to specify a set of computational procedures that operate on a MRD and extract, through their operation alone and without any human intervention, general and reliable semantic information on a large scale, and in a general format suitable for, though independent of, a range of subsequent text analysis processes.</Paragraph>
    <Paragraph position="4"> Bootstrapping refers to the process of collecting the initial information that is required by a set of computational procedures that are able to extract semantic information from the sense definitions in an MRD. The initial information needed is commonly linguistic information, notably syntactic and case information, which is used during the parsing of sense-definitions into an underlying representation from which semantic information is then extracted.</Paragraph>
    <Paragraph position="5"> Bootstrapping methods can be internal or external. Internal bootstrapping methods obtain the initial information needed for their procedures from the dictionary itself and use the procedures to extract that information. This is not as circular as it may seem. A process may require information for the analysis of some sense-definition (e.g., some knowledge of the words used in the definition) and may be able to find that information elsewhere in the dictionary. By contrast, external bootstrapping methods obtain the initial information for their procedures by some method other than the use of those procedures. The initial information may be from some source external to the dictionary or may be in the dictionary but impossible to extract without the use of the very same information. For example, the word 'noun' may have a definition in a dictionary but the semantic information in that definition cannot be extracted without prior knowledge of a sentence grammar that contains knowledge of syntactic categories, including what a noun is.</Paragraph>
    <Paragraph position="6"> Those who advocate hand-coding presumably have pessimistic views about extricability and bootstrapping.</Paragraph>
  </Section>
  <Section position="6" start_page="750" end_page="750" type="metho">
    <SectionTitle>
5 The Production of MTDs
</SectionTitle>
    <Paragraph position="0"> The main issue here concerns the format that MTDs should have.</Paragraph>
    <Paragraph position="1"> One thing is clear: the format must be versatile if a variety of consumers in CL and AI are to use it. The most likely initial consumers are those that place a considerable emphasis on the availability of words,</Paragraph>
    <Paragraph position="2"> such as spelling correction, and those that already use large lexicons, such as machine translation (MT) and word processing (Amsler 1982). Within CL, two primary consumers are the semantics mentioned in section 3, Preference Semantics and Collative Semantics.</Paragraph>
    <Paragraph position="3"> These consumers need a variety of semantic information. To meet these needs MTD formats should be clean, unambiguous, preserve much of the semantic structure of natural language, and contain as much information as is feasible. However, this does not mean that the format of a MTD must consist of just a single type of representation because it is possible that different kinds of information require different types of representation. For example, two kinds of information about word use are: (a) the use of senses of words in individual dictionary sense definitions, and (b) the use of words throughout a dictionary. It is not clear that a single representation can record both (a) and (b) because (a) requires a frame-like representation of the semantic structure of sense definitions that records the distinction between genus and differentia, the subdivision of differentia into case roles, and the representation of sense ambiguity, whereas (b) requires a matrix or network-like representation of word usages that encodes the frequency of occurrence of words and the frequency of co-occurrence of combinations of words. Hence, a MTD may consist of several representations, each internally uniform.</Paragraph>
    <Paragraph position="4"> Given the arguments presented in section 3, we believe that the first of these representations should be modelled on natural language though it should be more systematic and without its ambiguity. Hence, this component representation should be as language text-like as possible and should represent word senses, whether explicitly or implicitly. Other approaches to the building of representations that contain semantic information extracted from dictionaries and encyclopaedias (e.g., Binot and Jensen 1987; Pustejovsky and Bergler 1987; CYC) separate knowledge and language and overlook the problem of the lexical ambiguity of the words in dictionary definitions (these are the underlying theoretical assumptions made by these approaches).</Paragraph>
    <Paragraph position="5"> The other form of representation can be construed as a connectionist network representation, based on either localist (e.g., Cottrell and Small 1983; Waltz and Pollack 1985) or distributed approaches to representation (e.g., Hinton, McClelland and Rumelhart 1986; St. John and McClelland 1986). Like our position on semantics, connectionism emphasises the continuity between knowledge of the language and the world and many connectionist approaches have paid special attention to representing word senses, especially the fuzzy boundaries between them (e.g., Cottrell and Small 1983; Waltz and Pollack 1985; St. John and McClelland 1986). Localist approaches assume symbolic network representations whose nodes are word senses and whose arcs are weights that indicate the relatedness of the word senses at the ends of the arcs. An interesting new approach, which we shall outline shortly in section 6.1, uses a network whose nodes are words and whose arc weights indicate co-occurrence data between words. Although this approach initially appears to be localist, it is being used to derive more distributed representations which offer ways of avoiding some serious problems inherent in localist representations. Such frequency-of-association data is not represented in standard knowledge representation schemes, is complementary to the knowledge in such schemes, and may be useful in its own right for CL tasks such as lexical ambiguity resolution and spelling correction.</Paragraph>
    <Paragraph position="6"> To summarise so far, we have outlined: (1) some basic theoretical assumptions about semantics and our position regarding those assumptions (inseparability of language and knowledge, tackling the problem of the word sense), (2) some basic methodological assumptions about the extraction of semantic information from dictionaries (sufficiency, extricability, bootstrapping), and (3) some basic theoretical assumptions regarding the format of a MTD (language-like format, inclusion of different kinds of semantic information, notably lexical ambiguity).</Paragraph>
  </Section>
  <Section position="7" start_page="750" end_page="753" type="metho">
    <SectionTitle>
6 Three Approaches to the Transformation of MRDs into MTDs
</SectionTitle>
    <Paragraph position="0"> At CRL, we are pursuing three approaches to the automatic translation of the information in The Longman Dictionary of Contemporary English (Proctor et al. 1978) into a MTD. LDOCE is a full-sized dictionary designed for learners of English as a second language that contains over 55,000 entries in book form and 41,100 entries in machine-readable form (a type-setting tape). The preparers of LDOCE claim that entries are defined using a &amp;quot;controlled&amp;quot; vocabulary of about 2,000 words and that the entries have a simple and regular syntax. We have analysed the machine-readable tape of LDOCE and found that about 2,219 words are commonly used.</Paragraph>
    <Paragraph position="1"> The three CRL approaches all fall within the general position on computational semantics outlined above and are extensions of fairly well established lines of research. All three approaches also pay special attention to their underlying methodological assumptions concerning the extraction of semantic information from dictionaries. With respect to sufficiency and extricability, all three approaches assume that dictionaries do contain sufficient knowledge for at least some CL applications, and that such knowledge is extricable. But the approaches differ over bootstrapping, i.e., over what knowledge, if any, needs to be hand-coded into an initial analysis program for extracting semantic information.</Paragraph>
    <Paragraph position="2"> The three approaches differ in the amount of knowledge they start with and the kinds of knowledge they produce. All begin with a degree of hand-coding of initial information but are largely automatic. In each case, moreover, the degree of hand-coding is related to the source and nature of semantic information sought by the approach. Approach I, a connectionist approach, uses the least hand-coding but then the co-occurrence data it generates is the simplest form of semantic information produced by any of the approaches. Approach II requires the hand-coding of a grammar and semantic patterns used by its parser, but not the hand-coding of any lexical material. This is because the approach builds up lexical material from sources wholly within LDOCE. Approach III employs the most hand-coding because it develops and builds lexical entries for a very carefully controlled defining vocabulary of 3,600 word senses (1,200 words). The payoff is that the approach will produce a MTD containing highly structured semantic information.</Paragraph>
    <Section position="1" start_page="750" end_page="752" type="sub_section">
      <SectionTitle>
6.1 Approach I: Obtaining and Using Co-Occurrence Statistics from LDOCE (Tony Plate)
</SectionTitle>
      <Paragraph position="0"> Our first approach extracts semantic information from text (specifically LDOCE) without requiring any semantic information to bootstrap it. Central to this technique is that all sentences that contain a word are used as sources of information about the use of that word, rather than just the definition of the word. This technique is based on some experimental findings that the frequency of co-occurrence of a pair of words provides a reasonable measure of the strength of the semantic relationship between them.</Paragraph>
      <Paragraph position="1"> This approach bears some resemblance to Sparck Jones's (1964) investigation into the semantic classification of the uses of words. Her underlying linguistic assumption was that the uses of words may be distinguished, described, or analyzed by the semantic relations which hold between them and the vocabulary of a language has a semantic structure determined by these relations. Of twelve possible semantic relations, synonymy was chosen as the fundamental feature of natural language.</Paragraph>
      <Paragraph position="2"> Despite some surface similarities to Sparck Jones's technique there are many differences, some of which are discussed below. First, Sparck Jones's data collection method is much more laborious than the co-occurrence method (see Wilks, Fass, Guo, McDonald, Plate, and Slator 1987).</Paragraph>
      <Paragraph position="3"> Second, Sparck Jones's method requires that the data must contain all the senses of words that need to be considered. In the co-occurrence method, it is not necessary that the text contain examples of all senses, because the sense definitions are used to provide information about the senses. The text need only use enough senses of words to define all words, but should make frequent use of the senses it does use.</Paragraph>
      <Paragraph position="4"> The approach proposed here finds much more distant and general relationships than synonymy, relationships which involve the combination of many semantic relations. Co-occurrence data for the LDOCE controlled vocabulary has been collected. This data contains nearly two-and-a-half million frequencies of co-occurrence (the triangle of a 2200 by 2200 matrix). This is too much data to examine in raw form, so we have used two techniques to convert the data into a more understandable format. We have written a program called BROWSE which can manipulate the entire co-occurrence matrix and can select groups of words based on whether the values of various probability functions pass selected thresholds. These groups of words can be manipulated as sets, and one technique we are using is to build sets of words that are either related to a certain word or to a certain sense of a word.</Paragraph>
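The co-occurrence counts described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the program used in this work: the function names `cooccurrence_counts` and `triangle_size`, the toy sentences, and the decision to count each unordered word pair once per sentence are all hypothetical. The second function checks the size of the triangle of a 2200 by 2200 matrix, taken here to include the diagonal.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences):
    """Count how often each unordered pair of distinct words shares a sentence."""
    counts = Counter()
    for sentence in sentences:
        # Use each word once per sentence; sorting gives a canonical pair order.
        words = sorted(set(sentence.lower().split()))
        for pair in combinations(words, 2):
            counts[pair] += 1
    return counts


def triangle_size(n):
    """Cells in the triangle of an n-by-n symmetric matrix, diagonal included."""
    return n * (n + 1) // 2


# Hypothetical toy sentences; real input would be the sentences of LDOCE.
toy = [
    "an instrument for measuring electric current",
    "a flow of electric current in a wire",
]
pairs = cooccurrence_counts(toy)
print(pairs[("current", "electric")])  # the pair occurs in both sentences
print(triangle_size(2200))
```

Storing only the triangle is enough because co-occurrence is symmetric: the count for a pair does not depend on the order of its two words.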
      <Paragraph position="5"> The other technique involves using BROWSE to extract submatrices which are then given to the PATHFINDER program (Schvaneveldt, Durso and Dearholt 1985). This program was designed to discover the network structure of psychological data and it reduces the total amount of information while not eliminating much of the useful information. We have applied this program to LDOCE co-occurrence data with some success; it produces sparsely connected networks which are easy to examine by eye and which appear to contain much useful world knowledge.</Paragraph>
      <Paragraph position="6"> In both formats (groups of words and PATHFINDER networks) the data is a potentially useful resource for a number of applications. Of particular interest is the possibility of sense disambiguation. To investigate this, we have written a number of processes that use the co-occurrence data. One process we are studying involves rating the coherence of particular sense assignments for sentences, based on the set overlap of the groups of words related to each of the assigned senses. Another process we are studying is how activation spreading from the nodes in a network produced by the PATHFINDER program can select the appropriate senses of words in context.</Paragraph>
      <Paragraph position="7"> The work has strong links to connectionism, and indeed we are investigating how this work can proceed within the connectionist paradigm. We are developing a theory of representation, utilisation, and learning of networks within distributed connectionist models. In addition, we have been developing a connectionist simulator for the Intel hypercube; this work is well under way (see Plate 1987).</Paragraph>
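The first of these processes, rating a sense assignment by set overlap, might look like the sketch below. Everything in it is an assumption made for illustration: the sense labels, the related-word groups, and the scoring rule (summing the sizes of pairwise intersections) are hypothetical and are not taken from the processes actually written.

```python
from itertools import combinations


def coherence(sense_assignment, related_words):
    """Sum the pairwise overlaps of the related-word sets of the chosen senses."""
    score = 0
    for first, second in combinations(sense_assignment, 2):
        score += len(related_words[first].intersection(related_words[second]))
    return score


# Hypothetical related-word groups, of the kind a tool like BROWSE might select.
related = {
    "bank_river": {"water", "river", "side", "land"},
    "bank_money": {"money", "account", "pay", "keep"},
    "deposit_money": {"money", "account", "pay", "put"},
    "deposit_earth": {"earth", "river", "land", "layer"},
}

# All sense assignments for a sentence containing 'bank' and 'deposit'.
candidates = [
    ("bank_money", "deposit_money"),
    ("bank_money", "deposit_earth"),
    ("bank_river", "deposit_money"),
    ("bank_river", "deposit_earth"),
]
best = max(candidates, key=lambda senses: coherence(senses, related))
print(best)
```

With these toy groups the financial reading scores highest because its two senses share three related words, while the mixed readings share none.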
    </Section>
    <Section position="2" start_page="752" end_page="752" type="sub_section">
      <SectionTitle>
6.2 Approach II: A Lexicon-Producer (Brian M. Slator)
</SectionTitle>
      <Paragraph position="0"> While the first approach begins with no prior knowledge needed at all, the other two approaches begin with certain kinds of external information supplied. The second approach (Slator and Wilks 1987; Slator 1988) hand-codes a grammar, some semantic patterns, and a list of the 2,219 words of the LDOCE controlled vocabulary. The approach seeks to build dictionary entries for the words of the controlled vocabulary and the other words in LDOCE using semantic information extracted not only from the dictionary entries of LDOCE, as in the other two approaches, but also from the box and pragmatic codes found on the machine readable version of LDOCE (though not in the book). The box codes use a special set of primitives such as &amp;quot;abstract,&amp;quot; &amp;quot;concrete,&amp;quot; and &amp;quot;animate,&amp;quot; organised into a type hierarchy. The primitives are used to assign type restrictions on nouns and adjectives, and type restrictions on the arguments of verbs. The pragmatic codes (also called &amp;quot;subject&amp;quot; codes but referred to here as &amp;quot;pragmatic&amp;quot; codes to avoid confusion with the grammatical subject) are another special set of primitives organised into a hierarchy. The hierarchy consists of main headings such as &amp;quot;engineering&amp;quot; and subheadings like &amp;quot;electrical.&amp;quot; The primitives are used to classify words by their subject, for example, one sense of 'current' is classified as &amp;quot;geography:geology&amp;quot; while another sense is marked &amp;quot;engineering/electrical.&amp;quot; The semantic information is extracted from LDOCE dictionary entries using a large-scale parser that isolates the genus and differentia terms in each entry, expanding upon other similar work (e.g., Chodorow, Heidorn, and Byrd 1985; Alshawi, Boguraev, and Briscoe 1985; Boguraev and Briscoe 1987; Binot and Jensen 1987).</Paragraph>
      <Paragraph position="1"> The dictionary entries that are built for individual word senses are frame-based lexical semantic structures, intended for subsequent use in knowledge-based parsing. The process of building a frame for a word sense begins by first assigning the box and pragmatic code information from LDOCE for that word sense. The parser then analyses the definition of that word sense from LDOCE.</Paragraph>
      <Paragraph position="2"> The parser is a chart parser (taken from Slocum 1985) which is left-corner and bottom-up with top-down filtering and early constituent tests. Chart parsing was selected because of its utility as a grammar testing and development tool. The parser is driven by a context free grammar of over 100 rules and a lexicon composed of the 2,219 words from the LDOCE controlled vocabulary. It must be emphasised that this chart parser is not a parser for English -- it is a parser for just the language of LDOCE definitions. The grammar is still being tuned, but currently covers over 90% of the language used in LDOCE definitions of content words.</Paragraph>
      <Paragraph position="3"> The parser produces a phrase-structure tree of an LDOCE word sense definition which is passed to an interpreter for pattern matching and inferencing. The interpreter extracts the dominating phrase, reorganises the phrase into genus and differentia components, and attempts to infer and fill in case roles that subdivide the differentia information. The interpreter then accesses the pre-existing frame for the word sense, which already contains the relevant box and pragmatic code information for the word sense, and enriches the frame by adding the genus and differentia information extracted from its definition.</Paragraph>
      <Paragraph position="4"> Consider, for example, how the frame for 'ammeter' is built.</Paragraph>
      <Paragraph position="5"> From the box and pragmatic codes, the following hierarchical information is extracted and used to create an initial frame for 'ammeter': from the box code, that an ammeter is of type &amp;quot;solid,&amp;quot; and from the pragmatic code, that an ammeter is classified under the subject &amp;quot;engineering/electrical.&amp;quot; Next, the chart parser is used to analyse the LDOCE definition of an 'ammeter', which is that it &amp;quot;is an instrument for measuring ... electric current.&amp;quot; The definition is parsed into a phrase-structure tree which is passed to the interpreter. The interpreter adds to the frame for 'ammeter' that 'instrument' is its genus and &amp;quot;for measuring electric current&amp;quot; is its differentia information. Furthermore, the interpreter notes the phrase &amp;quot;for measuring&amp;quot; and creates the case role slot PURPOSE, i.e., that the purpose of an ammeter is for measuring electric current.</Paragraph>
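The 'ammeter' example can be mirrored in a small data structure. The sketch below is illustrative only: the `Frame` class, its field names, and the `enrich` function are hypothetical, and the rule that a differentia beginning with "for" yields a PURPOSE slot is a simplifying assumption standing in for the interpreter's pattern matching on the parse tree.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """A hypothetical frame for one word sense."""
    word: str
    box_type: str   # taken from the LDOCE box code
    subject: str    # taken from the LDOCE pragmatic code
    genus: str = ""
    differentia: str = ""
    case_roles: dict = field(default_factory=dict)


def enrich(frame, genus, differentia):
    """Add genus and differentia, inferring PURPOSE from a leading 'for'."""
    frame.genus = genus
    frame.differentia = differentia
    if differentia.startswith("for "):
        frame.case_roles["PURPOSE"] = differentia[len("for "):]
    return frame


# Step 1: the initial frame is created from the box and pragmatic codes.
ammeter = Frame("ammeter", box_type="solid", subject="engineering/electrical")
# Step 2: the frame is enriched with the result of analysing the definition.
enrich(ammeter, "instrument", "for measuring electric current")
print(ammeter.case_roles)
```

Keeping the code-derived fields separate from the definition-derived fields reflects the two stages described in the text: the frame exists before parsing and is only enriched afterwards.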
    </Section>
    <Section position="3" start_page="752" end_page="753" type="sub_section">
      <SectionTitle>
6.3 Approach III: Building a MTD from a Key Defining Vocabulary (Cheng-ming Guo)
</SectionTitle>
      <Paragraph position="0"> The third approach, unlike the first and the second, argues that a small amount of hand-coding of world knowledge is necessary before the bootstrapping process can begin. The amount of hand-coding required, though more than the other approaches, is still relatively small because over 95% of its MTD is built automatically. The prior world knowledge that requires hand-coding is a set of 1,200 words, called the Key Defining Vocabulary (KDV), which has been found to define the controlled vocabulary of LDOCE, and thence all 27,758 words defined in LDOCE. The senses of all the words in LDOCE can be defined by the KDV in a series of four &amp;quot;defining cycles.&amp;quot; When a candidate word enters a defining cycle, the stems of the words used in the definitions of the first three senses of that candidate word are examined. If all the word stems in those three sense definitions occur in the KDV, then the candidate word is put into a &amp;quot;success&amp;quot; file and added to the KDV at the end of the defining cycle; if not, the word is put into a &amp;quot;fail&amp;quot; file and addition of the word to the KDV is postponed until a later cycle. In this way, the size of the KDV expands with each cycle until, after three cycles, all the words from the LDOCE controlled vocabulary are accounted for. The remaining words in LDOCE are expected to be defined in the next defining cycle.</Paragraph>
      <Paragraph position="1"> The discovery of the KDV and the use of defining cycles are valuable for a number of reasons. First, in building a MTD, a KDV reduces the initial number of knowledge structures for dictionary entries that have to be hand-coded before such structures can be constructed automatically by some bootstrapping process. The knowledge structures used in this particular study are called &amp;quot;integrated semantic units&amp;quot; or ISUs. Though the preliminary study reported here uses a KDV of around 1,200, the number can probably be reduced to about 1,000.</Paragraph>
      <Paragraph position="2"> Second, the use of defining cycles helps to identify vacuous circular definitions. Circular definitions that use circles of just two words pose special problems for building a MTD from a MRD. For example, in LDOCE a &amp;quot;trip&amp;quot; is defined as a &amp;quot;journey&amp;quot;, and a &amp;quot;journey&amp;quot; as a &amp;quot;trip&amp;quot;. A MTD built from a MRD should be free of such circular definitions. One way to overcome such circular definitions is to try to include just one of the words involved as a KDV word, but not the other. The word selected for the KDV will be the one whose first three senses fulfil the criteria of a defining cycle given earlier.</Paragraph>
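The defining-cycle procedure described above might be sketched as follows. This is a minimal illustration, not the original implementation: the function name, the data layout (each word mapped to a list of senses, each sense a list of already-stemmed words), and the tiny vocabulary are all assumptions made up for the example and are not LDOCE entries.

```python
def run_defining_cycles(kdv, definitions, max_cycles=4):
    """Expand a key defining vocabulary over repeated defining cycles.

    definitions maps each candidate word to a list of senses; each sense is a
    list of word stems used in that sense's definition (hypothetical layout).
    """
    known = set(kdv)
    pending = set(definitions) - known
    history = []  # one sorted "success" list per productive cycle
    for _ in range(max_cycles):
        success = set()
        for word in pending:
            # Only the first three senses of the candidate are examined.
            stems = {stem for sense in definitions[word][:3] for stem in sense}
            if stems.issubset(known):
                success.add(word)
        if not success:
            break  # nothing new could be defined in this cycle
        # Successful words join the vocabulary only at the end of the cycle.
        known |= success
        pending -= success
        history.append(sorted(success))
    return known, pending, history


# Made-up toy data for illustration only.
toy_kdv = {"thing", "move", "water"}
toy_definitions = {
    "boat": [["thing", "move", "water"]],
    "sailor": [["person", "boat"]],
    "person": [["thing", "move"]],
    "ghost": [["spirit"]],
}
known, pending, history = run_defining_cycles(toy_kdv, toy_definitions)
print(history)
print(sorted(pending))
```

Words left in the pending set at the end play the role of the &amp;quot;fail&amp;quot; file: their definitions use stems that never entered the vocabulary within the allowed number of cycles.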
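As a rough illustration of how two-word circles such as the one between 'trip' and 'journey' could be found, the following sketch scans a mapping from each word to the words used in its definition. The function name and the data layout are hypothetical, and the sketch only detects circles; it does not choose which word of a pair should enter the KDV.

```python
def two_word_circles(defining_words):
    """Return sorted pairs (a, b) where a is defined via b and b via a."""
    circles = set()
    for word, used in defining_words.items():
        for other in used:
            if other != word and word in defining_words.get(other, set()):
                # Sort the pair so each circle is recorded only once.
                circles.add(tuple(sorted((word, other))))
    return sorted(circles)


# 'trip' / 'journey' and 'ammeter' / 'instrument' follow the examples in the text.
toy = {
    "trip": {"journey"},
    "journey": {"trip"},
    "ammeter": {"instrument"},
}
pairs = two_word_circles(toy)
print(pairs)
```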
      <Paragraph position="3"> Third, when constructing a MTD, use of the defining cycles ensures that all definitions of words and their senses that are built contain only words that already have definitions. In the case of LDOCE, use of the defining cycles sorts out words in the LDOCE controlled vocabulary whose definitions include words outside of that vocabulary.</Paragraph>
      <Paragraph position="4"> This has proved to be not uncommon in LDOCE definitions.</Paragraph>
      <Paragraph position="5"> Fourth, in building a MTD, the main senses of these empirically found KDV words are taken as the &amp;quot;semantic primitives&amp;quot; of the MTD. The use of defining cycles ensures that a set of primitives that best suits a particular MRD can be found empirically.</Paragraph>
      <Paragraph position="6"> An estimated 3,600 ISUs for an average of three basic senses of the 1,200 KDV words are to be hand-coded to start the bootstrapping process (the bootstrapping process is shown schematically in Figure 2 of the Appendix, p.14). A language analyzer and learner (LAAL) carries out the bootstrapping process according to a bootstrapping schedule (as with approach II, any grammar rules or semantic patterns used by the LAAL will have to be hand-coded). The bootstrapping schedule is concerned with which word senses are to be processed first, and which later. The necessity for the bootstrapping schedule stems from the fact that the ISUs for the basic senses of the words in the definition of a word sense have to be in the ISU database before that definition can be analysed and its ISU produced. After the ISUs for the basic word senses of the words from the LDOCE controlled vocabulary are built into the database, the non-basic senses of those words will be processed. When all of the controlled vocabulary words are finished, words from outside the controlled vocabulary will be attended to. Following the bootstrapping schedule, the LAAL system processes word sense definitions to produce more and more ISUs until the entire LDOCE is turned into a full MTD of ISUs.</Paragraph>
      <Paragraph position="7"> Further details about the three approaches may be found in the Appendix (Wilks, Fass, Guo, McDonald, Plate, and Slator 1987). The MTDs produced by these approaches are fed into a number of consumers: a Lexicon-Consumer (Slator and Wilks 1987) and Collative Semantics.</Paragraph>
    </Section>
  </Section>
</Paper>