File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/98/p98-1053_abstr.xml
Size: 3,476 bytes
Last Modified: 2025-10-06 13:49:16
<?xml version="1.0" standalone="yes"?> <Paper uid="P98-1053"> <Title>Accumulation of Lexical Sets: Acquisition of Dictionary Resources and Production of New Lexical Sets</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> This paper presents our work on the accumulation of lexical sets, which includes the acquisition of dictionary resources and the production of new lexical sets from them. The acquisition method, which uses a context-free syntax-directed translator and text modification techniques, proves easy to use, flexible, and efficient.</Paragraph> <Paragraph position="1"> Categories of production are analyzed, and basic operations are proposed which make up a formalism for specifying and carrying out production.</Paragraph> <Paragraph position="2"> About 1.7 million lexical units were acquired and produced from dictionaries of various types and complexities. The paper also proposes a combinatorial and dynamic organization for lexical systems, which is based on the notion of virtual accumulation and the abstraction levels of lexical sets.</Paragraph> <Paragraph position="3"> Keywords: dictionary resources, lexical acquisition, lexical production, lexical accumulation, computational lexicography.</Paragraph> <Paragraph position="4"> Introduction Acquisition and exploitation of dictionary resources (DRs) (machine-readable, on-line dictionaries, computational lexicons, etc.) have long been recognized as important and difficult problems. 
Although there has been a lot of work on DR acquisition, such as Byrd & al (1987), Neff & Boguraev (1989), Bläsi & Koch (1992), etc., it is still desirable to develop general, powerful, and easy-to-use methods and tools for this.</Paragraph> <Paragraph position="5"> Production of new dictionaries, even only crude drafts, from available ones has received much less attention, and it seems that no general computational framework has been proposed (see, e.g., Byrd & al (1987), Tanaka & Umemura (1994), Dorr & al (1995)).</Paragraph> <Paragraph position="6"> This paper deals with two problems: acquiring textual DRs by converting them into structured forms, and producing new lexical sets from those acquired. These two can be considered the two main activities of a more general notion: the accumulation of lexical sets. The term &quot;lexical set&quot; (LS) is used here as a generic term for more specific ones such as &quot;lexicon&quot;, &quot;dictionary&quot;, and &quot;lexical database&quot;. Accumulated lexical data will be represented as objects of the Common Lisp Object System (CLOS) (Steel 1990). This object-oriented high-level programming environment facilitates further manipulation of them, such as presentation (e.g., in formatted text), exchange (e.g., in SGML), database access, and production of new lexical structures; the CLOS object form is thus a convenient pivot form for storing lexical units. This environment also helps us develop our methods and tools easily and efficiently.</Paragraph> <Paragraph position="7"> In this paper, we will also discuss some other relevant issues: complexity measures for dictionaries, heuristic decisions in acquisition, the idea of virtual accumulation, abstraction levels of LSs, and a design for the organization and exploitation of large lexical systems based on the notions of accumulation.</Paragraph> </Section> </Paper>