<?xml version="1.0" standalone="yes"?> <Paper uid="C65-1006"> <Title>AUTOMATIC LINGUISTIC CLASSIFICATION</Title> <Section position="3" start_page="6" end_page="6" type="intro"> <SectionTitle> INTRODUCTION </SectionTitle> <Paragraph position="0"> Mechanical translation research over almost two decades has led to a broader discipline, computational linguistics, which already includes within its concern the automated processes that collect, store, retrieve or communicate information conveyed in or about language, as well as translate one language into another. With progress in automatic classification, another possibility is being explored, that of creating new information rather than merely gathering, maintaining or distributing the results of human intellectual activities.</Paragraph> <Paragraph position="1"> Many investigators have noted that sophisticated linguistic systems must be capable of learning. A new term may be defined within a text being translated mechanically. Or, more commonly, a new meaning may be given a term used in close communication among colleagues.</Paragraph> <Paragraph position="2"> Bar-Hillel, for one, has waxed and waned in enthusiasm for mechanized linguistic learning, finally relegating even its investigators to the &quot;lunatic fringe&quot; of computational linguistics. Some thoughtful work has been done by Lamb \[1\]. Certainly Solomonoff \[2\] should be mentioned, as should Knowlton \[3\] and Sparck-Jones \[4\], but each for a different reason. There is hardly a literature to cite, unless it be that unruly assemblage we have come to call &quot;artificial intelligence.&quot; The name &quot;self-organizing system&quot; has also come into use. We will adopt it, so that &quot;learning&quot; and &quot;adapting&quot; may distinguish different kinds of self-organization. We observe, furthermore, that to process information one must first process the language (or symbolic system) in which the information is given.
As a consequence, every information processing system has a component that processes linguistic information. And that component, we now know, may have a subcomponent which processes meta-linguistic information.</Paragraph> <Paragraph position="3"> Self-organizing linguistic systems properly fall within the scope of meta-linguistic processing.</Paragraph> <Paragraph position="4"> The information being processed is about some language, the &quot;object-language&quot; of the communication; hence the vehicle by which the information is conveyed is a &quot;meta-language.&quot; If the self-organizing system has changed the description of the conventional alternatives available within the object-language, then we will say that the system is &quot;learning.&quot; Whether or not the alternatives remain unchanged, if some alteration has been made in the conventions of their use, the system will be &quot;adapting.&quot; Thus, roughly speaking, learning will involve some change in linguistic rules that describe a set of well-defined alternatives in the object-language.</Paragraph> <Paragraph position="5"> Adaptation will involve some change in a set of probabilities that describe how those alternatives are being used.</Paragraph> <Paragraph position="6"> A self-organizing information system, in contrast to one learning or adapting by meta-linguistic processing, would employ linguistic processing to create new information about some subject-matter not necessarily linguistic. But since the information so processed might indeed be about language, we anticipate that linguistic self-organization may be based either meta-linguistically or linguistically.</Paragraph> <Paragraph position="7"> For the present, however, our system will be based on meta-linguistic processing. 
Work in meta-syntactics is progressing rapidly; researchers in computational linguistics now face the obligation of testing hypotheses more rigorously than with heuristic arguments or typical linguistic examples. More careful investigation is needed in meta-semantics, i.e. in the relations between meta-linguistic and linguistic information.</Paragraph> <Paragraph position="8"> In essence, then, we will try with automatic linguistic classification to bridge the gap between the design of language and the events of spoken or written discourse. What we have to report is only a small beginning toward that objective.</Paragraph> <Paragraph position="9"> We recognize that these are difficult problems requiring long-range commitments. They are nevertheless central to improving the language data used in automated analysis, synthesis and translation. Moreover, they lead to the concept of a dynamic language data base in linguistic processing.</Paragraph> <Paragraph position="10"> Already it is clear that the amount of information contained in a language description greatly influences efficiency in linguistic processing. Contrary to our former intuition, a simple description may merely be deficient in information so that the search in automated analysis will be extended unduly. There appears, furthermore, to be an optimal size in the syntactical descriptive unit. Thus, in making the transition from syntactical to semantical description (at least for the theories \[5\] we are studying), the basic question is analogous to that in the transition from lexical to syntactical description which gives rise to morphology: viz. what objects are to be classified? We are attacking these semological and morphological problems within the same theoretical structure that determines how the resulting objects are to be classified.
Indeed, the two questions appear inseparable.</Paragraph> <Paragraph position="11"> Pendergraft, Dale 2-1</Paragraph> </Section> </Paper>