<?xml version="1.0" standalone="yes"?>
<Paper uid="C88-2134">
  <Title>OPTIMIZATION ALGORITHMS OF DECIPHERING AS THE ELEMENTS OF A LINGUISTIC THEORY</Title>
  <Section position="2" start_page="0" end_page="647" type="intro">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> This paper outlines a linguistic theory which may be identified with a partially ordered set of optimization algorithms of deciphering. An algorithm of deciphering is the operational definition of a given linguistic phenomenon and has the following three components: a set of admissible solutions, an objective function, and a procedure which finds the minimum or the maximum of the objective function.</Paragraph>
    <Paragraph position="1"> The paper describes four algorithms of the proposed type: 1. The algorithm which classifies the letters into vowels and consonants.</Paragraph>
    <Paragraph position="2"> 2. The algorithm which identifies the morphemes in a text without boundaries between words.</Paragraph>
    <Paragraph position="3"> 3. The algorithm which finds the dependency tree of a sentence.</Paragraph>
    <Paragraph position="4"> 4. The algorithm which finds the mapping of the letters of an unknown language into the letters of a known one.</Paragraph>
    <Paragraph position="5"> The forties and the first half of the fifties were marked by a pronounced interest among linguists in the so-called &quot;discovery procedures&quot;. These investigations were not very successful at that time. Chomskyan criticism also hindered progress in this direction.</Paragraph>
    <Paragraph position="6"> There is no reason to revive the old discussions. We will try to show below that the optimization algorithms we propose combine theoretical generality with practical usefulness. Moreover, it appears that the methods of generative grammar theory and those of the discovery procedures are not at all contradictory. For example, in a recent work of M. Remmel the set of admissible solutions is defined as a set of the generative grammars of N. Chomsky. In this paper we prefer to use the term &quot;deciphering procedures (algorithms)&quot; instead of &quot;discovery procedures&quot;, because the latter implies operations which are not necessarily formal.</Paragraph>
    <Paragraph position="7"> An algorithm of linguistic deciphering is a formal procedure aimed at the recognition of linguistic objects in a text whose language is not known to the investigator. Assuming that any deciphering procedure may serve as a definition of the respective linguistic object, we may view the set of such procedures as a certain linguistic theory which has the following properties: 1) A great degree of generalization, because its definitions should be valid for both known and unknown languages.</Paragraph>
    <Paragraph position="8"> 2) Formality, because, naturally enough, the deciphering procedures should be presented in the shape of algorithms.</Paragraph>
    <Paragraph position="9"> 3) Constructivity, i.e. the possibility of identifying a certain linguistic object with the help of a deciphering procedure within a reasonable time interval.</Paragraph>
    <Paragraph position="10">  To identify a linguistic object a deciphering algorithm makes use of a set of its features. It seems obvious that a linguistic object cannot be defined by means of binary features alone. The following scheme seems to be better founded: 1. Binary features are used to determine the general type of certain linguistic objects. The objects belonging to that type form the set of admissible solutions of a deciphering problem.</Paragraph>
    <Paragraph position="11">  2. An objective function which estimates the quality of each solution is introduced on the set of admissible solutions. The values of the objective function are calculated from the investigated text. They reflect the individuality of the given language. A maximum or a minimum of the objective function should correspond to the linguistic object which is to be defined. 3. It follows that a deciphering procedure should be an optimization algorithm which finds &quot;the best&quot; admissible solution from the point of view of the objective function.</Paragraph>
    <Paragraph position="12"> Thus, the set of admissible solutions, the objective function and the optimization algorithm constitute the definition of a linguistic object which may be used for the purposes of deciphering; a definition of this kind will be further referred to as a deciphering algorithm, or simply, an algorithm.</Paragraph>
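    <Paragraph> The three components named above can be pictured as a small data structure. The following is only a minimal sketch: the class name DecipheringAlgorithm, its field names, and the toy splitting problem are illustrative assumptions that do not come from the paper, and exhaustive search stands in for whatever optimization procedure a real algorithm would use.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Iterable, TypeVar

S = TypeVar("S")  # type of one admissible solution


@dataclass
class DecipheringAlgorithm(Generic[S]):
    """Hypothetical container for the three components named in the text."""
    admissible_solutions: Callable[[], Iterable[S]]  # enumerates the candidate set
    objective: Callable[[S], float]                  # quality computed from the text
    maximize: bool = True                            # whether the optimum is a maximum

    def solve(self) -> S:
        # Exhaustive search stands in for a real optimization procedure.
        pick = max if self.maximize else min
        return pick(self.admissible_solutions(), key=self.objective)


# Toy usage: choose the split point of "abcabc" whose two parts agree most, letter by letter.
text = "abcabc"
splitter = DecipheringAlgorithm(
    admissible_solutions=lambda: range(1, len(text)),
    objective=lambda k: sum(a == b for a, b in zip(text[:k], text[k:])),
)
best_split = splitter.solve()
```

    Keeping the candidate set and the objective as separate fields mirrors the scheme in the text: binary features delimit the admissible solutions, and the text-derived objective ranks them.</Paragraph>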
    <Paragraph position="13"> There is a natural hierarchy of deciphering algorithms. An algorithm B is senior to an algorithm A if the former makes use of the information provided by the latter. If A and B work alternately, each time improving the output, then the seniority is determined by the first iteration. Taking into account the fact that the set of essentially different algorithms should be finite, it appears that there must exist &quot;zero&quot; algorithms which use no information produced by any other deciphering algorithm.</Paragraph>
    <Paragraph position="14"> Zero algorithms should differ from one another because the physical substances of different languages may differ too. Thus the zero algorithm for the analysis of the written form of languages should be able to discriminate between a dark spot and a light one and to identify the place of each spot on the page; it should discover the set of alphabetic symbols of the language. A similar algorithm adjusted to the analysis of audible speech should produce the alphabet of phonemes, exploiting its capacity to discern certain minimal differences of sonation. The plurality of zero algorithms may be reduced by converting signals of different nature into a set of curves. As is well known, such algorithms are the goal of pattern recognition theory.</Paragraph>
    <Paragraph position="15"> Senior algorithms should be used for the analysis of grammar; the highest levels correspond to the problems of semantics and translation.</Paragraph>
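    <Paragraph> The seniority relation described above is a partial order, so it can be sketched as a dependency table that is sorted topologically. The algorithm names and the particular dependencies below are hypothetical and serve only to illustrate the idea; the paper does not give such a table.

```python
from graphlib import TopologicalSorter

# Hypothetical table: each algorithm maps to the algorithms whose output it uses.
uses = {
    "alphabet": set(),                      # a "zero" algorithm: uses nothing
    "vowels_consonants": {"alphabet"},
    "morphemes": {"alphabet"},
    "syntactic_classes": {"morphemes"},
    "dependency_tree": {"syntactic_classes"},
}

# Zero algorithms are exactly those with no dependencies.
zero_algorithms = sorted(name for name, deps in uses.items() if not deps)

# static_order() lists every algorithm after all the algorithms it depends on,
# so a senior algorithm never runs before its juniors.
run_order = list(TopologicalSorter(uses).static_order())
```

    Any order returned this way is one admissible linearization of the hierarchy; algorithms that do not depend on each other may appear in either order.</Paragraph>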
    <Paragraph position="16"> Many algorithms of different levels display great similarity and sometimes even identity, their only difference consisting in the linguistic material which serves as the input. The following types of algorithms may be pointed out: 1. Algorithms of classification, which divide the set of investigated objects into several subsets.</Paragraph>
    <Paragraph position="17"> 2. Algorithms of aggregation, which form larger units from smaller ones. 3. Algorithms of connection, which find some relation of partial ordering.</Paragraph>
    <Paragraph position="18"> 4. Algorithms of mapping the elements  of an unknown language into the elements of a known one.</Paragraph>
    <Paragraph position="19"> The simplest classification algorithm is that which classifies the set of letters A = {l_i} into vowels and consonants.</Paragraph>
    <Paragraph position="20"> In this case an admissible solution is a division D = {V, C}, V ∪ C = A. The objective function Q(D) reflects the fact that letters of the same class co-occur rather rarely, whereas letters of different classes co-occur relatively more often; it is formulated in terms of f(l_i, l_j), the frequency with which the letters l_i and l_j occur together. The maximum of Q(D) corresponds to the optimal classification. An appropriate optimization procedure reduces the number of divisions that must be evaluated to a reasonable one. This algorithm has been thoroughly tested in a number of computer experiments and in every case yielded almost entirely correct results. The most important algorithm of aggregation is the morpheme identification algorithm. Apart from identifying morphemes, this algorithm discovers an IC graph which shows the way in which morphemes are combined into words. An admissible solution in this case is a sequence of divisions D_1, ..., D_n of the text, each class of D_{i+1} being included in a certain class of D_i. A morpheme m is a string of letters at least one occurrence of which is an element of a certain class of some D_i.</Paragraph>
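    <Paragraph> The vowel and consonant classification described above can be sketched in a few lines. Since the exact formula for Q(D) is not reproduced here, the sketch assumes that Q(D) is the total frequency of adjacent letter pairs whose members fall into different classes, which agrees with the verbal description but is an assumption. It also evaluates every division exhaustively instead of using the reduced search mentioned in the text, and it does not decide which of the two classes is the vowels. All function names are illustrative.

```python
from collections import Counter
from itertools import combinations


def pair_frequencies(words):
    """Count unordered pairs of adjacent letters inside each word."""
    freq = Counter()
    for word in words:
        for left, right in zip(word, word[1:]):
            freq[frozenset((left, right))] += 1
    return freq


def quality(first_class, freq):
    """Assumed Q(D): total frequency of adjacent pairs split between the two classes."""
    return sum(n for pair, n in freq.items()
               if len(pair) == 2 and len(pair.intersection(first_class)) == 1)


def best_division(words):
    """Try every split of the alphabet into two non-empty classes (tiny alphabets only)."""
    alphabet = sorted(set("".join(words)))
    freq = pair_frequencies(words)
    anchor, rest = alphabet[0], alphabet[1:]  # fix one letter's class to skip mirror-image splits
    candidates = (frozenset((anchor,) + extra)
                  for size in range(len(rest))
                  for extra in combinations(rest, size))
    first = max(candidates, key=lambda c: quality(c, freq))
    return first, frozenset(alphabet) - first


division = best_division(["tomato", "potato"])
```

    On this toy input every adjacent pair joins a vowel with a consonant, so the best division separates the letters a and o from m, p and t. The number of divisions grows exponentially with the alphabet, which is why the text calls for an optimization procedure that evaluates far fewer of them.</Paragraph>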
    <Paragraph position="21"> The sequence D_1, ..., D_n determines the set of morphemes in a unique way. The objective function is set up by ascribing to each morpheme a certain number q(m) which is large when m consists of letters which predict each other more strongly than they predict the letters of the neighbouring morphemes. A number of experiments have been carried out; the best results have been obtained with the help of the following function:</Paragraph>
    <Paragraph position="23"> Here f denotes the frequency of a string, a is the initial, b is the final letter of m, y is a letter which precedes m, x is a letter which follows it, X is a string.</Paragraph>
    <Paragraph position="24"> The best solution should correspond to the maximum of Q(M) = Σ_i q(m_i), where M = {m_i}. A Russian text of 10000 letters was chosen for the experiments. Here is an extract of the analysed text:</Paragraph>
    <Paragraph position="26"> Representative of the algorithms of the third type is the algorithm which finds the dependency graph of a sentence. For this purpose the words of the language should be classified into syntactic classes, so that a word v is considered to be included in a class K_v. The conditional probability P(K_v/K_w) of the occurrence of K_v near K_w is calculated from the text.</Paragraph>
    <Paragraph position="27">  The set of admissible solutions is the set of all possible dependency trees which may be ascribed to a given sentence. The conditional probabilities provide the weights for the arcs of the tree. The quality of a tree is the sum (or the mean) of the weights of all arcs. The optimal tree presumably has the maximum quality. A great number of algorithms of this type have been tested in computer experiments; the best ones correctly identified more than 80% of connections. Here is a typical example taken from an experiment which was carried out for a Russian text of ~0000 words: Однажды (Adv.) играли (Verb) в (Prep.) карты (Acc. Sub.) у (Prep.) конногвардейца (Gen. Sub.) Нарумова (Gen. Sub.).</Paragraph>
    <Paragraph position="28"> Algorithms of this type may be used for the purposes of machine translation, in which case a greater amount of the input information is needed.</Paragraph>
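    <Paragraph> The search for the best dependency tree can be sketched as follows. The sketch scores every tree by the sum of its arc weights, as described above, and simply enumerates all trees, which is feasible only for very short sentences. The weight table is hypothetical and merely stands in for the conditional probabilities estimated from a text; keying it by (head class, dependent class) is likewise an assumption.

```python
from itertools import product


def is_tree(heads):
    """heads[i] is the index of word i's head, or None for the root.

    A valid tree has exactly one root and no cycles.
    """
    if sum(h is None for h in heads) != 1:
        return False
    for start in range(len(heads)):
        seen, node = set(), start
        while node is not None:
            if node in seen:
                return False
            seen.add(node)
            node = heads[node]
    return True


def best_tree(classes, weight):
    """Exhaustively score every dependency tree; quality is the sum of arc weights."""
    n = len(classes)
    options = [[None] + [h for h in range(n) if h != i] for i in range(n)]
    trees = (heads for heads in product(*options) if is_tree(heads))

    def quality(heads):
        return sum(weight.get((classes[h], classes[d]), 0.0)
                   for d, h in enumerate(heads) if h is not None)

    return max(trees, key=quality)


# Hypothetical weights keyed by (head class, dependent class); not taken from the paper.
weight = {("Verb", "Adv"): 0.6, ("Verb", "Prep"): 0.5, ("Prep", "Noun"): 0.8,
          ("Noun", "Prep"): 0.2, ("Verb", "Noun"): 0.3}
classes = ["Adv", "Verb", "Prep", "Noun"]
heads = best_tree(classes, weight)
```

    With these weights the verb becomes the root, the adverb and the preposition attach to the verb, and the noun attaches to the preposition. Longer sentences would need a maximum spanning tree method instead of enumeration.</Paragraph>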
    <Paragraph position="29"> A typical example of an algorithm which obtains the mapping M = {E_i → E'_i} (E_i being some elements of the unknown language, E'_i the respective elements of the known one) is furnished by the algorithm which discovers the pronunciation of letters.</Paragraph>
    <Paragraph position="30"> It is based on the hypothesis that letters of two different languages which have similar pronunciation possess similar combinatory power as well.</Paragraph>
    <Paragraph position="31"> The combinatory power of the letter l_i may be described by the vector of conditional probabilities C_i = P(l_i/l_x) which characterizes the occurrences of l_i in the neighbourhood of l_x. In the same way, the vector C'_i = P(l'_i/l'_x) characterizes the combinatory power of l'_i.</Paragraph>
    <Paragraph position="32"> The quality of a mapping may be estimated by the formula:</Paragraph>
    <Paragraph position="34"> Here d denotes the distance (e.g.</Paragraph>
    <Paragraph position="35"> Euclidean) between the vectors C_i and C'_i. All pairs l_i → l'_i, l_x → l'_x belong to the mapping M, so that d may be calculated by the formula:</Paragraph>
    <Paragraph position="37"> The minimum of Q(M) corresponds to the optimal mapping. Some algorithms of this type have been tested with interesting results. It is obvious that a similar</Paragraph>
    <Paragraph position="38"> algorithm will be able to compile a bilingual dictionary with the entries in the unknown language, although the latter problem is, naturally, far more difficult.</Paragraph>
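    <Paragraph> The letter mapping procedure can be illustrated on a three-letter toy alphabet. Because the formulas for Q(M) and d are not reproduced here, the sketch assumes that Q(M) is the sum, over all letters, of the Euclidean distance between C_i and the vector of the image letter, with the components of the latter reordered according to the same mapping. The probability tables are invented for illustration, and exhaustive search over permutations replaces any practical optimization procedure.

```python
from itertools import permutations
from math import dist


def mapping_quality(perm, p_unknown, p_known):
    """Assumed Q(M): sum over letters of the distance between C_i and the image's vector."""
    n = len(perm)
    total = 0.0
    for i in range(n):
        c_i = p_unknown[i]
        # Reorder the known-language vector so that component x refers to the image of l_x.
        c_image = [p_known[perm[i]][perm[x]] for x in range(n)]
        total += dist(c_i, c_image)
    return total


def best_mapping(p_unknown, p_known):
    """Exhaustive search over one-to-one mappings; practical only for very small alphabets."""
    n = len(p_unknown)
    return min(permutations(range(n)),
               key=lambda perm: mapping_quality(perm, p_unknown, p_known))


# Hypothetical conditional-probability tables: row i, column x holds P(l_i / l_x).
p_known = [[0.1, 0.7, 0.2],
           [0.6, 0.1, 0.3],
           [0.3, 0.2, 0.5]]
# The "unknown" table is the known one with its letters renamed 0->2, 1->0, 2->1.
rename = [2, 0, 1]
p_unknown = [[p_known[rename[i]][rename[x]] for x in range(3)] for i in range(3)]
mapping = best_mapping(p_unknown, p_known)
```

    Here the unknown language is an exact relabelling of the known one, so the search recovers the renaming with a quality of zero. Real languages would only match approximately, and the minimum of Q(M) would then be positive.</Paragraph>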
  </Section>
</Paper>