<?xml version="1.0" standalone="yes"?>
<Paper uid="W93-0103">
  <Title>Lexical Concept Acquisition From Collocation Map 1</Title>
  <Section position="2" start_page="0" end_page="22" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> The level of conceptual representation for words can be very complex in certain contexts, but in this paper we assume a rather simple structure in which a concept is a set of weighted associated words. We propose an automatic concept acquisition framework based on the conditional probabilities supplied by a network representation of lexical relations. The network is in the spirit of a Belief Net, but the probabilities are not necessarily Bayesian; this variation of the Bayesian Net was recently discussed by Neal (1992). We employ a Belief Net with non-Bayesian probabilities as a base for representing the statistical relations among concepts, and implement the details of the computation.</Paragraph>
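The "set of weighted associated words" assumed above can be made concrete with a minimal sketch. Everything here (the `concept` helper, the example words and weights) is illustrative and not from the paper:

```python
# Hypothetical sketch of the paper's simple concept structure: a concept is
# a mapping from associated words to association weights. The helper name
# and all example weights are invented for illustration.

def concept(pairs):
    """A concept as a mapping from associated words to weights."""
    return dict(pairs)

# A toy concept for "bank", built from (word, weight) pairs.
bank = concept([("money", 0.9), ("loan", 0.7), ("river", 0.2)])

# The most strongly associated words characterize the concept.
top = sorted(bank, key=bank.get, reverse=True)
print(top[:2])  # ['money', 'loan']
```

Acquisition, in this framing, amounts to estimating such weights from the conditional probabilities the network supplies.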
    <Paragraph position="1"> Belief or Bayesian Nets have been extensively studied in normative expert systems (Heckerman, 1991). Experts provide the network with Bayesian (subjective) probabilities based solely on their technical experience; thus the net is also known as a Belief Net, among a dozen other names that share all or some of the principles of the Bayesian net. The probabilistic model has also been used to integrate various sources of evidence within a sound framework (Cho, 1992). One of the powerful features of the Belief Net is that the conditional independences of the variables in the model are naturally captured, from which we can derive a form of probabilistic inference. If we regard the occurrence of a word as a model variable and assume the variables occur under some conditional influence of the variables (words) that previously took place, the Belief Net approach appears appropriate for computing some aspects of the lexical relations latent in the texts.1 The probabilities on dependent variables are computed from frequencies, so the probability is now objective rather than Bayesian.
1 This work was supported in part by a grant from the Korea National Science Foundation as a basic research project and by a grant from the Korea Ministry of Science and Technology in the project "an intelligent multimedia information system platform and image signal transmission in high speed network".</Paragraph>
    <Paragraph position="2"> The variation of Belief Net we use is identical to the sigmoid Belief Net by Neal (1992).</Paragraph>
    <Paragraph position="3"> In an ordinary Belief Net, 2^n probabilities must be specified for a variable with n parents. This is certainly a burden in our context, in which the net may contain hundreds of thousands of variables with heavy interconnections. A sigmoid interpretation of the connections, as in artificial neural networks, solves the problem without damaging the power of the network. Computing a joint probability is also exponential in an arbitrary Belief network, so Gibbs sampling, which originates from the Metropolis algorithm introduced in the 1950s, can be used to approximate the probabilities. To speed up the convergence of the sampling we combine it with a simulated annealing algorithm. Simulated annealing is also a descendant of the Metropolis algorithm, and has frequently been used to compute an optimal state vector of a system of variables.</Paragraph>
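The two ideas above, the sigmoid parameterization that replaces 2^n table entries with n weights, and Gibbs sampling with an annealing schedule, can be sketched as follows. This is a minimal illustration under assumed conventions (function names, the weight dictionary layout, and the cooling schedule are all invented), not the authors' implementation:

```python
import math
import random

# Sigmoid Belief Net sketch: P(x_i = 1 | parents) = sigmoid(bias_i + sum_j w_ij x_j),
# so a node with n parents needs only n weights rather than 2^n table entries.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def conditional(i, state, weights, bias):
    """P(x_i = 1) given the current values of node i's parents.

    `weights[i]` maps each parent index j to the weight w_ij (assumed layout).
    """
    z = bias[i] + sum(w * state[j] for j, w in weights.get(i, {}).items())
    return sigmoid(z)

def gibbs_anneal(state, weights, bias, sweeps=100, t0=2.0, t_min=0.1):
    """Gibbs sampling with a simulated-annealing temperature schedule.

    At temperature 1 this reduces to plain Gibbs sampling; higher
    temperatures flatten the conditionals, lower ones sharpen them.
    """
    temp = t0
    for _ in range(sweeps):
        for i in range(len(state)):
            p = conditional(i, state, weights, bias)
            # Temper the conditional before sampling the binary variable.
            a, b = p ** (1.0 / temp), (1.0 - p) ** (1.0 / temp)
            state[i] = 1 if random.random() < a / (a + b) else 0
        temp = max(t_min, temp * 0.95)  # geometric cooling (assumed schedule)
    return state
```

The cooling schedule and tempering rule here are one common choice; the paper does not specify these details.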
    <Paragraph position="4"> From the Collocation Map we can compute arbitrary conditional probabilities of the variables. This is a very powerful utility, applicable to every level of language processing.</Paragraph>
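A conditional query of this kind is typically answered by clamping the evidence variables and averaging samples of the target variable. The sketch below is illustrative only: `sample_step` stands in for one Gibbs sweep over the unclamped variables, and all names are assumptions rather than the paper's interface:

```python
# Illustrative sketch: estimate P(target = 1 | evidence) from Gibbs samples
# by clamping the evidence words and averaging the target's sampled values.
# `sample_step(state)` is a stand-in for one Gibbs sweep (hypothetical hook).

def estimate_conditional(target, evidence, n_vars, sample_step, n_samples=1000):
    state = [0] * n_vars
    for i, v in evidence.items():
        state[i] = v                    # clamp the observed words
    hits = 0
    for _ in range(n_samples):
        sample_step(state)              # resample the unclamped variables
        for i, v in evidence.items():
            state[i] = v                # keep evidence clamped after each sweep
        hits += state[target]
    return hits / n_samples
```

In practice the estimate's accuracy is governed by the number of samples, which is exactly the sampling-cost-versus-accuracy trade-off studied below.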
    <Paragraph position="5"> To name a few: automatic indexing, document classification, thesaurus construction, and ambiguity resolution are promising areas. One big problem with the model, however, is that it cannot be used in real-time applications, because Gibbs sampling still requires an ample amount of computation. Some applications, such as automatic indexing and lexical concept acquisition, are fortunately not real-time tasks. We are currently undertaking a large-scale test of the model involving one hundred thousand words, which includes a study of the cost of sampling versus the accuracy of the probabilities.</Paragraph>
    <Paragraph position="6"> To reduce the computational cost in time, the multiprocessor model that has been successfully implemented for the Hopfield Network (Yoon, 1992) can be considered in the context of sampling. Other options for making the sampling efficient should be actively pursued; their success is the key to applying the model to real-time problems.</Paragraph>
  </Section>
</Paper>