<?xml version="1.0" standalone="yes"?>
<Paper uid="C92-1056">
  <Title>Lexical Disambiguation using Simulated Annealing</Title>
  <Section position="3" start_page="79968" end_page="79968" type="intro">
    <SectionTitle>
2. Simulated Annealing
</SectionTitle>
    <Paragraph position="0"> The method of simulated annealing [Metropolis et al., 1953; Kirkpatrick et al., 1983] is a technique for solving large-scale problems of combinatorial minimization. It has been successfully applied to the famous traveling salesman problem of finding the shortest route for a salesman who must visit a number of cities in turn, and is now a standard method for optimizing the placement of circuit elements on large-scale integrated circuits. Simulated annealing was applied to parsing by Sampson [1986], but since the method has not yet been widely applied to Computational Linguistics or Natural Language Processing, we describe it briefly.</Paragraph>
    <Paragraph position="1"> The name of the algorithm is an analogy to the process by which metals cool and anneal. A feature of this phenomenon is that slow cooling usually allows the metal to reach a uniform composition and a minimum energy state, while fast cooling leads to an amorphous state with higher energy. In simulated annealing, a parameter T which corresponds to temperature is decreased slowly enough to allow the system to find its minimum.</Paragraph>
    <Paragraph position="2"> The process requires a function E of configurations of the system which corresponds to the energy. It is E that we seek to minimize. From a starting point, a new configuration is randomly chosen, and a new value of E is computed. If the new E is less than the old one, the new configuration replaces the old one. An essential feature of simulated annealing is that even if the new E is larger than the old (indicating that this configuration is farther away from the desired minimum than the last choice), the new configuration may be chosen. The decision of whether or not to replace the old configuration with the new inferior one is made probabilistically. This feature of allowing the algorithm to "go uphill" helps it to avoid settling on a local minimum which is not the actual minimum. In succeeding trials, it becomes more difficult for configurations which increase E to be chosen, and finally, when the method has retained the same configuration for long enough, that configuration is chosen as the solution. In the traveling salesman example, the configurations are the different paths through the cities, and E is the total length of his trip. The final configuration is an approximation to the shortest path through the cities. The next section describes how the algorithm may be applied to word-sense disambiguation.</Paragraph>
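    <Paragraph position="3"> The procedure described above can be illustrated with a minimal Python sketch for the traveling salesman example. It is not taken from the paper and rests on several assumptions: uphill moves are accepted with probability exp(-delta / T) (the usual Metropolis criterion, which the text does not spell out), the temperature is lowered geometrically, new configurations are proposed by reversing a random segment of the tour, and the run stops at a temperature floor rather than after the configuration has stayed unchanged for long enough. The names tour_length and anneal and all parameter values are hypothetical.

```python
import math
import random


def tour_length(tour, cities):
    """Energy E: total length of the closed tour through the cities."""
    total = 0.0
    for i in range(len(tour)):
        x1, y1 = cities[tour[i]]
        x2, y2 = cities[tour[(i + 1) % len(tour)]]
        total += math.hypot(x2 - x1, y2 - y1)
    return total


def anneal(cities, t_start=10.0, t_end=1e-3, cooling=0.995, seed=0):
    """Simulated annealing sketch; schedule and defaults are assumptions."""
    rng = random.Random(seed)
    tour = list(range(len(cities)))
    rng.shuffle(tour)  # random starting configuration
    energy = tour_length(tour, cities)
    t = t_start
    while t > t_end:
        # Propose a new configuration by reversing a random segment.
        i, j = sorted(rng.sample(range(len(tour)), 2))
        candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
        candidate_energy = tour_length(candidate, cities)
        delta = candidate_energy - energy
        # Moves that do not increase E are always accepted; uphill moves
        # are accepted with probability exp(-delta / t), which shrinks
        # as the temperature t falls (assumed Metropolis criterion).
        if 0 >= delta or math.exp(-delta / t) > rng.random():
            tour, energy = candidate, candidate_energy
        t *= cooling  # lower the temperature slowly
    return tour, energy
```

    Calling anneal with a list of (x, y) coordinate pairs returns the final tour as a list of city indices together with its length. Because uphill moves become very unlikely at low temperature, the final tour is typically a good approximation to the shortest one, though the sketch offers no guarantee of optimality.</Paragraph>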
  </Section>
</Paper>