N.K. Treadgold and T.D. Gedeon, "Simulated Annealing and Weight Decay in Adaptive Learning: The SARPROP Algorithm," IEEE Transactions on Neural Networks, 9(4), 1998, pp. 662-668
A problem with gradient descent algorithms is that they can converge to poorly performing local minima. Global optimization algorithms address this problem, but at the cost of greatly increased training times. This work examines combining gradient descent with the global optimization technique of simulated annealing (SA). Simulated annealing, in the form of noise and weight decay, is added to resilient backpropagation (RPROP), a powerful gradient descent algorithm for training feedforward neural networks. The resulting algorithm, SARPROP, is shown through various simulations not only to escape local minima but also to maintain, and often improve on, the training times of the RPROP algorithm. In addition, SARPROP may be used with a restart training phase, which allows a more thorough search of the error surface and provides an automatic annealing schedule.
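
To make the combination concrete, the sketch below gives one hedged reading of the idea described in the abstract: an RPROP-style per-weight step-size update to which a temperature-scaled weight-decay term and annealed noise are added. It is an illustrative approximation only; the constants, the temperature schedule, and the helper name rprop_sa_step are assumptions for exposition, not the authors' exact SARPROP update rules.

    # Minimal sketch (assumed details, not the published SARPROP rules):
    # an RPROP-style update with per-weight adaptive step sizes, plus
    # (a) weight decay folded into the gradient and (b) annealed noise
    # injected where a gradient sign change suggests a local minimum.
    import numpy as np

    def rprop_sa_step(w, grad, state, epoch,
                      eta_plus=1.2, eta_minus=0.5,
                      step_min=1e-6, step_max=50.0,
                      decay=1e-4, noise_scale=0.01,
                      rng=None):
        """One update of the weight vector w given its gradient grad.

        state carries the previous gradient and per-weight step sizes.
        The annealing 'temperature' shrinks with the epoch, so the noise
        and weight decay fade out as training proceeds (assumed schedule).
        """
        rng = rng or np.random.default_rng()
        temperature = 2.0 ** (-0.01 * epoch)  # assumed annealing schedule

        # Weight decay added to the gradient, scaled by the temperature.
        grad = grad + decay * temperature * w

        prev_grad = state.setdefault("prev_grad", np.zeros_like(w))
        step = state.setdefault("step", np.full_like(w, 0.1))

        sign_change = prev_grad * grad
        # Plain RPROP rule: same sign grows the step, opposite sign shrinks it.
        step = np.where(sign_change > 0,
                        np.minimum(step * eta_plus, step_max), step)
        step = np.where(sign_change < 0,
                        np.maximum(step * eta_minus, step_min), step)

        w_new = w - np.sign(grad) * step

        # Simulated-annealing noise: perturb weights whose gradient sign
        # flipped, with magnitude tied to the (decreasing) temperature.
        flipped = sign_change < 0
        w_new = w_new + flipped * noise_scale * temperature * rng.standard_normal(w.shape)

        state["prev_grad"] = grad
        state["step"] = step
        return w_new

In this reading, the restart training phase mentioned in the abstract would correspond to reinitializing the weights and the step-size state while the temperature continues to fall, giving the automatic annealing schedule across restarts.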