Dj. Ma and Am. Makowski, ON THE CONVERGENCE AND ODE LIMIT OF A 2-DIMENSIONAL STOCHASTIC-APPROXIMATION, IEEE Transactions on Automatic Control, 39(7), 1994, pp. 1439-1442
Number of citations
10
Subject Categories
Control Theory & Cybernetics; Robotics & Automatic Control; Engineering, Electrical & Electronic
We consider a two-dimensional stochastic approximation scheme of the
Robbins-Monro type which naturally arises in the study of steering
policies for Markov decision processes [6], [7]. Making use of a
decoupling change of variables, we establish its almost sure
convergence by ad hoc arguments that combine standard results on
one-dimensional stochastic approximations with a version of the law of
large numbers for martingale differences. We use this direct analysis
to guide us in selecting the test function which appears in standard
convergence results for multidimensional schemes. Furthermore,
although a blind application of the ODE method is not possible here
due to a lack of regularity properties, the aforementioned change of
variables paves the way for an interpretation of the behavior of
solutions to the associated limiting ODE.