NEURAL APPROXIMATIONS FOR INFINITE-HORIZON OPTIMAL-CONTROL OF NONLINEAR STOCHASTIC-SYSTEMS

Citation
T. Parisini and R. Zoppoli, NEURAL APPROXIMATIONS FOR INFINITE-HORIZON OPTIMAL-CONTROL OF NONLINEAR STOCHASTIC-SYSTEMS, IEEE Transactions on Neural Networks, 9(6), 1998, pp. 1388-1408
Number of citations
26
Subject categories
Computer Science Artificial Intelligence; Computer Science Hardware & Architecture; Computer Science Theory & Methods; Engineering, Electrical & Electronic
ISSN journal
10459227
Volume
9
Issue
6
Year of publication
1998
Pages
1388 - 1408
Database
ISI
SICI code
1045-9227(1998)9:6<1388:NAFIOO>2.0.ZU;2-7
Abstract
A feedback control law is proposed that drives the controlled vector upsilon(t) of a discrete-time dynamic system (in general, nonlinear) to track a reference upsilon(t) over an infinite time horizon, while minimizing a given cost function (in general, nonquadratic). The behavior of upsilon(t) over time is completely unpredictable. Random noises act on the dynamic system and the state observation channel, which may be nonlinear, too. The random noises and the initial state are, in general, non-Gaussian; it is assumed that all such random vectors are mutually independent, and that their probability density functions are known. As is well known, so general a non-LQG (linear quadratic Gaussian) optimal control problem is very difficult to solve. The proposed solution is based on three main approximating assumptions: 1) the optimal control problem is stated in a receding-horizon framework where upsilon(t) is assumed to remain constant within a shifting-time window; 2) the control law is assigned a given structure (the one of a multilayer feedforward neural network) in which a finite number of parameters have to be determined in order to minimize the cost function (this makes it possible to approximate the original functional optimization problem by a nonlinear programming one); and 3) the control law is given a "limited memory," which prevents the amount of data to be stored from increasing over time. The errors resulting from the second and third assumptions are discussed. Due to the very general assumptions under which the approximate optimal control law is derived, we are not able to report stability results. However, simulation results show that the proposed method may constitute an effective tool for solving, to a sufficient degree of accuracy, a wide class of control problems traditionally regarded as difficult ones (an example of freeway traffic optimal control is given that may be of practical importance).
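
Illustrative note: the second and third assumptions in the abstract (a control law with a fixed feedforward-network structure, fed by a bounded window of past observations) can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `make_network`, `control_law`, and `LimitedMemoryController`, the scalar observation, the single hidden layer, and the random, untrained weights. The step in which the parameters are chosen to minimize the cost (the nonlinear programming problem) is deliberately not shown.

```python
import math
import random
from collections import deque


def make_network(n_in, n_hidden, seed=0):
    """Return random weights for a one-hidden-layer feedforward net.

    Hypothetical helper: the weights are NOT optimized here, whereas the
    abstract describes choosing them to minimize a cost function.
    """
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    return w1, b1, w2, b2


def control_law(params, features):
    """Fixed-structure control law: tanh hidden layer, linear scalar output."""
    w1, b1, w2, b2 = params
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(w1, b1)
    ]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


class LimitedMemoryController:
    """Keeps only the last `memory` observations, so storage never grows."""

    def __init__(self, params, memory):
        self.params = params
        # deque with maxlen silently drops the oldest entry on append
        self.obs = deque([0.0] * memory, maxlen=memory)

    def step(self, observation, reference):
        self.obs.append(observation)
        # Network input: the bounded observation window plus the reference,
        # which is treated as constant over the current window.
        features = list(self.obs) + [reference]
        return control_law(self.params, features)
```

A controller built with `LimitedMemoryController(make_network(n_in=4, n_hidden=5), memory=3)` takes three past observations plus the current reference as input; calling `step` any number of times leaves the stored window at length three.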