This paper deals with the design and analysis of a modified version of the
Bush-Mosteller reinforcement scheme applied by partners in a zero-sum repeated
game with random pay-offs. The suggested study is based on the learning
automata paradigm, and a limiting average reward criterion is used to analyse
the arising Nash equilibrium. No information concerning the distribution of the
pay-offs is available a priori. The novelty of the suggested adaptive strategy
lies in the incorporation of a 'normalization procedure' into the standard
Bush-Mosteller scheme, which makes it possible to operate not only with binary
rewards but also with any bounded rewards of a stochastic nature. The
convergence (adaptation) and the convergence rate (rate of adaptation) are
analysed, and the optimal design parameters of this adaptive procedure are
derived. The obtained adaptation rate turns out to be of order o(n^{-1/3}).
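
As a rough illustration of the idea of normalizing bounded rewards before a Bush-Mosteller update, the following Python sketch may help. It is not the procedure derived in the paper: the update rule is one common textbook form of the Bush-Mosteller scheme, the min-max normalization with known bounds `low` and `high` is an assumption, and the names `normalize`, `bush_mosteller_step` and the step size `gamma` are hypothetical.

```python
import random


def normalize(reward, low, high):
    """Map a bounded reward to a loss in [0, 1], where 1 is the worst outcome.

    Assumption: simple min-max scaling with known bounds; the paper's own
    normalization procedure may differ.
    """
    if high <= low:
        return 0.5  # degenerate bounds: fall back to a neutral loss
    loss = (high - reward) / (high - low)
    return min(1.0, max(0.0, loss))  # clamp in case reward leaves [low, high]


def bush_mosteller_step(probs, action, loss, gamma):
    """One update of the action probabilities (assumed textbook form).

    p <- p + gamma * (e(action) - p + loss * (1 - N * e(action)) / (N - 1))
    For loss in [0, 1] and gamma in [0, 1] the result stays on the simplex.
    """
    n = len(probs)
    updated = []
    for i, p in enumerate(probs):
        e_i = 1.0 if i == action else 0.0
        drift = e_i - p + loss * (1.0 - n * e_i) / (n - 1)
        updated.append(p + gamma * drift)
    return updated


if __name__ == "__main__":
    rng = random.Random(0)
    probs = [0.5, 0.5]
    for t in range(1, 1001):
        action = rng.choices(range(2), weights=probs)[0]
        # Hypothetical environment: action 1 pays more on average.
        reward = rng.uniform(-1.0, 1.0) + (0.5 if action == 1 else 0.0)
        loss = normalize(reward, low=-1.0, high=1.5)
        # Hypothetical decreasing step size, not the optimal one from the paper.
        probs = bush_mosteller_step(probs, action, loss, gamma=1.0 / (t + 10))
    print(probs)
```

With a loss of 0 the chosen action's probability moves towards 1, and with a loss of 1 the probability mass moves towards the other actions, so intermediate (normalized) rewards interpolate between the two binary cases.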