The model of a non-Bayesian agent who faces a repeated game with incomplete information against Nature is an appropriate tool for modeling general agent-environment interactions. In such a model the environment state (controlled by Nature) may change arbitrarily, and the feedback/reward function is initially unknown. The agent is non-Bayesian: he forms no prior probability, either over the state-selection strategy of Nature or over his reward function. A policy for the agent is a function that assigns an action to every history of observations and actions. Two basic feedback structures are considered. In one of them, the perfect monitoring case, the agent observes the previous environment state as part of his feedback; in the other, the imperfect monitoring case, only the reward obtained is available to the agent. Both settings describe partially observable processes, in which the current environment state is unknown. Our main result concerns the competitive ratio criterion in the perfect monitoring case. We prove the existence of an efficient stochastic policy that, with arbitrarily high probability, guarantees the competitive ratio at almost all stages, where efficiency is measured by the rate of convergence. We further show that no such optimal policy exists in the imperfect monitoring case. Moreover, we prove that in the perfect monitoring case no deterministic policy satisfies our long-run optimality criterion. In addition, we discuss the maxmin criterion and prove that under it a deterministic, efficient optimal strategy does exist in the imperfect monitoring case. Finally, we show that our approach to long-run optimality can be viewed as qualitative, which distinguishes it from previous work in this area.
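To make the objects in the abstract concrete, the following is a minimal toy sketch (not the paper's construction): a policy is a function from histories of (action, observed state, reward) triples to actions, and the competitive ratio compares the agent's accumulated reward to that of the best fixed action in hindsight. The state sequence, reward table, and the explore-then-exploit rule below are all illustrative assumptions.

```python
import random

random.seed(0)
ACTIONS = ["a", "b"]
STATES = ["s0", "s1"]
# Reward table r(action, state); unknown to the agent in advance.
REWARD = {("a", "s0"): 1.0, ("a", "s1"): 0.0,
          ("b", "s0"): 0.2, ("b", "s1"): 0.9}

def nature(t):
    # Nature may choose states arbitrarily; here a simple periodic sequence.
    return STATES[t % 2]

def policy(history):
    """A policy maps every history of observations and actions to an action.
    This one explores uniformly, then plays the empirically best action."""
    if len(history) < 10:
        return random.choice(ACTIONS)
    total = {x: 0.0 for x in ACTIONS}
    count = {x: 0 for x in ACTIONS}
    for a, s, r in history:
        total[a] += r
        count[a] += 1
    return max(ACTIONS, key=lambda x: total[x] / max(count[x], 1))

T = 1000
history, agent_total = [], 0.0
for t in range(T):
    s = nature(t)
    a = policy(history)
    r = REWARD[(a, s)]
    agent_total += r
    # Perfect monitoring: the realized state is part of the feedback.
    history.append((a, s, r))

# Reward of the best fixed action against the same state sequence.
best_fixed = max(sum(REWARD[(a, nature(t))] for t in range(T))
                 for a in ACTIONS)
print(round(agent_total / best_fixed, 3))
```

The printed quantity is a cumulative proxy for the competitive ratio; the paper's criterion is stronger, requiring the ratio to be attained at almost all stages with high probability, and its policies are stochastic in an essential way (the abstract notes no deterministic policy achieves this).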