The random-trial incremental (RTI) model of human associative learning proposes that learning from a trial on which an association is presented proceeds incrementally, but that, with a certain probability constant across trials, no learning occurs on a given trial. Based on RTI, identifying a policy for sequencing presentation trials of different associations so as to maximize overall learning can be formulated as a factored Markov decision process (MDP). For both finite and infinite horizons, and for a quite general structure of costs and rewards, a policy that on each trial presents an association yielding the maximum expected immediate net reward is optimal.
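The myopic policy can be sketched concretely. This is a minimal illustration, not the paper's implementation: it assumes a hypothetical parameterization in which an association's strength moves toward 1 by a fraction `alpha` on trials where learning occurs (which happens with probability `p_learn`), and each presentation incurs a fixed `cost`. All names and parameter values below are illustrative assumptions.

```python
import random

def rti_update(strength, alpha):
    """One incremental learning step: move strength toward 1 by fraction alpha."""
    return strength + alpha * (1.0 - strength)

def expected_immediate_net_reward(strength, p_learn, alpha, reward_per_gain, cost):
    """Expected net reward of presenting this association once: with
    probability p_learn the strength increases, otherwise nothing changes."""
    gain = rti_update(strength, alpha) - strength
    return p_learn * reward_per_gain * gain - cost

def greedy_trial(associations):
    """Present the association with the maximum expected immediate net reward,
    then simulate one RTI trial for it; returns the index chosen."""
    best = max(range(len(associations)),
               key=lambda i: expected_immediate_net_reward(**associations[i]))
    a = associations[best]
    if random.random() < a["p_learn"]:  # learning occurs on this trial
        a["strength"] = rti_update(a["strength"], a["alpha"])
    return best

# Two hypothetical associations: a weakly learned one and a nearly mastered one.
associations = [
    dict(strength=0.2, p_learn=0.8, alpha=0.3, reward_per_gain=1.0, cost=0.05),
    dict(strength=0.9, p_learn=0.8, alpha=0.3, reward_per_gain=1.0, cost=0.05),
]
chosen = greedy_trial(associations)  # the weaker association has the larger marginal gain
```

Because strength approaches 1 with diminishing increments, the weakly learned association offers the larger expected gain, so the greedy rule selects it; the abstract's result is that this one-step-lookahead choice is in fact optimal over the full horizon.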