J. Schmidhuber et al., SHIFTING INDUCTIVE BIAS WITH SUCCESS-STORY ALGORITHM, ADAPTIVE LEVIN SEARCH, AND INCREMENTAL SELF-IMPROVEMENT, Machine learning, 28(1), 1997, pp. 105-130
We study task sequences that allow for speeding up the learner's average
reward intake through appropriate shifts of inductive bias (changes
of the learner's policy). To evaluate long-term effects of bias shifts
setting the stage for later bias shifts, we use the "success-story
algorithm" (SSA). SSA is occasionally called at times that may depend
on the policy itself. It uses backtracking to undo those bias shifts that
have not been empirically observed to trigger long-term reward
accelerations (measured up until the current SSA call). Bias shifts that
survive SSA represent a lifelong success history. Until the next SSA call,
they are considered useful and build the basis for additional bias
shifts. SSA allows for plugging in a wide variety of learning algorithms.
We plug in (1) a novel, adaptive extension of Levin search and (2) a method
for embedding the learner's policy modification strategy within the
policy itself (incremental self-improvement). Our inductive
transfer case studies involve complex, partially observable environments
where traditional reinforcement learning fails.
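
The backtracking step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact test, so everything here is an assumption made for illustration: the names `Checkpoint`, `SuccessStoryStack`, and `ssa_call` are hypothetical, and "reward acceleration" is read as "reward per unit time since a bias shift exceeds reward per unit time since the preceding surviving shift" (or since time 0 for the oldest one). Only the most recent surviving shift is compared against its predecessor, which is a simplification.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Checkpoint:
    """One bias shift (hypothetical record, not taken from the paper)."""
    time: float                # time at which the shift was made
    reward: float              # cumulative reward collected up to that time
    undo: Callable[[], None]   # restores the policy as it was before the shift


@dataclass
class SuccessStoryStack:
    entries: List[Checkpoint] = field(default_factory=list)

    def push(self, time: float, reward: float, undo: Callable[[], None]) -> None:
        """Record a bias shift so that a later SSA call can undo it."""
        self.entries.append(Checkpoint(time, reward, undo))

    def ssa_call(self, now: float, total_reward: float) -> int:
        """Undo recent shifts that were not followed by a higher reward rate.

        Assumes now > 0 and now > time of every recorded shift.
        Returns the number of shifts that survive the call.
        """
        def rate_since(t: float, r: float) -> float:
            return (total_reward - r) / (now - t)

        while self.entries:
            top = self.entries[-1]
            if len(self.entries) > 1:
                prev = self.entries[-2]
                baseline = rate_since(prev.time, prev.reward)
            else:
                baseline = total_reward / now  # reward rate since time 0
            if rate_since(top.time, top.reward) > baseline:
                break          # most recent surviving shift looks useful
            top.undo()         # backtrack: restore the earlier policy
            self.entries.pop()
        return len(self.entries)


if __name__ == "__main__":
    policy = {"bias": 0}
    stack = SuccessStoryStack()

    def shift(new_value: int, time: float, reward: float) -> None:
        old_value = policy["bias"]
        policy["bias"] = new_value
        stack.push(time, reward, lambda: policy.__setitem__("bias", old_value))

    shift(1, time=10.0, reward=1.0)
    shift(2, time=20.0, reward=5.0)
    print(stack.ssa_call(now=30.0, total_reward=6.0), policy)
```

In this toy run the second shift is undone because the reward rate after it (0.1 per step) is lower than the rate since the first shift (0.25 per step), while the first shift survives because 0.25 exceeds the lifetime average of 0.2. Storing an `undo` closure per shift is one simple design choice; any mechanism that restores the earlier policy would serve.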