B. LeBaron and A. S. Weigend, A BOOTSTRAP EVALUATION OF THE EFFECT OF DATA SPLITTING ON FINANCIAL TIME-SERIES, IEEE Transactions on Neural Networks, 9(1), 1998, pp. 213-220
This letter exposes problems with the commonly used technique of splitting the available data into training, validation, and test sets that are held fixed, warns against drawing overly strong conclusions from such static splits, and shows potential pitfalls of ignoring variability across splits. Using a bootstrap or resampling method, we compare the uncertainty in the solution stemming from the data splitting with neural-network-specific uncertainties (parameter initialization, choice of number of hidden units, etc.). We present two results on data from the New York Stock Exchange. First, the variation due to different resamplings is significantly larger than the variation due to different network conditions. This result implies that it is important not to over-interpret a model (or an ensemble of models) estimated on one specific split of the data. Second, on each split, the neural-network solution with early stopping is very close to a linear model; no significant nonlinearities are extracted.
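
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure or data. The following are all assumptions made for illustration: a synthetic AR(1) series stands in for the New York Stock Exchange data, random permutation splits stand in for the resampling scheme, and the network is a small tanh model written in NumPy. The helper names (make_series, lagged, train_mlp, predict, split_vs_init) are hypothetical. The sketch trains several randomly initialized networks with early stopping on each of several splits, then compares the spread of test error across splits with the spread across initializations, and also records the test error of a linear least-squares baseline on each split.

```python
import numpy as np

def make_series(n, rng):
    # Synthetic weakly autocorrelated series; an assumed stand-in, not NYSE data.
    x = np.zeros(n)
    eps = rng.standard_normal(n)
    for t in range(1, n):
        x[t] = 0.1 * x[t - 1] + eps[t]
    return x

def lagged(x, n_lags):
    # Row j holds x[j], ..., x[j + n_lags - 1]; the target is x[j + n_lags].
    X = np.column_stack([x[i:len(x) - n_lags + i] for i in range(n_lags)])
    return X, x[n_lags:]

def predict(params, X):
    W1, b1, w2, b2 = params
    return np.tanh(X @ W1 + b1) @ w2 + b2

def train_mlp(Xtr, ytr, Xva, yva, n_hidden, rng, epochs=200, lr=0.05):
    # One-hidden-layer tanh network trained by full-batch gradient descent.
    # Early stopping: keep the weights with the lowest validation MSE.
    d = Xtr.shape[1]
    W1 = 0.1 * rng.standard_normal((d, n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = 0.1 * rng.standard_normal(n_hidden)
    b2 = 0.0
    best_va, best_params = np.inf, (W1.copy(), b1.copy(), w2.copy(), b2)
    for _ in range(epochs):
        h = np.tanh(Xtr @ W1 + b1)
        g_out = 2.0 * (h @ w2 + b2 - ytr) / len(ytr)  # d(MSE)/d(prediction)
        g_h = np.outer(g_out, w2) * (1.0 - h ** 2)     # backprop through tanh
        g_w2 = h.T @ g_out
        g_W1 = Xtr.T @ g_h
        W1 -= lr * g_W1
        b1 -= lr * g_h.sum(axis=0)
        w2 -= lr * g_w2
        b2 -= lr * g_out.sum()
        va = np.mean((predict((W1, b1, w2, b2), Xva) - yva) ** 2)
        if va < best_va:
            best_va, best_params = va, (W1.copy(), b1.copy(), w2.copy(), b2)
    return best_params

def split_vs_init(n_splits=10, n_inits=5, seed=0):
    rng = np.random.default_rng(seed)
    X, y = lagged(make_series(600, rng), n_lags=3)
    n = len(y)
    mse = np.empty((n_splits, n_inits))   # network test MSE per (split, init)
    lin_mse = np.empty(n_splits)          # linear baseline test MSE per split
    for s in range(n_splits):
        perm = rng.permutation(n)         # a fresh random split of the data
        tr, va, te = perm[:n // 2], perm[n // 2:3 * n // 4], perm[3 * n // 4:]
        for k in range(n_inits):
            params = train_mlp(X[tr], y[tr], X[va], y[va], 4, rng)
            mse[s, k] = np.mean((predict(params, X[te]) - y[te]) ** 2)
        A_tr = np.column_stack([X[tr], np.ones(len(tr))])
        A_te = np.column_stack([X[te], np.ones(len(te))])
        coef = np.linalg.lstsq(A_tr, y[tr], rcond=None)[0]
        lin_mse[s] = np.mean((A_te @ coef - y[te]) ** 2)
    across_splits = mse.mean(axis=1).std(ddof=1)   # spread between splits
    across_inits = mse.std(axis=1, ddof=1).mean()  # mean spread within a split
    return mse, lin_mse, across_splits, across_inits

if __name__ == "__main__":
    mse, lin_mse, across_splits, across_inits = split_vs_init()
    print("std across splits:", across_splits)
    print("mean std across initializations:", across_inits)
    print("mean |network - linear| test MSE:", np.abs(mse.mean(axis=1) - lin_mse).mean())
```

The printed numbers describe only this synthetic setup. They say nothing about the magnitudes reported in the letter, and a faithful replication would need the original data and the authors' resampling scheme.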