Three validation procedures (single evaluation set, cross-validation,
and repeated evaluation set) were tested on near-infrared spectrometric
data to estimate the predictive residual standard deviation and the
complexity of partial least-squares (PLS) regression models. Thirty-six
combinations of response and predictor variables (originating from
three response variables and spectra recorded on the same 60 samples in
four laboratories with different instruments) were tested. Each
validation method was used with several different percentages of
objects in the evaluation sets, from very low percentages
(leave-one-out) to 33%. The results show that the frequently used
single-evaluation-set technique gives poor estimates of both the
residual standard deviation and the complexity of the PLS model.
Cross-validation gives acceptable estimates when at least ten
cancellation groups are used. The validation technique based on the
repeated evaluation set, with a large number of repetitions of
prediction, gives excellent estimates of the residual standard
deviation and of model complexity, but it requires a very long
computing time.
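
The three procedures differ only in how the objects are divided into
calibration and evaluation sets, which the following minimal sketch
tries to make concrete. It is an illustration under stated assumptions,
not the procedure used in the study: the data are synthetic, a
straight-line least-squares fit stands in for PLS regression, the
selection of model complexity is not covered, and every function name
is hypothetical.

```python
import math
import random


def single_evaluation_set(n, fraction, rng):
    """One random split: `fraction` of the n objects is held out once."""
    idx = list(range(n))
    rng.shuffle(idx)
    k = max(1, round(n * fraction))
    return [(idx[k:], idx[:k])]  # (calibration, evaluation)


def cross_validation(n, n_groups, rng):
    """Every object is held out exactly once, in one of n_groups groups.

    n_groups == n gives leave-one-out.
    """
    idx = list(range(n))
    rng.shuffle(idx)
    splits = []
    for g in range(n_groups):
        test = idx[g::n_groups]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        splits.append((train, test))
    return splits


def repeated_evaluation_set(n, fraction, n_repeats, rng):
    """Many independent random splits; objects may be held out repeatedly."""
    splits = []
    for _ in range(n_repeats):
        splits.extend(single_evaluation_set(n, fraction, rng))
    return splits


def fit_line(xs, ys):
    """Stand-in model (assumption): simple least-squares line, not PLS."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept


def predictive_rsd(splits, x, y, fit=fit_line, predict=predict_line):
    """Root mean squared prediction error pooled over all held-out objects."""
    sq_err, count = 0.0, 0
    for train, test in splits:
        model = fit([x[i] for i in train], [y[i] for i in train])
        for i in test:
            sq_err += (y[i] - predict(model, x[i])) ** 2
            count += 1
    return math.sqrt(sq_err / count)


# Synthetic data (assumption): 60 objects, true noise standard deviation 0.5.
rng = random.Random(0)
x = [rng.uniform(0.0, 10.0) for _ in range(60)]
y = [2.0 * xi + 1.0 + rng.gauss(0.0, 0.5) for xi in x]

estimates = {
    "single evaluation set (33%)": predictive_rsd(
        single_evaluation_set(60, 1 / 3, rng), x, y),
    "cross-validation (10 groups)": predictive_rsd(
        cross_validation(60, 10, rng), x, y),
    "repeated evaluation set (200 x 33%)": predictive_rsd(
        repeated_evaluation_set(60, 1 / 3, 200, rng), x, y),
}
for name, value in estimates.items():
    print(f"{name}: {value:.3f}")
```

Changing the seed gives a rough feel for how much each estimate moves
from one random partition to the next. The repeated evaluation set
refits the model once per repetition, so its cost grows in proportion
to the number of repetitions.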