M. Forina et al., Confidence intervals of the prediction ability and performance scores of classifications methods, CHEM INTELL, 57(2), 2001, pp. 121-132
Chemometricians widely use multivariate classification and class-modeling
techniques in identity problems, in multivariate quality control, and to
obtain homogeneous sets for multivariate calibration. The performance of
such techniques is measured by parameters related to the prediction rate,
that is, the percentage of correct classifications of the objects in the
evaluation set (the objects not used to develop the classification model).
Sometimes the prediction rate is discussed with reference to the no-model
rate. In this paper, we suggest computing a confidence interval for the
prediction rate and the derived quantities, as is usually done for measured
chemical quantities. The statistical bases of this confidence interval are
presented, together with an estimate that is an alternative to those
previously known. Tables with the limits of the confidence intervals for
some usual cases (two or three categories) are reported.
To take into account the effect of the number of samples used to evaluate
the prediction ability, a classification performance score is introduced,
whose value is zero when the classification rate is compatible with the
hypothesis of random assignment. (C) 2001 Elsevier Science B.V. All rights
reserved.
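
The abstract does not define the performance score, so the sketch below
does not attempt to reproduce it. It only illustrates, under a simple
binomial assumption, what it can mean for a classification rate to be
compatible with random assignment: the probability of obtaining at least
the observed number of correct classifications by chance. The function name
and the chance_rate parameter (for example 1/2 for two equally likely
categories) are hypothetical.

```python
from math import comb


def chance_tail_probability(n_correct, n_total, chance_rate):
    """P(X >= n_correct) for X ~ Binomial(n_total, chance_rate).

    Illustrative only: a plain one-sided binomial tail, not the
    performance score defined in the paper.
    """
    if n_total <= 0 or not 0 <= n_correct <= n_total:
        raise ValueError("need 0 <= n_correct <= n_total and n_total > 0")
    if not 0 <= chance_rate <= 1:
        raise ValueError("chance_rate must lie in [0, 1]")

    # Sum the binomial probabilities of n_correct, ..., n_total successes.
    return sum(
        comb(n_total, k) * chance_rate**k * (1 - chance_rate) ** (n_total - k)
        for k in range(n_correct, n_total + 1)
    )


if __name__ == "__main__":
    # A 60% classification rate with two equally likely categories,
    # evaluated on 10 and on 100 objects.
    for n_correct, n_total in [(6, 10), (60, 100)]:
        p = chance_tail_probability(n_correct, n_total, 0.5)
        print(f"{n_correct}/{n_total}: tail probability {p:.4f}")
```

With the same 60% rate, the tail probability is much smaller for 100
objects than for 10, which shows why the number of evaluation samples has
to enter any judgment about whether a rate is distinguishable from random
assignment.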