Classifier design for computer-aided diagnosis: Effects of finite sample size on the mean performance of classical and neural network classifiers

Citation
Hp. Chan et al., Classifier design for computer-aided diagnosis: Effects of finite sample size on the mean performance of classical and neural network classifiers, MED PHYS, 26(12), 1999, pp. 2654-2668
Number of citations
17
Subject categories
Radiology, Nuclear Medicine & Imaging; Medical Research Diagnosis & Treatment
Journal title
MEDICAL PHYSICS
ISSN journal
0094-2405
Volume
26
Issue
12
Year of publication
1999
Pages
2654 - 2668
Database
ISI
SICI code
0094-2405(199912)26:12<2654:CDFCDE>2.0.ZU;2-8
Abstract
Classifier design is one of the key steps in the development of computer-aided diagnosis (CAD) algorithms. A classifier is designed with case samples drawn from the patient population. Generally, the sample size available for classifier design is limited, which introduces variance and bias into the performance of the trained classifier, relative to that obtained with an infinite sample size. For CAD applications, a commonly used performance index for a classifier is the area, A_z, under the receiver operating characteristic (ROC) curve. We have conducted a computer simulation study to investigate the dependence of the mean performance, in terms of A_z, on design sample size for a linear discriminant and two nonlinear classifiers, the quadratic discriminant and the backpropagation neural network (ANN). The performances of the classifiers were compared for four types of class distributions that have specific properties: multivariate normal distributions with equal covariance matrices and unequal means, unequal covariance matrices and unequal means, and unequal covariance matrices and equal means, and a feature space where the two classes were uniformly distributed in disjoint checkerboard regions. We evaluated the performances of the classifiers in feature spaces of dimensionality ranging from 3 to 15, and design sample sizes from 20 to 800 per class. The dependence of the resubstitution and hold-out performance on design (training) sample size (N_t) was investigated. For multivariate normal class distributions with equal covariance matrices, the linear discriminant is the optimal classifier. It was found that its A_z-versus-1/N_t curves can be closely approximated by linear dependences over the range of sample sizes studied. In the feature spaces with unequal covariance matrices where the quadratic discriminant is optimal, the linear discriminant is inferior to the quadratic discriminant or the ANN when the design sample size is large. However, when the design sample is small, a relatively simple classifier, such as the linear discriminant or an ANN with very few hidden nodes, may be preferred because performance bias increases with the complexity of the classifier. In the regime where the classifier performance is dominated by the 1/N_t term, the performance in the limit of infinite sample size can be estimated as the intercept (1/N_t = 0) of a linear regression of A_z versus 1/N_t. The understanding of the performance of the classifiers under the constraint of a finite design sample size is expected to facilitate the selection of a proper classifier for a given classification task and the design of an efficient resampling scheme. © 1999 American Association of Physicists in Medicine. [S0094-2405(99)00212-6].
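The extrapolation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' simulation code. It assumes two equal-covariance multivariate normal classes, a Fisher linear discriminant built from the pooled sample covariance, and the area under the ROC curve computed as a Mann-Whitney statistic. The function names and the parameters `dim`, `delta`, `n_test`, `n_trials`, and `seed` are hypothetical choices made for illustration only.

```python
import numpy as np


def auc_mann_whitney(scores_neg, scores_pos):
    """Area under the ROC curve as the Mann-Whitney statistic (ties count 1/2)."""
    neg = np.asarray(scores_neg, dtype=float)[:, None]
    pos = np.asarray(scores_pos, dtype=float)[None, :]
    return float(np.mean((pos > neg) + 0.5 * (pos == neg)))


def fit_lda(x0, x1):
    """Fisher linear discriminant weights from the pooled sample covariance."""
    pooled = 0.5 * (np.cov(x0, rowvar=False) + np.cov(x1, rowvar=False))
    return np.linalg.solve(pooled, x1.mean(axis=0) - x0.mean(axis=0))


def mean_auc(n_t, dim=3, delta=1.5, n_test=500, n_trials=20, seed=0):
    """Mean resubstitution and hold-out AUC for n_t design samples per class.

    Assumed setup (not taken from the paper): identity covariance for both
    classes, class means separated by `delta` along the first feature.
    """
    rng = np.random.default_rng(seed)
    shift = np.zeros(dim)
    shift[0] = delta
    resub, hold = [], []
    for _ in range(n_trials):
        # Design (training) samples for the two classes.
        x0 = rng.standard_normal((n_t, dim))
        x1 = rng.standard_normal((n_t, dim)) + shift
        w = fit_lda(x0, x1)
        # Resubstitution: score the same samples used for design.
        resub.append(auc_mann_whitney(x0 @ w, x1 @ w))
        # Hold-out: score independent samples from the same distributions.
        t0 = rng.standard_normal((n_test, dim))
        t1 = rng.standard_normal((n_test, dim)) + shift
        hold.append(auc_mann_whitney(t0 @ w, t1 @ w))
    return float(np.mean(resub)), float(np.mean(hold))


def extrapolate_auc(n_t_values, auc_values):
    """Fit AUC = intercept + slope * (1/N_t); the intercept estimates N_t -> infinity."""
    inv_n = 1.0 / np.asarray(n_t_values, dtype=float)
    slope, intercept = np.polyfit(inv_n, np.asarray(auc_values, dtype=float), 1)
    return float(intercept), float(slope)


if __name__ == "__main__":
    sizes = [20, 40, 80, 160, 400]
    results = [mean_auc(n) for n in sizes]
    for name, column in (("resubstitution", 0), ("hold-out", 1)):
        intercept, slope = extrapolate_auc(sizes, [r[column] for r in results])
        print(f"{name}: intercept={intercept:.3f}, slope={slope:.3f}")
```

Under these assumptions, the two intercepts give rough estimates of the infinite-sample performance approached from the resubstitution and the hold-out side. The straight-line fit is only meaningful in the regime the abstract describes, where the 1/N_t term dominates the sample-size dependence.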