B. Sahiner et al., DESIGN OF A HIGH-SENSITIVITY CLASSIFIER BASED ON A GENETIC ALGORITHM - APPLICATION TO COMPUTER-AIDED DIAGNOSIS, Physics in medicine and biology (Print), 43(10), 1998, pp. 2853-2871
A genetic algorithm (GA) based feature selection method was developed
for the design of high-sensitivity classifiers, which were tailored to
yield high sensitivity with high specificity. The fitness function of
the GA was based on the receiver operating characteristic (ROC)
partial area index, which is defined as the average specificity above
a given sensitivity threshold. The designed GA evolved towards the
selection of feature combinations which yielded high specificity in
the high-sensitivity region of the ROC curve, regardless of the
performance at low sensitivity. This is a desirable quality of a
classifier used for breast lesion characterization, since the focus in
breast lesion characterization is to diagnose correctly as many benign
lesions as possible
without missing malignancies. The high-sensitivity classifier,
formulated as the Fisher's linear discriminant using GA-selected
feature variables, was employed to classify 255 biopsy-proven
mammographic masses
as malignant or benign. The mammograms were digitized at a pixel size
of 0.1 mm x 0.1 mm, and regions of interest (ROIs) containing the
biopsied masses were extracted by an experienced radiologist. A
recently
developed image transformation technique, referred to as the
rubber-band straightening transform, was applied to the ROIs. Texture
features
extracted from the spatial grey-level dependence and run-length
statistics matrices of the transformed ROIs were used to distinguish
malignant and benign masses. The classification accuracy of the
high-sensitivity classifier was compared with that of linear
discriminant analysis
with stepwise feature selection (LDA(sfs)). With proper GA training, the
ROC partial area of the high-sensitivity classifier above a
true-positive fraction of 0.95 was significantly larger than that of
LDA(sfs), although the latter provided a higher total area (A(z))
under the ROC curve. By setting an appropriate decision threshold, the
high-sensitivity classifier and LDA(sfs) correctly identified 61% and
34% of the benign masses respectively without missing any malignant
masses. Our results show that the choice of the feature selection
technique is important in computer-aided diagnosis, and that the GA
may be a useful tool for designing classifiers for lesion
characterization.
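
The partial area index described in the abstract (average specificity over sensitivities at or above a threshold) can be illustrated with a minimal Python sketch. It rests on assumptions that are not from the paper: it uses an empirical step-function ROC rather than a fitted ROC curve, it approximates the average on a uniform grid of sensitivities, and the function names (`empirical_roc`, `partial_area_index`, `specificity_at_full_sensitivity`) and the toy scores are hypothetical.

```python
import numpy as np


def empirical_roc(scores, labels):
    """Empirical ROC points (FPF, TPF); a higher score means 'more malignant'."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    thresholds = np.unique(scores)[::-1]  # sweep from strictest to most lenient
    tpf = np.array([0.0] + [(pos >= t).mean() for t in thresholds])
    fpf = np.array([0.0] + [(neg >= t).mean() for t in thresholds])
    return fpf, tpf


def partial_area_index(scores, labels, tpf0=0.95, n_grid=1001):
    """Average specificity over sensitivities in [tpf0, 1] (grid approximation).

    Assumption: for each target sensitivity, take the operating point with the
    lowest FPF whose TPF reaches that target.
    """
    fpf, tpf = empirical_roc(scores, labels)
    targets = np.linspace(tpf0, 1.0, n_grid)
    idx = np.searchsorted(tpf, targets, side="left")  # first point with TPF >= target
    return float(np.mean(1.0 - fpf[idx]))


def specificity_at_full_sensitivity(scores, labels):
    """Fraction of benign cases scored below the lowest-scoring malignant case."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    return float((scores[~labels] < scores[labels].min()).mean())


# Hypothetical toy data: 3 malignant (label 1) and 4 benign (label 0) cases.
toy_scores = [0.9, 0.8, 0.3, 0.5, 0.4, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0, 0]
print(partial_area_index(toy_scores, toy_labels, tpf0=0.95))
print(specificity_at_full_sensitivity(toy_scores, toy_labels))
```

In a GA-based feature selection loop of the kind the abstract describes, a quantity like `partial_area_index` would serve as the fitness of a candidate feature subset, computed on the discriminant scores that subset produces. `specificity_at_full_sensitivity` corresponds to the reported fraction of benign masses correctly identified without missing any malignant mass.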