Selection of an optimal neural network architecture for computer-aided detection of microcalcifications - Comparison of automated optimization techniques
M.N. Gurcan et al., Selection of an optimal neural network architecture for computer-aided detection of microcalcifications - Comparison of automated optimization techniques, MED PHYS, 28(9), 2001, pp. 1937-1948
Citations number
26
Categorie Soggetti
Radiology, Nuclear Medicine & Imaging; Medical Research Diagnosis & Treatment
Many computer-aided diagnosis (CAD) systems use neural networks (NNs) for either detection or classification of abnormalities. Currently, most NNs are "optimized" by manual search in a very limited parameter space. In this work, we evaluated the use of automated optimization methods for selecting an optimal convolution neural network (CNN) architecture. Three automated methods, the steepest descent (SD), the simulated annealing (SA), and the genetic algorithm (GA), were compared. We used as an example the CNN that classifies true and false microcalcifications detected on digitized mammograms by a prescreening algorithm. Four parameters of the CNN architecture were considered for optimization: the numbers of node groups and the filter kernel sizes in the first and second hidden layers, resulting in a search space of 432 possible architectures. The area Az under the receiver operating characteristic (ROC) curve was used to design a cost function. The SA experiments were conducted with four different annealing schedules. Three different parent selection methods were compared for the GA experiments. An available data set was split into two groups with approximately equal numbers of samples. By using the two groups alternately for training and testing, two different cost surfaces were evaluated. For the first cost surface, the SD method was trapped in a local minimum 91% (392/432) of the time. The SA using the Boltzmann schedule selected the best architecture after evaluating, on average, 167 architectures. The GA achieved its best performance with linearly scaled roulette-wheel parent selection; however, it evaluated 391 different architectures, on average, to find the best one. The second cost surface contained no local minimum. For this surface, a simple SD algorithm could quickly find the global minimum, but the SA with the very fast reannealing schedule was still the most efficient. The same SA scheme, however, was trapped in a local minimum on the first cost surface. Our CNN study demonstrated that, if optimization is to be performed on a cost surface whose characteristics are not known a priori, it is advisable that a moderately fast algorithm such as an SA using a Boltzmann cooling schedule be used to conduct an efficient and thorough search, which may offer a better chance of reaching the global minimum. (C) 2001 American Association of Physicists in Medicine.
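The kind of search the abstract describes can be sketched as simulated annealing with a Boltzmann cooling schedule (T_k = T0 / ln(1 + k)) over a discrete four-parameter grid. This is a minimal illustration, not the paper's implementation: the candidate grid values (chosen here only so that the space has 432 points), the toy cost function standing in for 1 - Az, and the schedule constants t0 and steps are all assumptions.

```python
import math
import random

random.seed(0)

# Hypothetical candidate values for the four architecture parameters;
# the abstract gives only the total of 432 combinations (4 * 6 * 3 * 6 = 432).
GRID = [
    [2, 3, 4, 5],            # node groups, first hidden layer
    [2, 3, 4, 5, 6, 7],      # node groups, second hidden layer
    [5, 7, 9],               # filter kernel size, first hidden layer
    [3, 5, 7, 9, 11, 13],    # filter kernel size, second hidden layer
]

def cost(idx):
    """Toy stand-in for 1 - Az from training and testing a CNN: a smooth
    surface whose minimum sits near the middle of each parameter grid."""
    return sum((i - (len(g) - 1) / 2) ** 2 for i, g in zip(idx, GRID))

def neighbor(idx):
    """Move one randomly chosen parameter to an adjacent grid index."""
    new = list(idx)
    p = random.randrange(len(GRID))
    new[p] = min(max(new[p] + random.choice((-1, 1)), 0), len(GRID[p]) - 1)
    return tuple(new)

def anneal(t0=5.0, steps=2000):
    """Simulated annealing with a Boltzmann schedule, T_k = t0 / ln(1 + k)."""
    idx = tuple(random.randrange(len(g)) for g in GRID)
    best, best_cost = idx, cost(idx)
    for k in range(1, steps + 1):
        t = t0 / math.log(1 + k)
        cand = neighbor(idx)
        delta = cost(cand) - cost(idx)
        # Always accept downhill moves; accept uphill moves with
        # probability exp(-delta / T) so the walk can escape local minima.
        if delta <= 0 or random.random() < math.exp(-delta / t):
            idx = cand
        if cost(idx) < best_cost:
            best, best_cost = idx, cost(idx)
    return best, best_cost

best_idx, best_cost = anneal()
best_arch = [g[i] for g, i in zip(GRID, best_idx)]
print(best_arch, best_cost)
```

The slow logarithmic decay of the Boltzmann schedule is what makes it "moderately fast": the temperature stays high long enough for a thorough exploration before the search settles into a basin.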
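The GA's best-performing parent selection, linearly scaled roulette-wheel selection, can likewise be sketched in a few lines. The linear scaling below follows the common Goldberg form (scaled mean equals the raw mean, scaled best equals c times the mean); the scaling constant c = 2.0 and the example Az values are assumptions for illustration, not figures from the paper.

```python
import random

random.seed(1)

def linear_scale(fitnesses, c=2.0):
    """Linearly scale raw fitness values so that the scaled mean equals
    the raw mean and the best individual gets c * mean, limiting early
    takeover of the population by one very fit parent."""
    f_avg = sum(fitnesses) / len(fitnesses)
    f_max = max(fitnesses)
    if f_max == f_avg:            # all equal: scaling is a no-op
        return list(fitnesses)
    a = (c - 1.0) * f_avg / (f_max - f_avg)
    b = f_avg * (1.0 - a)
    # Clamp at zero in case the worst individual scales negative.
    return [max(a * f + b, 0.0) for f in fitnesses]

def roulette_select(population, fitnesses):
    """Pick one parent with probability proportional to scaled fitness."""
    scaled = linear_scale(fitnesses)
    r = random.uniform(0.0, sum(scaled))
    acc = 0.0
    for individual, s in zip(population, scaled):
        acc += s
        if acc >= r:
            return individual
    return population[-1]

# Hypothetical candidate architectures with made-up Az values.
pop = ["arch-A", "arch-B", "arch-C", "arch-D"]
fit = [0.92, 0.85, 0.80, 0.75]
parent = roulette_select(pop, fit)
print(parent)
```

Without scaling, raw Az values this close together would make selection nearly uniform; the linear scaling stretches the differences so that better architectures are chosen as parents noticeably more often.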