Selection of an optimal neural network architecture for computer-aided detection of microcalcifications - Comparison of automated optimization techniques

Citation
M.N. Gurcan et al., Selection of an optimal neural network architecture for computer-aided detection of microcalcifications - Comparison of automated optimization techniques, MED PHYS, 28(9), 2001, pp. 1937-1948
Number of citations
26
Subject categories
Radiology, Nuclear Medicine & Imaging; Medical Research Diagnosis & Treatment
Journal title
MEDICAL PHYSICS
ISSN journal
0094-2405
Volume
28
Issue
9
Year of publication
2001
Pages
1937 - 1948
Database
ISI
SICI code
0094-2405(200109)28:9<1937:SOAONN>2.0.ZU;2-2
Abstract
Many computer-aided diagnosis (CAD) systems use neural networks (NNs) for either detection or classification of abnormalities. Currently, most NNs are "optimized" by manual search in a very limited parameter space. In this work, we evaluated the use of automated optimization methods for selecting an optimal convolution neural network (CNN) architecture. Three automated methods, the steepest descent (SD), the simulated annealing (SA), and the genetic algorithm (GA), were compared. We used as an example the CNN that classifies true and false microcalcifications detected on digitized mammograms by a prescreening algorithm. Four parameters of the CNN architecture were considered for optimization: the numbers of node groups and the filter kernel sizes in the first and second hidden layers, resulting in a search space of 432 possible architectures. The area A_z under the receiver operating characteristic (ROC) curve was used to design a cost function. The SA experiments were conducted with four different annealing schedules. Three different parent selection methods were compared for the GA experiments. An available data set was split into two groups with approximately equal numbers of samples. By using the two groups alternately for training and testing, two different cost surfaces were evaluated. For the first cost surface, the SD method was trapped in a local minimum 91% (392/432) of the time. The SA using the Boltzmann schedule selected the best architecture after evaluating, on average, 167 architectures. The GA achieved its best performance with linearly scaled roulette-wheel parent selection; however, it evaluated 391 different architectures, on average, to find the best one. The second cost surface contained no local minimum. For this surface, a simple SD algorithm could quickly find the global minimum, but the SA with the very fast reannealing schedule was still the most efficient. The same SA scheme, however, was trapped in a local minimum on the first cost surface. Our CNN study demonstrated that, if optimization is to be performed on a cost surface whose characteristics are not known a priori, it is advisable that a moderately fast algorithm such as an SA using a Boltzmann cooling schedule be used to conduct an efficient and thorough search, which may offer a better chance of reaching the global minimum. (C) 2001 American Association of Physicists in Medicine.
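
The abstract describes a discrete search over 432 candidate CNN architectures, driven by a cost derived from the ROC area A_z, in which simulated annealing with a Boltzmann cooling schedule proved the most robust strategy. The sketch below illustrates that kind of search; it is not the authors' code. The parameter grids (chosen only so they multiply out to 432 combinations), the evaluate_az() stub, and the schedule constants T0 and n_steps are illustrative assumptions.

```python
"""Minimal sketch: simulated annealing with a Boltzmann cooling schedule
over a discrete CNN-architecture space, using cost = 1 - A_z."""
import math
import random

# Four architecture parameters considered in the paper. The candidate
# values below are assumptions; they are chosen only so the grid holds
# 4 * 3 * 6 * 6 = 432 combinations, matching the size quoted above.
PARAM_GRID = {
    "groups_layer1": [2, 4, 6, 8],
    "groups_layer2": [2, 4, 6],
    "kernel_layer1": [3, 5, 7, 9, 11, 13],
    "kernel_layer2": [3, 5, 7, 9, 11, 13],
}


def evaluate_az(arch):
    """Placeholder for training the candidate CNN and measuring A_z.

    In the study this meant training on one half of the data and testing
    on the other; here a deterministic dummy value stands in for it.
    """
    rng = random.Random(hash(tuple(sorted(arch.items()))))
    return 0.7 + 0.25 * rng.random()


def cost(arch):
    # Lower cost corresponds to a larger area under the ROC curve.
    return 1.0 - evaluate_az(arch)


def random_neighbor(arch):
    """Perturb one randomly chosen parameter to an adjacent grid value."""
    new = dict(arch)
    name = random.choice(list(PARAM_GRID))
    values = PARAM_GRID[name]
    i = values.index(new[name])
    j = max(0, min(len(values) - 1, i + random.choice([-1, 1])))
    new[name] = values[j]
    return new


def simulated_annealing(t0=0.1, n_steps=200):
    """Boltzmann schedule T_k = T_0 / ln(k + 1); worse moves are accepted
    with probability exp(-delta / T_k)."""
    current = {name: random.choice(vals) for name, vals in PARAM_GRID.items()}
    current_cost = cost(current)
    best, best_cost = dict(current), current_cost
    for k in range(1, n_steps + 1):
        temperature = t0 / math.log(k + 1)
        candidate = random_neighbor(current)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        if delta <= 0 or random.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = dict(current), current_cost
    return best, best_cost


if __name__ == "__main__":
    arch, c = simulated_annealing()
    print(f"best architecture: {arch}, estimated A_z = {1.0 - c:.3f}")
```

In the study each cost evaluation required training and testing a full CNN, so the number of architectures evaluated before reaching the optimum (on average 167 for the Boltzmann-schedule SA versus 391 for the GA on the first cost surface) is the relevant measure of efficiency rather than wall-clock time of the search logic itself.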