The objectives of this study were to design and test a fully automated method for classification of microcalcification clusters into malignant and benign types, and to compare the method's performance with that of radiologists. A novel aspect of the approach is that the relative location and orientation of clusters inside the breast were taken into account in feature calculation. Furthermore, the correspondence of cluster locations in mediolateral
oblique (MLO) and cranio-caudal (CC) views was used in feature calculation and in final classification. Initially, microcalcifications were automatically detected using a statistical method based on Bayesian techniques and a Markov random field model. To determine malignancy or benignancy of a
cluster, a method based on two classification steps was developed. In the first step, individual clusters were classified, and in the second step a patient-based classification was performed. A total of 16 features were used in
the study. To identify meaningful features, feature selection was applied, using the area under the receiver operating characteristic (ROC) curve (A(z) value) as the criterion. For classification, the k-nearest-neighbor method was used in a leave-one-patient-out procedure. A database of 192 mammograms with 280 true-positive detected microcalcification clusters was used for
evaluation of the method. The set consisted of cases that were selected for diagnostic work-up during a 4-year period of screening in the Nijmegen region (The Netherlands). Because of the high positive predictive value in the screening program (50%), this set did not contain obviously benign cases. The method's best patient-based performance on this set corresponded to A(z) = 0.83, using nine features. A subset of the data set, containing mammograms from 90 patients, was used to compare the computer results with radiologists' performance. Ten radiologists read these cases on a light box and
assessed the probability of malignancy for each patient. All participants had experience in clinical mammography and participated in our observer study during the last 2 days of a 2-week training session leading to screening
mammography certification. Results on the subset showed that the method's performance (A(z) = 0.83) was considerably higher than that of the radiologists (A(z) = 0.63). © 2000 American Association of Physicists in Medicine. [S0094-2405(00)01011-7].
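
The evaluation scheme described above (k-nearest-neighbor classification in a leave-one-patient-out procedure, scored by the area under the ROC curve) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the two-dimensional toy features, the value k = 3, and in particular the rule that a patient's score is the maximum of that patient's cluster scores are all assumptions made for illustration; the abstract does not specify how the patient-based step combines cluster results. The area is computed nonparametrically (Mann-Whitney statistic), which may differ from the A(z) estimate used in the study.

```python
import math
import random


def knn_score(train_x, train_y, x, k):
    """Fraction of malignant labels among the k nearest training clusters."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney statistic (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def leave_one_patient_out(features, labels, patients, k=3):
    """Score each cluster with a k-NN trained only on other patients' clusters."""
    scores = []
    for i, x in enumerate(features):
        keep = [j for j in range(len(features)) if patients[j] != patients[i]]
        train_x = [features[j] for j in keep]
        train_y = [labels[j] for j in keep]
        scores.append(knn_score(train_x, train_y, x, k))
    return scores


def patient_level(scores, labels, patients):
    """Assumed aggregation rule: a patient's score is the maximum cluster score."""
    ids = sorted(set(patients))
    p_scores = [max(s for s, p in zip(scores, patients) if p == pid) for pid in ids]
    p_labels = [max(y for y, p in zip(labels, patients) if p == pid) for pid in ids]
    return p_scores, p_labels


def make_toy_data(n_patients=40, seed=0):
    """Synthetic stand-in data: one or two clusters per patient, two features each."""
    rng = random.Random(seed)
    features, labels, patients = [], [], []
    for pid in range(n_patients):
        label = pid % 2  # toy assumption: all clusters of a patient share one label
        for _ in range(rng.choice([1, 2])):
            features.append((rng.gauss(1.5 * label, 1.0), rng.gauss(1.5 * label, 1.0)))
            labels.append(label)
            patients.append(pid)
    return features, labels, patients


if __name__ == "__main__":
    X, y, pid = make_toy_data()
    cluster_scores = leave_one_patient_out(X, y, pid, k=3)
    p_scores, p_labels = patient_level(cluster_scores, y, pid)
    print("cluster-level area:", auc(cluster_scores, y))
    print("patient-level area:", auc(p_scores, p_labels))
```

Holding out all clusters of the test patient, rather than a single cluster, keeps clusters from the same patient (for example the MLO and CC views of one lesion) out of the training set, which is the point of a leave-one-patient-out procedure.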