This article describes the use of computer-based analytical techniques
to define nuclear size, shape, and texture features. These features are
then used to distinguish between benign and malignant breast cytology.
The benign and malignant cell samples used in this study were obtained
by fine needle aspiration (FNA) from a consecutive series of 569
patients: 212 with cancer and 357 with fibrocystic breast masses.
Regions of FNA preparations to be analyzed were converted by a video
camera to computer files that were displayed on a computer monitor. Nuclei
to be analyzed were roughly outlined by an operator using a mouse.
Next, the computer generated a "snake" that precisely enclosed each
designated nucleus. The computer calculated 10 features for each nucleus.
The ability to correctly classify samples as benign or malignant on
the basis of these features was determined by inductive machine
learning and logistic regression. Cross-validation was used to test the
validity of the predicted diagnosis. The logistic regression
cross-validated classification accuracy was 96.2% and the inductive
machine learning
cross-validated classification accuracy was 97.5%. Our computerized
system provides a probability that a sample is malignant. Should this
probability fall between 30% and 70%, the sample is considered
"suspicious," in the same way a visually graded FNA may be termed
suspicious. All of the 128 consecutive cases obtained since the
introduction of
this system were correctly diagnosed, but nine benign aspirates fell
into the suspicious category. Fifty-seven FNAs were obtained that had
been visually diagnosed elsewhere by others as "suspicious." Eleven
(19.3%) were similarly classified as suspicious by the computer, but
84.8% of the remaining samples were correctly diagnosed. The methods
described in this article will provide the basis for computerized systems
to diagnose breast cytology. Copyright (C) 1995 by W.B. Saunders
Company
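
The decision rule described above, in which a predicted probability of malignancy between 30% and 70% is reported as "suspicious," can be illustrated with a minimal sketch. The function names `triage` and `percent` are hypothetical and do not come from the article. Treating the 30% and 70% endpoints as part of the suspicious band is an assumption, since the abstract does not say whether the band is inclusive. The second helper only re-derives the 19.3% figure from the counts given in the abstract.

```python
def triage(p_malignant, low=0.30, high=0.70):
    """Map a predicted probability of malignancy to a report category.

    Assumption: the 30%-70% band is inclusive at both ends; the
    abstract does not specify how the endpoints are handled.
    """
    if not 0.0 <= p_malignant <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if low <= p_malignant <= high:
        return "suspicious"
    return "malignant" if p_malignant > high else "benign"


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


if __name__ == "__main__":
    # Hypothetical probabilities, chosen only to exercise each branch.
    for p in (0.05, 0.55, 0.92):
        print(p, triage(p))

    # Counts taken from the abstract: 11 of 57 visually "suspicious"
    # FNAs were also classified as suspicious by the computer.
    print(percent(11, 57))   # 19.3
    print(57 - 11)           # 46 remaining samples
```

Under these assumptions, only samples whose probability lies outside the band receive a benign or malignant label, which mirrors how a visually graded FNA may be reported as suspicious rather than forced into one of the two diagnoses.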