S.S. Cross et al., EVALUATION OF A STATISTICALLY DERIVED DECISION TREE FOR THE CYTODIAGNOSIS OF FINE-NEEDLE ASPIRATES OF THE BREAST (FNAB), Cytopathology, 9(3), 1998, pp. 178-187
A decision tree for the diagnosis of FNAB was derived from defined
human observations using a rule induction method, C4.5 (a derivative
of the ID3 algorithm). This algorithm is an implementation of the
top-down induction method, in which the tree is built iteratively by
adding those nodes and branches that maximize the information gain at
each step. The tree was derived from a training set of 200 FNAB with
known outcome using 10 defined features (from one observer) and
patient age. The tree contained a total of seven nodes (six observable
features and patient age) with eight endpoints (four benign, four
malignant). The tree was applied to a test set of 400 further FNAB
with observations from the training observer and produced a
sensitivity of 95%, specificity of 93% and a positive predictive value
(PPV) of a malignant result of 89%. Four trainee pathologists were
given a training session on the observable features and then used the
tree to determine outcome in a further 50 FNAB. The observers were
blind to clinical details apart from age, and the endpoints were coded
with letters and not labelled benign or malignant. The results from
these observers produced ranges of sensitivity 80-96%, specificity
64-92%, PPV 73-92% and kappa statistics (with known outcome) 0.6-0.8.
Reported difficulties in using the tree included estimation of nuclear
size. These results were worse than the performance of the observers
on a further 50 cases without using the decision tree (sensitivity
80-100%, specificity 72-100%, PPV 78-100%, kappa 0.72-0.92). The
original 50-case test set was rerandomized and the four trainee
observers made all 10 defined observations on each specimen without
using the decision tree; these observations were then used to derive
decisions from the tree. The performance from this method was similar
to that using selected features from the tree, suggesting that
observation of all features together does not improve the reliability
of each specific observation. The poor performance of this tree
suggests that this methodology may be unsuitable for producing
decision support aids for diagnostic or training purposes in this
domain. (C) 1998 Blackwell Science Ltd.
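
The abstract describes top-down induction as repeatedly choosing the split that maximizes information gain. A minimal sketch of that selection step is given here in Python. It is not the authors' implementation: the feature names (nuclear_size, cohesion) and the four toy cases are hypothetical, and C4.5 itself adds refinements such as the gain ratio criterion, handling of continuous attributes like age, and pruning, all of which are omitted.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting on one categorical feature."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_feature(rows, labels, features):
    """Feature with the highest information gain (the next node to add)."""
    return max(features, key=lambda f: information_gain(rows, labels, f))


# Hypothetical toy cases, not data from the study.
rows = [
    {"nuclear_size": "large", "cohesion": "low"},
    {"nuclear_size": "large", "cohesion": "high"},
    {"nuclear_size": "small", "cohesion": "low"},
    {"nuclear_size": "small", "cohesion": "high"},
]
labels = ["malignant", "malignant", "benign", "benign"]

chosen = best_feature(rows, labels, ["nuclear_size", "cohesion"])
print(chosen)
```

In this toy set nuclear_size separates the two classes completely while cohesion carries no information, so the first is selected as the root. A full tree would apply the same step recursively to each resulting subset.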
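
The performance figures quoted in the abstract (sensitivity, specificity, PPV and kappa against known outcome) can all be computed from a 2x2 table of tree decision versus known outcome. The sketch below shows the standard definitions, with kappa taken to be Cohen's kappa, which is an assumption since the abstract does not name the variant. The counts are hypothetical and chosen only to total 50 cases; they are not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Summary statistics for a binary test against known outcome.

    tp: malignant cases called malignant
    fp: benign cases called malignant
    fn: malignant cases called benign
    tn: benign cases called benign
    """
    total = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)

    # Cohen's kappa: agreement beyond that expected by chance.
    observed = (tp + tn) / total
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total**2
    kappa = (observed - expected) / (1 - expected)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "kappa": kappa,
    }


# Hypothetical counts for a 50-case set, not data from the study.
metrics = diagnostic_metrics(tp=18, fp=2, fn=2, tn=28)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Kappa is reported alongside the other three figures because it corrects raw agreement for the agreement expected by chance given the marginal totals.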