Three-way ROC surfaces are based on a generalization of dichotomous ROC
analysis to three-class diagnostic tests. The discriminatory power of
three-class diagnostic tests is measured by the volume under the ROC
surface. This
measure can be given a probabilistic interpretation similar to the
equivalence of the c-index to the area under the ROC curve. This article
presents a
method to calculate nonparametric estimates of the variance of the volume
under the surface using Mann-Whitney U statistics. As a simple extension of
this result, it is possible to calculate covariance estimates for the
volume under the surface. This allows the statistical comparison of two
tests used for diagnostic tasks with three possible outcomes. The formulas
derived
are validated on synthetic data and applied to a three-class data set of
pigmented skin lesions. It is shown that a neural network algorithm trained
on clinical data and lesion features performs better than one trained on
only the lesion features.
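
The probabilistic interpretation of the volume under the surface can be illustrated with a minimal sketch. It assumes a simplified setting that may differ from the article's own estimator: each case has a single scalar test score, and scores tend to increase from class 1 to class 3. Under that assumption, the volume is estimated as the proportion of triples (one case drawn from each class) whose scores are in the correct order, in the same way that the Mann-Whitney statistic estimates the area under the ROC curve from pairs. The function name `vus_estimate` and its arguments are hypothetical, and ties are counted as incorrectly ordered.

```python
import numpy as np

def vus_estimate(x1, x2, x3):
    """Proportion of triples (one score per class) with x1 < x2 < x3.

    Illustrative sketch only: assumes scalar scores that increase
    from class 1 to class 3; ties count as incorrectly ordered.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    x3 = np.asarray(x3, dtype=float)
    # lower[j]: number of class-1 scores strictly below x2[j]
    lower = (x1[:, None] < x2[None, :]).sum(axis=0)
    # upper[j]: number of class-3 scores strictly above x2[j]
    upper = (x3[:, None] > x2[None, :]).sum(axis=0)
    # Each class-2 score contributes lower[j] * upper[j] correctly ordered triples
    n_triples = len(x1) * len(x2) * len(x3)
    return float((lower * upper).sum()) / n_triples
```

For example, `vus_estimate([1, 2], [1.5, 3], [2.5, 4])` counts 4 correctly ordered triples out of 8. For a test whose scores carry no information about the class, each of the six possible orderings of a triple is equally likely, so the expected value is 1/6 rather than the 1/2 familiar from two-class ROC analysis.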