THE USE OF THE AREA UNDER THE ROC CURVE IN THE EVALUATION OF MACHINE LEARNING ALGORITHMS

Authors

BRADLEY AP

Citation

A. P. Bradley, The use of the area under the ROC curve in the evaluation of machine learning algorithms, Pattern Recognition, 30(7), 1997, pp. 1145-1159

Citations number

Subject categories

Computer Sciences, Special Topics; Engineering, Electrical & Electronic; Computer Science, Artificial Intelligence

Journal title

Pattern Recognition

ISSN journal

0031-3203

Volume

30

Issue

7

Year of publication

1997

Pages

1145 - 1159

Database

ISI

SICI code

0031-3203(1997)30:7<1145:TUOTAU>2.0.ZU;2-L

Abstract

In this paper we investigate the use of the area under the receiver operating characteristic (ROC) curve (AUC) as a performance measure for machine learning algorithms. As a case study we evaluate six machine learning algorithms (C4.5, Multiscale Classifier, Perceptron, Multi-layer Perceptron, k-Nearest Neighbours, and a Quadratic Discriminant Function) on six "real world" medical diagnostics data sets. We compare and discuss the use of AUC to the more conventional overall accuracy and find that AUC exhibits a number of desirable properties when compared to overall accuracy: increased sensitivity in Analysis of Variance (ANOVA) tests; a standard error that decreased as both AUC and the number of test samples increased; decision threshold independent; and it is invariant to a priori class probabilities. The paper concludes with the recommendation that AUC be used in preference to overall accuracy for "single number" evaluation of machine learning algorithms. (C) 1997 Pattern Recognition Society.
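
Illustrative note: two of the properties named in the abstract (decision threshold independence and invariance to a priori class probabilities) can be seen on a toy example. The sketch below is not taken from the paper. It assumes the standard rank-based reading of AUC (the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half), and the function names `auc` and `accuracy` as well as the scores and labels are invented for illustration.

```python
def auc(scores, labels):
    """Rank-based AUC: share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(scores, labels, threshold):
    """Overall accuracy when scores >= threshold are classified as positive."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical classifier outputs (made-up data, not from the paper).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

# Accuracy depends on the chosen decision threshold; AUC uses no threshold.
print(accuracy(scores, labels, 0.5))   # 0.75
print(accuracy(scores, labels, 0.85))  # 0.625
print(auc(scores, labels))             # 0.8125

# Changing the class prior: repeat every negative case once more.
neg_scores = [s for s, y in zip(scores, labels) if y == 0]
more_scores = scores + neg_scores
more_labels = labels + [0] * len(neg_scores)

print(accuracy(more_scores, more_labels, 0.85))  # 0.75 (changed)
print(auc(more_scores, more_labels))             # 0.8125 (unchanged)
```

In this toy case the accuracy moves with both the threshold and the class mix, while the AUC stays at 0.8125. The pairwise loop is quadratic in the number of cases and is chosen for readability only; a sort-based computation would be preferable on large test sets.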