CLASSIFICATION-ALGORITHM EVALUATION - 5 PERFORMANCE-MEASURES BASED ON CONFUSION MATRICES

Authors

FORBES AD

Citation

A.D. Forbes, CLASSIFICATION-ALGORITHM EVALUATION - 5 PERFORMANCE-MEASURES BASED ON CONFUSION MATRICES, Journal of clinical monitoring, 11(3), 1995, pp. 189-206

Citations number

Subject Categories

Medical Laboratory Technology

Journal title

Journal of clinical monitoring

ISSN journal

0748-1977

Volume

11

Issue

3

Year of publication

1995

Pages

189 - 206

Database

ISI

SICI code

0748-1977(1995)11:3<189:CE-5PB>2.0.ZU;2-G

Abstract

Objective. The objective of this paper is to introduce, explain, and extend methods for comparing the performance of classification algorithms using error tallies obtained on properly sized, populated, and labeled data sets. Methods. Two distinct contexts of classification are defined, involving "objects-by-inspection" and "objects-by-segmentation." In the former context, the total number of objects to be classified is unambiguously and self-evidently defined. In the latter, there is troublesome ambiguity. All five of the measures of performance here considered are based on confusion matrices, tables of counts revealing the extent of an algorithm's "confusion" regarding the true classifications. A proper measure of classification-algorithm performance must meet four requirements. A proper measure should obey six additional constraints. Results. Four traditional measures of performance are critiqued in terms of the requirements and constraints. Each measure meets the requirements, but fails to obey at least one of the constraints. A nontraditional measure of algorithm performance, the normalized mutual information (NMI), is therefore introduced. Based on the NMI, methods for comparing algorithm performance using confusion matrices are devised. Conclusions. The five performance measures lead to similar inferences when comparing a trio of QRS-detection algorithms using a large data set. The modified NMI is preferred, however, because it obeys each of the constraints and is the most conservative measure of performance.
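
Illustrative sketch (not part of the record). The abstract names the normalized mutual information but does not give its formula, so the following Python sketch assumes one common normalization: the mutual information between true and assigned classes, divided by the entropy of the true classes. The function name, the row/column convention, and the example counts are hypothetical, and the "modified NMI" preferred in the paper is not reproduced here.

```python
import math


def normalized_mutual_information(confusion):
    """Return I(true; assigned) / H(true) for a confusion matrix.

    Assumed convention: confusion[i][j] is the number of objects whose
    true class is i and which the algorithm assigned to class j.
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no objects")

    row_sums = [sum(row) for row in confusion]        # true-class totals
    col_sums = [sum(col) for col in zip(*confusion)]  # assigned-class totals

    # Entropy of the true-class distribution (natural logarithm).
    h_true = -sum((r / total) * math.log(r / total) for r in row_sums if r > 0)
    if h_true == 0:
        raise ValueError("only one true class is present; the ratio is undefined")

    # Mutual information between true and assigned classes.
    mutual_info = 0.0
    for i, row in enumerate(confusion):
        for j, count in enumerate(row):
            if count > 0:
                mutual_info += (count / total) * math.log(
                    count * total / (row_sums[i] * col_sums[j])
                )

    return mutual_info / h_true


if __name__ == "__main__":
    # Hypothetical tallies for a two-class problem.
    example = [[90, 10],
               [5, 95]]
    print(normalized_mutual_information(example))
```

Under this assumed definition, a purely diagonal confusion matrix scores 1, and a matrix whose assigned classes are statistically independent of the true classes scores 0.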