THE EFFECT OF DISEASE-PREVALENCE ADJUSTMENTS ON THE ACCURACY OF A LOGISTIC PREDICTION MODEL

Authors

MORISE AP; DIAMOND GA; DETRANO R; BOBBIO M; GUNEL E

Citation

A.P. Morise et al., THE EFFECT OF DISEASE-PREVALENCE ADJUSTMENTS ON THE ACCURACY OF A LOGISTIC PREDICTION MODEL, Medical Decision Making, 16(2), 1996, pp. 133-142

Subject Categories

Medical Informatics

Journal title

Medical decision making

ISSN journal

0272-989X

Volume

16

Issue

2

Year of publication

1996

Pages

133 - 142

Database

ISI

SICI code

0272-989X(1996)16:2<133:TEODAO>2.0.ZU;2-Q

Abstract

The accuracy of a logistic prediction model is degraded when it is transported to populations with outcome prevalences different from that of the population used to derive the model. The resultant errors can have major clinical implications. Accordingly, the authors developed a logistic prediction model with respect to the noninvasive diagnosis of coronary disease based on 1,824 patients who underwent exercise testing and coronary angiography, varied the prevalence of disease in various "test" populations by random sampling of the original "derivation" population, and determined the accuracy of the logistic prediction model before and after the application of a mathematical algorithm designed to adjust only for these differences in prevalence. The accuracy of each prediction model was quantified in terms of receiver operating characteristic (ROC) curve area (discrimination) and chi-square goodness-of-fit (calibration). As the prevalence of the test population diverged from the prevalence of the derivation population, discrimination improved (ROC-curve areas increased from 0.82 ± 0.02 to 0.87 ± 0.03; p < 0.05), and calibration deteriorated (chi-square goodness-of-fit statistics increased from 9 to 154; p < 0.05). Following adjustment of the logistic intercept for differences in prevalence, discrimination was unchanged and calibration improved (maximum chi-square goodness-of-fit fell from 154 to 16). When the adjusted algorithm was applied to three geographically remote populations with prevalences that differed from that of the derivation population, calibration improved 87%, while discrimination fell by 1%. Thus, prevalence differences produce statistically significant and potentially clinically important errors in the accuracy of logistic prediction models. These errors can potentially be mitigated by use of a relatively simple mathematical correction algorithm.
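
The abstract does not state the correction formula. As a rough illustration only, the Python sketch below uses one widely used form of intercept adjustment: the intercept is shifted by the difference between the log-odds of the target prevalence and the log-odds of the derivation prevalence. This is an assumption, not confirmed to be the authors' algorithm, and the function names, coefficients, and prevalences are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def adjust_intercept(intercept, derivation_prevalence, target_prevalence):
    """Shift the logistic intercept by the change in prevalence log-odds.

    Assumed form of the correction; the abstract does not give the formula.
    """
    return intercept + logit(target_prevalence) - logit(derivation_prevalence)


def predict(intercept, coefficients, features):
    """Predicted probability from a logistic model."""
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical model and prevalences, chosen only for illustration.
intercept = -1.0
coefficients = [0.8, 1.5]
derivation_prevalence = 0.5
target_prevalence = 0.2

adjusted = adjust_intercept(intercept, derivation_prevalence, target_prevalence)

patients = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
before = [predict(intercept, coefficients, x) for x in patients]
after = [predict(adjusted, coefficients, x) for x in patients]

for b, a in zip(before, after):
    print(f"unadjusted={b:.3f}  adjusted={a:.3f}")
```

Under this assumed form every patient's log-odds moves by the same constant, so the ranking of patients by predicted probability is preserved. That is consistent with the abstract's report that discrimination was unchanged after adjustment in the resampled test populations while calibration improved.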