CROSS-VALIDATION PERFORMANCE OF MORTALITY PREDICTION MODELS

Citation
D.C. Hadorn et al., CROSS-VALIDATION PERFORMANCE OF MORTALITY PREDICTION MODELS, Statistics in Medicine, 11(4), 1992, pp. 475-489
Citations number
NO
Journal title
STATISTICS IN MEDICINE
ISSN journal
02776715
Volume
11
Issue
4
Year of publication
1992
Pages
475 - 489
Database
ISI
SICI code
0277-6715(1992)11:4<475:CPOMPM>2.0.ZU;2-J
Abstract
Mortality prediction models hold substantial promise as tools for patient management, quality assessment, and, perhaps, health care resource allocation planning. Yet relatively little is known about the predictive validity of these models. We report here a comparison of the cross-validation performance of seven statistical models of patient mortality: (1) ordinary-least-squares (OLS) regression predicting 0/1 death status six months after admission; (2) logistic regression; (3) Cox regression; (4-6) three unit-weight models derived from the logistic regression; and (7) a recursive partitioning classification technique (CART). We calculated the following performance statistics for each model in both a learning and test sample of patients, all of whom were drawn from a nationally representative sample of 2558 Medicare patients with acute myocardial infarction: overall accuracy in predicting six-month mortality, sensitivity and specificity rates, positive and negative predictive values, and per cent improvement in accuracy rates and error rates over model-free predictions (i.e., predictions that make no use of available independent variables). We developed ROC curves based on logistic regression, the best unit-weight model, the single best predictor variable, and a series of CART models generated by varying the misclassification cost specifications. In our sample, the models reduced model-free error rates at the patient level by 8-22 per cent in the test sample. We found that the performance of the logistic regression models was marginally superior to that of other models. The areas under the ROC curves for the best models ranged from 0.61 to 0.63. Overall predictive accuracy for the best models may be adequate to support activities such as quality assessment that involve aggregating over large groups of patients, but the extent to which these models may be appropriately applied to patient-level resource allocation planning is less clear.
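
Illustrative computation (not the authors' code): the performance statistics named in the abstract can be reproduced with a minimal Python sketch that fits a logistic regression on a learning sample and reports test-sample sensitivity, specificity, positive and negative predictive values, per cent error-rate reduction over a model-free (majority-class) prediction, and the area under the ROC curve. All data, coefficients, and variable names below are illustrative assumptions, not the study's actual predictors.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import confusion_matrix, roc_auc_score
    from sklearn.model_selection import train_test_split

    # Synthetic data standing in for the 2558-patient AMI sample.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(2558, 5))                   # hypothetical predictors
    logit = X @ np.array([1.0, 0.8, 0.6, 0.4, 0.2])  # assumed coefficients
    y = rng.binomial(1, 1 / (1 + np.exp(-logit)))    # 0/1 six-month death status

    # Split into a learning sample and a test sample.
    X_learn, X_test, y_learn, y_test = train_test_split(
        X, y, test_size=0.5, random_state=0)

    model = LogisticRegression().fit(X_learn, y_learn)
    pred = model.predict(X_test)              # 0/1 predictions, 0.5 threshold
    prob = model.predict_proba(X_test)[:, 1]  # predicted mortality probability

    tn, fp, fn, tp = confusion_matrix(y_test, pred).ravel()
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)                      # positive predictive value
    npv = tn / (tn + fn)                      # negative predictive value
    accuracy = (tp + tn) / len(y_test)

    # Model-free prediction: always predict the majority class.
    model_free_error = 1 - max(y_test.mean(), 1 - y_test.mean())
    error_reduction = (model_free_error - (1 - accuracy)) / model_free_error

    print(f"sensitivity={sensitivity:.2f}  specificity={specificity:.2f}")
    print(f"PPV={ppv:.2f}  NPV={npv:.2f}  accuracy={accuracy:.2f}")
    print(f"error-rate reduction over model-free = {100 * error_reduction:.0f}%")
    print(f"ROC AUC = {roc_auc_score(y_test, prob):.2f}")

Varying the 0.5 decision threshold (or, as the abstract describes for CART, the misclassification cost specification) traces out the ROC curve whose area the study reports as 0.61-0.63 for its best models.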