Multiple imputation in practice: Comparison of software packages for regression models with missing variables

Citation
N.J. Horton and S.R. Lipsitz, Multiple imputation in practice: Comparison of software packages for regression models with missing variables, AM STATISTN, 55(3), 2001, pp. 244-254
Number of citations
41
Subject categories
Mathematics
Journal title
AMERICAN STATISTICIAN
Journal ISSN
0003-1305
Volume
55
Issue
3
Year of publication
2001
Pages
244 - 254
Database
ISI
SICI code
0003-1305(200108)55:3<244:MIIPCO>2.0.ZU;2-I
Abstract
Missing data frequently complicates data analysis for scientific investigations. The development of statistical methods to address missing data has been an active area of research in recent decades. Multiple imputation, originally proposed by Rubin in a public use dataset setting, is a general purpose method for analyzing datasets with missing data that is broadly applicable to a variety of missing data settings. We review multiple imputation as an analytic strategy for missing data. We describe and evaluate a number of software packages that implement this procedure, and contrast the interface, features, and results. We compare the packages, and detail shortcomings and useful features. The comparisons are illustrated using examples from an artificial dataset and a study of child psychopathology. We suggest additional features as well as discuss limitations and cautions to consider when using multiple imputation as an analytic strategy for incomplete data settings.