MAXIMIZING THE USEFULNESS OF DATA OBTAINED WITH PLANNED MISSING VALUE PATTERNS - AN APPLICATION OF MAXIMUM-LIKELIHOOD PROCEDURES

Authors

GRAHAM JW, HOFER SM, MACKINNON DP

Citation

J.W. Graham et al., MAXIMIZING THE USEFULNESS OF DATA OBTAINED WITH PLANNED MISSING VALUE PATTERNS - AN APPLICATION OF MAXIMUM-LIKELIHOOD PROCEDURES, Multivariate Behavioral Research, 31(2), 1996, pp. 197-218

Subject Categories

Social Sciences, Mathematical Methods; Psychology, Experimental; Statistics & Probability; Mathematics, Miscellaneous

Journal title

Multivariate Behavioral Research

ISSN journal

0027-3171

Volume

31

Issue

2

Year of publication

1996

Pages

197 - 218

Database

ISI

SICI code

0027-3171(1996)31:2<197:MTUODO>2.0.ZU;2-J

Abstract

Researchers often face a dilemma: Should they collect little data and emphasize quality, or much data at the expense of quality? The utility of the 3-form design coupled with maximum likelihood methods for estimation of missing values was evaluated. In 3-form design surveys, four sets of items, X, A, B, and C, are administered: Each third of the subjects receives X and one combination of two other item sets - AB, BC, or AC. Variances and covariances were estimated with pairwise deletion, mean replacement, single imputation, multiple imputation, raw data maximum likelihood, multiple-group covariance structure modeling, and Expectation-Maximization (EM) algorithm estimation. The simulation demonstrated that maximum likelihood estimation and multiple imputation methods produce the most efficient and least biased estimates of variances and covariances for normally distributed and slightly skewed data when data are missing completely at random (MCAR). Pairwise deletion provided equally unbiased estimates but was less efficient than ML procedures. Further simulation results demonstrated that non-maximum likelihood methods break down when data are not missing completely at random. Application of these methods with empirical drug use data resulted in similar covariance matrices for pairwise and EM estimation; however, ML estimation produced better and more efficient regression estimates. Maximum likelihood estimation or multiple imputation procedures, which are now becoming more readily available, are always recommended. In order to maximize the efficiency of the ML parameter estimates, it is recommended that scale items be split across forms rather than being left intact within forms.
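The 3-form layout described in the abstract can be illustrated with a minimal sketch. It is not the authors' simulation: the sample size, the simulated scores, and all function names are hypothetical, and each item set is collapsed to a single score for brevity. The sketch only shows how the planned missingness arises and how many cases pairwise deletion leaves for each covariance.

```python
import random

# Each form carries X plus two of the three remaining item sets,
# so every subject is missing exactly one of A, B, C by design.
FORMS = {1: ("X", "A", "B"), 2: ("X", "B", "C"), 3: ("X", "A", "C")}
ITEM_SETS = ("X", "A", "B", "C")


def simulate(n_subjects=900, seed=1):
    """Hypothetical complete data: one score per item set, correlated
    through a shared factor (an illustrative assumption)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_subjects):
        common = rng.gauss(0, 1)
        rows.append({k: common + rng.gauss(0, 1) for k in ITEM_SETS})
    return rows


def assign_forms(n_subjects, seed=0):
    """Randomly assign subjects to the three forms in equal thirds."""
    rng = random.Random(seed)
    forms = [(i % 3) + 1 for i in range(n_subjects)]
    rng.shuffle(forms)
    return forms


def apply_design(complete_rows, forms):
    """Set to None the item set that a subject's form does not include."""
    return [
        {k: (row[k] if k in FORMS[form] else None) for k in ITEM_SETS}
        for row, form in zip(complete_rows, forms)
    ]


def pairwise_cov(rows, a, b):
    """Pairwise-deletion covariance: use only subjects observed on both."""
    pairs = [(r[a], r[b]) for r in rows if r[a] is not None and r[b] is not None]
    n = len(pairs)
    mean_a = sum(p[0] for p in pairs) / n
    mean_b = sum(p[1] for p in pairs) / n
    cov = sum((p[0] - mean_a) * (p[1] - mean_b) for p in pairs) / (n - 1)
    return cov, n


rows = simulate()
forms = assign_forms(len(rows))
observed = apply_design(rows, forms)
cov_xa, n_xa = pairwise_cov(observed, "X", "A")  # two of three forms contribute
cov_ab, n_ab = pairwise_cov(observed, "A", "B")  # only one form contributes
print(n_xa, n_ab)  # prints "600 300"
```

Because form assignment is random, the missingness in this sketch is MCAR by construction. The smaller case count behind the A-B covariance is one way to see why pairwise deletion, although unbiased under MCAR, is reported as less efficient than the ML procedures.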