Identifiability assumptions for missing covariate data in failure time regression models

Citation
Rathouz, Paul J., Identifiability assumptions for missing covariate data in failure time regression models, Biostatistics (Oxford. Print), 8(2), 2007, pp. 345-356
ISSN journal
14654644
Volume
8
Issue
2
Year of publication
2007
Pages
345 - 356
Database
ACNP
Abstract
Methods in the literature for missing covariate data in survival models have relied on the missing at random (MAR) assumption to render regression parameters identifiable. MAR means that missingness can depend on the observed exit time, and whether or not that exit is a failure or a censoring event. By considering ways in which missingness of covariate X could depend on the true but possibly censored failure time T and the true censoring time C, we attempt to identify missingness mechanisms which would yield MAR data. We find that, under various reasonable assumptions about how missingness might depend on T and/or C, additional strong assumptions are needed to obtain MAR. We conclude that MAR is difficult to justify in practical applications. One exception arises when missingness is independent of T, and C is independent of the value of the missing X. As alternatives to MAR, we propose two new missingness assumptions. In one, the missingness depends on T but not on C; in the other, the situation is reversed. For each, we show that the failure time model is identifiable. When missingness is independent of T, we show that the naive complete record analysis will yield a consistent estimator of the failure time distribution. When missingness is independent of C, we develop a complete record likelihood function and a corresponding estimator for parametric failure time models. We propose analyses to evaluate the plausibility of either assumption in a particular data set, and illustrate the ideas using data from the literature on this problem.
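
The claim about the complete record analysis can be illustrated with a small simulation. The sketch below is not taken from the paper; it assumes an exponential failure time model with a single binary covariate X, a censoring time C drawn independently of both T and X, and a missingness probability for X that depends on C only. The function name, the parameter values (beta = 0.7, censoring rate 0.5) and the observation probabilities (0.9 and 0.4) are all hypothetical choices made for illustration. Under these assumptions, the estimate computed from complete records alone should land close to the true log rate ratio.

```python
import math
import random


def simulate_complete_record_estimate(n=200_000, beta=0.7, seed=0):
    """Complete-record estimate of the log rate ratio in a toy exponential model.

    Illustrative assumptions (not from the paper):
      - X is binary with P(X = 1) = 0.5
      - T | X is exponential with rate exp(beta * X)
      - C is exponential with rate 0.5, independent of T and X
      - X is observed with probability 0.9 if C > 1, else 0.4,
        so missingness depends on C but not on T or X
    """
    rng = random.Random(seed)
    events = [0, 0]          # observed failures per covariate group
    exposure = [0.0, 0.0]    # total follow-up time per covariate group

    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        t = rng.expovariate(math.exp(beta * x))  # true failure time
        c = rng.expovariate(0.5)                 # true censoring time

        # Missingness mechanism: depends on C only.
        p_observed = 0.9 if c > 1.0 else 0.4
        if rng.random() >= p_observed:
            continue  # X is missing, so the record is dropped

        y = min(t, c)                 # observed exit time
        delta = 1 if t <= c else 0    # 1 = failure, 0 = censored
        events[x] += delta
        exposure[x] += y

    # Exponential MLE of each group's rate: failures / total exposure.
    rate0 = events[0] / exposure[0]
    rate1 = events[1] / exposure[1]
    return math.log(rate1 / rate0)


if __name__ == "__main__":
    estimate = simulate_complete_record_estimate()
    print(f"true beta = 0.7, complete-record estimate = {estimate:.3f}")
```

Changing `p_observed` so that it depends on `t` instead of `c` gives a simple way to explore the other scenario described in the abstract, where the naive complete record analysis is not expected to suffice.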