Estimation of Regression Models for the Mean of Repeated Outcomes under Nonignorable Nonmonotone Nonresponse

Citation
Vansteelandt, Stijn et al., Estimation of Regression Models for the Mean of Repeated Outcomes under Nonignorable Nonmonotone Nonresponse, Biometrika , 94(4), 2007, pp. 841-860
Journal title
ISSN journal
00063444
Volume
94
Issue
4
Year of publication
2007
Pages
841 - 860
Database
ACNP
SICI code
Abstract
We propose a new class of models for making inference about the mean of a vector of repeated outcomes when the outcome vector is incompletely observed in some study units and missingness is nonmonotone. Each model in our class is indexed by a set of unidentified selectionbias functions which quantify the residual association of the outcome at each occasion t and the probability that this outcome is missing after adjusting for variables observed prior to time t and for the past nonresponse pattern. In particular, selection-bias functions equal to zero encode the investigator's a priori belief that nonresponse of the next outcome does not depend on that outcome after adjusting for the observed past. We call this assumption sequential explainability. Since each model in our class is nonparametric, it fits the data perfectly well. As such, our models are ideal for conducting sensitivity analyses aimed at evaluating the impact that different degrees of departure from sequential explainability have on inference about the marginal means of interest. Although the marginal means are identified under each of our models, their estimation is not feasible in practice because it requires the auxiliary estimation of conditional expectations and probabilities given high-dimensional variables. We henceforth discuss the estimation of the marginal means under each model in our class assuming, additionally, that at each occasion either one of the following two models holds: a parametric model for the conditional probability of nonresponse given current outcomes and past recorded data or a parametric model for the conditional mean of the outcome on the nonrespondents given the past recorded data. We call the resulting procedure $2^{T}-\text{multiply}$ robust as it protects at each of the T time points against misspecification of one of these two working models, although not against simultaneous misspecification of both. We extend our proposed class of models and estimators to incorporate data configurations which include baseline covariates and a parametric model for the conditional mean of the vector of repeated outcomes given the baseline covariates.