J. H. Albert and S. Chib, Bayesian Analysis of Binary and Polychotomous Response Data, Journal of the American Statistical Association, 88(422), 1993, pp. 669-679
A vast literature in statistics, biometrics, and econometrics is
concerned with the analysis of binary and polychotomous response data. The
classical approach fits a categorical response regression model using
maximum likelihood, and inferences about the model are based on the
associated asymptotic theory. The accuracy of classical confidence
statements is questionable for small sample sizes. In this article, exact
Bayesian methods for modeling categorical response data are developed
using the idea of data augmentation. The general approach can be
summarized as follows. The probit regression model for binary outcomes is
seen to have an underlying normal regression structure on latent
continuous data. Values of the latent data can be simulated from suitable
truncated normal distributions. If the latent data are known, then the
posterior distribution of the parameters can be computed using standard
results for normal linear models. Draws from this posterior are used to
sample new latent data, and the process is iterated with Gibbs sampling.
This data augmentation approach provides a general framework for
analyzing binary regression models. It leads to the same simplification
achieved earlier for censored regression models. Under the proposed
framework, the class of probit regression models can be enlarged by
using mixtures of normal distributions to model the latent data. In this
normal mixture class, one can investigate the sensitivity of the
parameter estimates to the choice of "link function," which relates the
linear regression estimate to the fitted probabilities. In addition,
this approach allows one to easily fit Bayesian hierarchical models.
One specific model considered here reflects the belief that the vector
of regression coefficients lies in a lower-dimensional linear subspace.
The methods can also be generalized to multinomial response models
with J > 2 categories. In the ordered multinomial model, the J
categories are ordered and a model is written linking the cumulative
response probabilities with the linear regression structure. In the
unordered multinomial model, the latent variables have a multivariate
normal distribution with unknown variance-covariance matrix. For both multinomial
models, the data augmentation method combined with Gibbs sampling is
outlined. This approach is especially attractive for the multivariate
probit model, where calculating the likelihood can be difficult.
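
The two-step iteration summarized in the abstract for the binary probit case (simulate latent data from truncated normals, then draw the coefficients from a normal linear model posterior) can be illustrated with a minimal Python sketch. It is not the authors' code. Several choices are assumptions made only for illustration: a flat prior on the coefficients, unit latent variance, inverse-CDF sampling of the truncated normals, and the hypothetical names draw_latent, probit_gibbs, and simulate, together with the simulated data set.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for its cdf and inverse cdf


def draw_latent(mean, y_i, rng):
    """Draw z ~ N(mean, 1) truncated to z > 0 if y_i == 1, else to z <= 0."""
    p_neg = STD_NORMAL.cdf(-mean)  # P(z <= 0) before truncation
    u = rng.uniform(p_neg, 1.0) if y_i == 1 else rng.uniform(0.0, p_neg)
    # Clamp so inv_cdf stays defined; this is crude when |mean| is very large.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return mean + STD_NORMAL.inv_cdf(u)


def cholesky(a):
    """Lower-triangular L with L L^T = a, for a small positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def forward_solve(low, b):
    """Solve L y = b."""
    y = [0.0] * len(b)
    for i in range(len(b)):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    return y


def backward_solve_transpose(low, y):
    """Solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def probit_gibbs(X, y, n_iter=1000, burn_in=300, seed=0):
    """Gibbs sampler for probit regression, assuming a flat prior on beta."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
           for a in range(p)]
    low = cholesky(xtx)  # X'X does not change across iterations
    beta, draws = [0.0] * p, []
    for it in range(n_iter):
        # Step 1: latent data given beta, from truncated normals.
        z = [draw_latent(sum(xj * bj for xj, bj in zip(xi, beta)), yi, rng)
             for xi, yi in zip(X, y)]
        # Step 2: beta given latent data, N((X'X)^-1 X'z, (X'X)^-1).
        xtz = [sum(X[i][a] * z[i] for i in range(n)) for a in range(p)]
        half = forward_solve(low, xtz)
        noise = [rng.gauss(0.0, 1.0) for _ in range(p)]
        beta = backward_solve_transpose(low, [h + e for h, e in zip(half, noise)])
        if it >= burn_in:
            draws.append(beta)
    return draws


def simulate(n, beta, seed):
    """Hypothetical test data: intercept plus one standard normal covariate."""
    rng = random.Random(seed)
    X = [[1.0, rng.gauss(0.0, 1.0)] for _ in range(n)]
    y = [1 if sum(a * b for a, b in zip(x, beta)) + rng.gauss(0.0, 1.0) > 0 else 0
         for x in X]
    return X, y


X, y = simulate(200, [-0.5, 1.0], seed=1)
draws = probit_gibbs(X, y, n_iter=1000, burn_in=300, seed=2)
post_mean = [sum(d[j] for d in draws) / len(draws) for j in range(2)]
print("posterior mean of (intercept, slope):", post_mean)
```

Because X'X is fixed, its Cholesky factor is computed once and reused in every coefficient draw. The normal mixture links, hierarchical priors, and multinomial extensions mentioned in the abstract would need additional sampling steps that this sketch does not attempt.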