We propose a methodology for estimating the cell probabilities in a
multiway contingency table by combining partial information from a number
of studies when not all of the variables are recorded in all studies. We
jointly model the full set of categorical variables recorded in at least
one of the studies, and we treat the variables that are not reported as
missing dimensions of the study-specific contingency table. For example,
we might be interested in combining several cohort studies in which the
incidence in the exposed and nonexposed groups is not reported for all
risk factors in all studies, while the overall number of cases and the
cohort size are always available.
To account for study-to-study variability, we adopt a Bayesian
hierarchical model. At the first stage of the model, the observation
stage, data are modeled by a multinomial distribution with a fixed total
number of observations. At the second stage, we use the logistic normal
(LN) distribution to model variability in the study-specific cells'
probabilities. Using this model
and data augmentation techniques, we reconstruct the contingency table for
each study regardless of which dimensions are missing, and we estimate
population parameters of interest. Our hierarchical procedure borrows
strength from all the studies and accounts for correlations among the
cells' probabilities. The main difficulty in combining studies recording
different variables is in maintaining a consistent interpretation of
parameters across studies. The approach proposed here overcomes this
difficulty and at the same
time addresses the uncertainty arising from the missing dimensions. We
apply our modeling strategy to analyze data on air pollution and mortality
from 1987 to 1994 for six U.S. cities by combining six
cross-classifications of low, medium, and high levels of mortality counts,
particulate matter, ozone, and carbon monoxide, with the complication that
four of the six cities do
not report all the air pollution variables. Our goals are to investigate
the association between air pollution and mortality by reconstructing the
tables with missing dimensions, to determine the most harmful pollutant
combinations, and to make predictions about these key issues for a city
other than the six sampled. We find that, for high levels of ozone and
carbon monoxide, the number of cases with a high number of deaths
increases as the level of particulate matter (PM10) increases, and that
the most harmful combinations correspond to high levels of PM10,
confirming prior findings that levels of PM10 higher than the NAAQS
standard are harmful.
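
As an illustration of the two-stage structure described above, the following is a minimal sketch of the data-generating side only: study-specific cell probabilities are drawn from a logistic normal distribution, counts are drawn from a multinomial with a fixed total, and a study that omits a variable reports only the table collapsed over that dimension. The table shape, the values of `mu` and `sigma`, the additive log-ratio parameterization with the last cell as reference, and the function names are all assumptions made for this example; the sketch does not implement the data augmentation or posterior inference steps.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3 x 3 table, e.g. mortality level by one pollutant level.
shape = (3, 3)
n_cells = int(np.prod(shape))

# Assumed population-level parameters on the additive log-ratio scale;
# the last cell serves as the reference category.
mu = np.zeros(n_cells - 1)
sigma = 0.25 * np.eye(n_cells - 1)


def draw_cell_probs(rng, mu, sigma):
    """Second stage: draw study-specific cell probabilities (logistic normal)."""
    eta = rng.multivariate_normal(mu, sigma)
    expd = np.exp(np.append(eta, 0.0))  # reference cell has log-ratio 0
    return expd / expd.sum()


def simulate_study(rng, n_obs, observed_axes):
    """First stage: multinomial counts, then collapse unreported dimensions."""
    probs = draw_cell_probs(rng, mu, sigma)
    full = rng.multinomial(n_obs, probs).reshape(shape)
    missing = tuple(ax for ax in range(len(shape)) if ax not in observed_axes)
    return full, full.sum(axis=missing)


# A study that reports only the first variable (axis 0) of the table.
full_table, reported = simulate_study(rng, n_obs=500, observed_axes=(0,))
```

Here `reported` plays the role of what such a study would publish, while `full_table` is the unobserved complete table that a reconstruction step would aim to recover.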