The aim of many analyses of large databases is to draw causal
inferences about the effects of actions, treatments, or interventions.
Examples include the effects of various options available to a
physician for treating a particular patient, the relative efficacies of
various health care providers, and the consequences of implementing a
new national health care policy. A complication of using large
databases to achieve such aims is that their data are almost always
observational rather than experimental. That is, the data in most large
data sets are not based on the results of carefully conducted
randomized clinical trials, but rather represent data collected through
the observation of systems as they operate in normal practice without
any interventions implemented by randomized assignment rules. Such data
are relatively inexpensive to obtain, however, and often do represent
the spectrum of medical practice better than the settings of randomized
experiments. Consequently, it is sensible to try to estimate the
effects of treatments from such large data sets, even if only to help
design a new randomized experiment or shed light on the
generalizability of results from existing randomized experiments.
However, standard methods of analysis using available statistical
software (such as linear or logistic regression) can be deceptive for
these objectives because they provide no warnings about their
propriety. Propensity score methods are more reliable tools for
addressing such objectives because the assumptions needed to make their
answers appropriate are more assessable and transparent to the
investigator.
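
To make the contrast concrete, the following is a minimal sketch of one common propensity score technique, subclassification on the estimated score, and not necessarily the procedure used in this work. Everything in it is an assumption made for illustration: the data are synthetic with a single covariate and a known treatment effect of 2.0, and the names fit_logistic and stratified_effect are hypothetical helpers, not functions from any library. Strata that contain only treated or only control units are reported rather than silently extrapolated over, which is the sort of warning a plain regression fit does not give.

```python
import math
import random

def fit_logistic(x, z, steps=500, lr=0.5):
    """Fit P(z=1 | x) = sigmoid(a + b*x) by plain gradient ascent."""
    a = b = 0.0
    n = len(x)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for xi, zi in zip(x, z):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            grad_a += zi - p
            grad_b += (zi - p) * xi
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b

def mean(values):
    return sum(values) / len(values)

def stratified_effect(y, z, score, n_strata=5):
    """Average treated-minus-control differences within score strata.

    Returns the size-weighted estimate and the indices of strata that
    lacked either treated or control units (no overlap)."""
    order = sorted(range(len(y)), key=lambda i: score[i])
    size = len(order) // n_strata
    total, weight, skipped = 0.0, 0, []
    for s in range(n_strata):
        last = s == n_strata - 1
        idx = order[s * size:] if last else order[s * size:(s + 1) * size]
        treated = [y[i] for i in idx if z[i] == 1]
        control = [y[i] for i in idx if z[i] == 0]
        if not treated or not control:
            skipped.append(s)  # flag the stratum instead of extrapolating
            continue
        total += len(idx) * (mean(treated) - mean(control))
        weight += len(idx)
    return total / weight, skipped

# Synthetic observational data (an assumption for this sketch): the
# covariate x drives both treatment assignment z and the outcome y.
random.seed(0)
n = 1000
x = [random.gauss(0.0, 1.0) for _ in range(n)]
z = [1 if random.random() < 1.0 / (1.0 + math.exp(-xi)) else 0 for xi in x]
y = [2.0 * zi + 3.0 * xi + random.gauss(0.0, 1.0) for xi, zi in zip(x, z)]

a, b = fit_logistic(x, z)
score = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]

naive = (mean([yi for yi, zi in zip(y, z) if zi == 1])
         - mean([yi for yi, zi in zip(y, z) if zi == 0]))
adjusted, skipped = stratified_effect(y, z, score)

print("naive difference in means:", round(naive, 2))
print("stratified on propensity score:", round(adjusted, 2))
print("strata without overlap:", skipped)
```

In this synthetic setting the raw difference in means is confounded by x, while the stratified estimate should land closer to the simulated effect. With real data the true effect is unknown, so the useful outputs are the diagnostics: which strata lack overlap, and how well covariates balance within each stratum.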