ESTIMATING CAUSAL EFFECTS FROM LARGE DATA SETS USING PROPENSITY SCORES

Authors

RUBIN DB

Citation

D.B. Rubin, ESTIMATING CAUSAL EFFECTS FROM LARGE DATA SETS USING PROPENSITY SCORES, Annals of internal medicine, 127(8), 1997, pp. 757-763

Subject categories

Medicine, General & Internal

Journal title

Annals of internal medicine

ISSN journal

00034819

Volume

127

Issue

8

Year of publication

1997

Pages

757 - 763

Database

ISI

SICI code

0003-4819(1997)127:8<757:ECEFLD>2.0.ZU;2-1

Abstract

The aim of many analyses of large databases is to draw causal inferences about the effects of actions, treatments, or interventions. Examples include the effects of various options available to a physician for treating a particular patient, the relative efficacies of various health care providers, and the consequences of implementing a new national health care policy. A complication of using large databases to achieve such aims is that their data are almost always observational rather than experimental. That is, the data in most large data sets are not based on the results of carefully conducted randomized clinical trials, but rather represent data collected through the observation of systems as they operate in normal practice without any interventions implemented by randomized assignment rules. Such data are relatively inexpensive to obtain, however, and often do represent the spectrum of medical practice better than the settings of randomized experiments. Consequently, it is sensible to try to estimate the effects of treatments from such large data sets, even if only to help design a new randomized experiment or shed light on the generalizability of results from existing randomized experiments. However, standard methods of analysis using available statistical software (such as linear or logistic regression) can be deceptive for these objectives because they provide no warnings about their propriety. Propensity score methods are more reliable tools for addressing such objectives because the assumptions needed to make their answers appropriate are more assessable and transparent to the investigator.