We compare the asymptotic relative efficiency (ARE) of different study
designs for estimating gene and gene-environment interaction effects using
matched case-control data. In the sampling schemes considered, cases are
selected differentially based on their family history of disease. Controls
are selected either from unrelated subjects or from among the case's
unaffected siblings and cousins. Parameters are estimated using weighted
conditional logistic regression, where the likelihood contributions for
each subject are weighted by the fraction of cases sampled sharing the same
family history.
Results showed that, compared with random sampling, over-sampling cases
with a positive family history increased the efficiency for estimating the
main effect of a gene for sib-control designs (103-254% ARE) and decreased
efficiency for cousin-control and population-control designs (68-94% ARE
and 67-84% ARE, respectively). Population controls and random sampling of
cases were most efficient for a recessive gene or a dominant gene with a
relative risk less than 9. For estimating gene-environment interactions,
over-sampling positive-family-history cases again led to increased
efficiency using sib controls (111-180% ARE) and decreased efficiency using
population controls
(68-87% ARE). Using case-cousin pairs, the results differed based on the
genetic model and the size of the interaction effect; biased sampling was
only slightly more efficient than random sampling for large interaction
effects under a dominant gene model (relative risk ratio = 8, 106% ARE).
Overall, the most efficient design for studying gene-environment
interaction was the case-sib-control design with over-sampling of
positive-family-history cases. © 2001 Wiley-Liss, Inc.
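
As a rough illustration of the estimation approach described above, the following minimal sketch fits a weighted conditional logistic model to 1:1 matched case-control pairs with a single covariate, and expresses a comparison of two designs as a percent ARE. It is not the authors' implementation. The function names, the restriction to one covariate and 1:1 matching, and the treatment of the sampling weights as given inputs are all assumptions made for illustration; the ARE helper assumes the usual definition as a ratio of asymptotic variances at equal sample size.

```python
import math


def weighted_pair_loglik(beta, diffs, weights):
    """Weighted conditional log-likelihood for 1:1 matched pairs.

    diffs[i]   = x_case - x_control for pair i (hypothetical input)
    weights[i] = sampling weight for pair i (taken as given, an assumption)
    """
    total = 0.0
    for d, w in zip(diffs, weights):
        # log[ exp(b*x_case) / (exp(b*x_case) + exp(b*x_control)) ]
        #   = -log(1 + exp(-b*d))
        total += -w * math.log1p(math.exp(-beta * d))
    return total


def fit_beta(diffs, weights, iterations=25):
    """Maximize the weighted conditional likelihood by Newton-Raphson."""
    beta = 0.0
    for _ in range(iterations):
        score = 0.0  # first derivative of the log-likelihood
        info = 0.0   # negative second derivative
        for d, w in zip(diffs, weights):
            p = 1.0 / (1.0 + math.exp(-beta * d))
            score += w * d * (1.0 - p)
            info += w * d * d * p * (1.0 - p)
        if info == 0.0:  # no discordant pairs, nothing to estimate
            break
        beta += score / info
    return beta


def are_percent(var_reference, var_design):
    """Percent ARE of a design relative to a reference design,
    assuming ARE is the ratio of asymptotic variances."""
    return 100.0 * var_reference / var_design


# Hypothetical data: binary exposure, so d is +1, -1, or 0 (concordant).
diffs = [1, 1, 1, -1, 0]
weights = [1.0, 1.0, 2.0, 2.0, 1.0]
beta_hat = fit_beta(diffs, weights)
```

With a binary exposure, the weighted estimate reduces to the log of the ratio of summed weights over the two kinds of discordant pairs, which gives a simple check on the fit. Variance estimation under weighting (needed as input to `are_percent`) is deliberately left out of this sketch.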