Large, population-based samples and large-scale genotyping are being used
to evaluate disease/gene associations. A substantial drawback to such
samples is that population substructure can induce spurious associations
between genes and disease. We review two methods, called genomic control
(GC) and structured association (SA), that obviate many of the concerns
about population substructure by using the features of the genomes present
in the sample to correct for stratification. The GC approach exploits the
fact
that population substructure generates "overdispersion" of statistics used
to assess association. By testing multiple polymorphisms throughout the
genome, only some of which are pertinent to the disease of interest, the
degree of overdispersion generated by population substructure can be
estimated
and taken into account. The SA approach assumes that the sampled
population, although heterogeneous, is composed of subpopulations that are
themselves
homogeneous. By using multiple polymorphisms throughout the genome, this
"latent class method" estimates the probability that sampled individuals
derive from each of these latent subpopulations. GC has the advantages of
robustness, simplicity, and wide applicability, even to experimental
designs such as
DNA pooling. SA is somewhat more complicated but has the advantage of
greater power in some realistic settings, such as admixed populations or
when association varies widely across subpopulations. It, too, is widely
applicable. Both also have weaknesses, as elaborated in our review.
(C) 2001 Wiley-Liss, Inc.
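
The GC idea summarized above can be illustrated with a minimal sketch. It
assumes a commonly used median-based estimator of the overdispersion
(inflation) factor for 1-degree-of-freedom chi-square statistics; the
abstract does not specify an estimator, so this choice, the function names
(inflation_factor, gc_adjust), and the example values are all hypothetical
and serve only as illustration.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom
# (approximate constant; used to scale the observed median).
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(null_stats):
    """Estimate the overdispersion factor from 1-df chi-square statistics
    computed at markers presumed unrelated to the disease.

    Assumption: median-based estimator, floored at 1 so that statistics
    are never inflated by the correction.
    """
    lam = median(null_stats) / CHI2_1DF_MEDIAN
    return max(lam, 1.0)


def gc_adjust(stats, lam):
    """Divide each association statistic by the estimated factor."""
    return [s / lam for s in stats]


# Hypothetical statistics, for illustration only (not data from the review).
null_stats = [0.2, 0.9, 1.4, 0.5, 2.3, 0.7, 1.1]
candidate_stats = [6.0, 10.5]

lam = inflation_factor(null_stats)
adjusted = gc_adjust(candidate_stats, lam)
print(lam, adjusted)
```

The adjusted statistics would then be referred to the usual chi-square
distribution with 1 degree of freedom. Flooring the factor at 1 is a
conservative design choice in this sketch, not something stated in the
abstract.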