This paper gives an expectation-maximization (EM) algorithm to obtain
allele frequencies, haplotype frequencies, and gametic disequilibrium
coefficients for multiple-locus systems. It permits high polymorphism
and null alleles at all loci. This approach effectively deals with the
primary estimation problems associated with such systems; that is,
there is not a one-to-one correspondence between phenotypic and
genotypic categories, and sample sizes tend to be much smaller than the number
of phenotypic categories. The EM method provides maximum-likelihood
estimates and therefore allows hypothesis tests using likelihood-ratio
statistics that have χ² distributions when sample sizes are large. We
also suggest a data resampling approach to estimate test statistic
sampling distributions. The resampling approach is more computer
intensive, but it is applicable to all sample sizes. A strategy to test
hypotheses about aggregate groups of gametic disequilibrium coefficients is
recommended. This strategy minimizes the number of necessary hypothesis
tests while at the same time describing the structure of
disequilibrium. These methods are applied to three unlinked dinucleotide repeat
loci in Navajo Indians and to three linked HLA loci in Gila River
(Pima) Indians. The likelihood functions of both data sets are shown to be
maximized by the EM estimates, and the testing strategy provides a
useful description of the structure of gametic disequilibrium. Following
these applications, a number of simulation experiments are performed
to test how well the likelihood-ratio statistic distributions are
approximated by χ² distributions. In most circumstances the χ²
approximation grossly underestimated the probability of type I errors.
However, at times it also overestimated the type I error probability.
Accordingly, we
recommend hypothesis tests that use the resampling method.
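
The EM idea summarized above can be illustrated with a minimal sketch. It is
not the multiple-locus algorithm of this paper: it assumes a single locus
with two codominant alleles (A, B) and one null allele (O), so that
phenotypes A and B each hide two possible genotypes. The function names
em_null_allele and log_likelihood, the starting values, and the example
counts are all hypothetical choices made for illustration.

```python
import math


def log_likelihood(counts, p, q, r):
    """Phenotype log-likelihood (up to a constant) under Hardy-Weinberg."""
    n_a, n_b, n_ab, n_o = counts
    probs = (p * p + 2 * p * r, q * q + 2 * q * r, 2 * p * q, r * r)
    # Skip empty phenotype classes so that 0 * log(0) never arises.
    return sum(n * math.log(pr) for n, pr in zip(counts, probs) if n > 0)


def em_null_allele(counts, tol=1e-12, max_iter=10000):
    """Estimate allele frequencies (p, q, r) for alleles A, B and null O.

    counts = (n_A, n_B, n_AB, n_O) are observed phenotype counts.
    """
    n_a, n_b, n_ab, n_o = counts
    n = n_a + n_b + n_ab + n_o
    p = q = r = 1.0 / 3.0  # assumed starting point
    for _ in range(max_iter):
        # E-step: split ambiguous phenotypes into expected genotype counts.
        # P(AA | phenotype A) = p^2 / (p^2 + 2pr) = p / (p + 2r).
        aa = n_a * p / (p + 2 * r)
        ao = n_a - aa
        bb = n_b * q / (q + 2 * r)
        bo = n_b - bb
        # M-step: count alleles in the expected genotypes.
        p_new = (2 * aa + ao + n_ab) / (2 * n)
        q_new = (2 * bb + bo + n_ab) / (2 * n)
        r_new = (ao + bo + 2 * n_o) / (2 * n)
        delta = max(abs(p_new - p), abs(q_new - q), abs(r_new - r))
        p, q, r = p_new, q_new, r_new
        if delta < tol:
            break
    return p, q, r


# Hypothetical counts equal to the expected phenotype counts for
# p = 0.3, q = 0.1, r = 0.6 in a sample of 10,000 individuals.
example_counts = (4500, 1300, 600, 3600)
p_hat, q_hat, r_hat = em_null_allele(example_counts)
print(round(p_hat, 4), round(q_hat, 4), round(r_hat, 4))
```

Because each iteration only redistributes ambiguous phenotype counts and then
recounts alleles, the estimates always sum to one, and the log-likelihood
does not decrease from one iteration to the next.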