S.A. Tishkoff et al., The accuracy of statistical methods for estimation of haplotype frequencies: An example from the CD4 locus, AM J HU GEN, 67(2), 2000, pp. 518-522
Number of citations
20
Subject Categories
Research/Laboratory Medicine & Medical Technology; Molecular Biology & Genetics
Haplotype analysis has become increasingly important for the study of human
disease as well as for reconstruction of human population histories.
Computer programs have been developed to estimate haplotype frequencies
statistically from marker phenotypes in unrelated individuals. However,
there currently are few empirical reports on the accuracy of statistical
estimates that must infer linkage phase. We have analyzed haplotypes at the
CD4 locus on chromosome 12 that consist of a short tandem-repeat
polymorphism and an Alu insertion/deletion polymorphism located 9.8 kb
apart, in 398 individuals from 10 geographically diverse sub-Saharan
African populations. Haplotype frequency estimates obtained using gene
counting based on molecularly haplotyped (phase-known) data were compared
with haplotype frequency estimates obtained using the
expectation-maximization algorithm. We show that the estimated frequencies
of common haplotypes do not differ significantly with the use of
phase-known versus phase-unknown data. However, rare haplotypes are
occasionally miscalled when their presence/absence must be inferred. Thus,
for those research questions for which the common haplotypes are most
important, frequency estimates based on the phase-unknown marker-typing
results from unrelated individuals will be sufficient. However, in cases
where knowledge of rare haplotypes is critical, molecular haplotyping will
be necessary to determine linkage phase unambiguously.
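
The two estimators compared in the abstract can be illustrated with a minimal sketch. It is not the authors' software: the function names (`gene_count`, `em_haplotype_freqs`), the input layout, and the convergence settings are assumptions made here for illustration. It assumes two loci, random mating (Hardy-Weinberg proportions), and complete genotype data. Gene counting tallies haplotypes directly from phase-known data; the expectation-maximization (EM) routine splits each phase-ambiguous double heterozygote between its two possible resolutions in proportion to the current frequency estimates.

```python
from collections import Counter
from itertools import product


def gene_count(phased):
    """Haplotype frequencies by direct counting of phase-known haplotype pairs.

    phased: list of (hap1, hap2), each haplotype an (allele_a, allele_b) tuple.
    """
    counts = Counter(h for pair in phased for h in pair)
    n = sum(counts.values())
    return {h: c / n for h, c in counts.items()}


def em_haplotype_freqs(genotypes, n_iter=200, tol=1e-10):
    """Two-locus EM estimate from unphased genotypes (illustrative sketch).

    genotypes: list of ((a1, a2), (b1, b2)), the unordered allele pairs
    observed at locus A and locus B in one individual.
    """
    alleles_a = sorted({a for g in genotypes for a in g[0]})
    alleles_b = sorted({b for g in genotypes for b in g[1]})
    haps = list(product(alleles_a, alleles_b))
    freqs = {h: 1.0 / len(haps) for h in haps}  # uniform starting point

    for _ in range(n_iter):
        counts = Counter()
        for (a1, a2), (b1, b2) in genotypes:
            res1 = ((a1, b1), (a2, b2))  # first possible phase resolution
            res2 = ((a1, b2), (a2, b1))  # second possible phase resolution
            if a1 == a2 or b1 == b2:
                # Homozygous at one locus or both: phase is unambiguous.
                for h in res1:
                    counts[h] += 1.0
                continue
            # E-step: posterior weight of each resolution for a double
            # heterozygote under the current frequency estimates.
            w1 = freqs[res1[0]] * freqs[res1[1]]
            w2 = freqs[res2[0]] * freqs[res2[1]]
            total = w1 + w2
            p1 = w1 / total if total > 0 else 0.5
            for h in res1:
                counts[h] += p1
            for h in res2:
                counts[h] += 1.0 - p1
        # M-step: expected haplotype counts divided by chromosomes sampled.
        new = {h: counts[h] / (2 * len(genotypes)) for h in haps}
        delta = max(abs(new[h] - freqs[h]) for h in haps)
        freqs = new
        if delta < tol:
            break
    return freqs
```

As a usage example with made-up toy data (not data from the study), calling `em_haplotype_freqs([((1, 1), ("+", "+")), ((2, 2), ("-", "-")), ((1, 2), ("+", "-"))])` assigns the double heterozygote almost entirely to the two haplotypes already seen in the homozygotes. This mirrors the abstract's point: EM favors common haplotypes when resolving ambiguous phase, so a rare haplotype carried only by double heterozygotes can be missed.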