A.S. Whittemore and J. Halpern, Problems in the definition, interpretation, and evaluation of genetic heterogeneity, AM J HU GEN, 68(2), 2001, pp. 457-465
Number of citations
7
Subject Categories
Research/Laboratory Medicine & Medical Technology; Molecular Biology & Genetics
Suppose that we wish to classify families with multiple cases of disease
into one of three categories: those that segregate mutations of a gene of
interest, those which segregate mutations of other genes, and those whose
disease is due to nonhereditary factors or chance. Among families in the
first two categories (the hereditary families), we wish to estimate the
proportion, p, of families that segregate mutations of the gene of interest.
Although this proportion is a commonly accepted concept, it is well defined
only with an unambiguous definition of "family." Even then, extraneous
factors such as family sizes and structures can cause p to vary across
different populations and, within a population, to be estimated differently
by different studies. Restrictive assumptions about the disease are needed,
in order to avoid this undesirable variation. The assumptions require that
mutations of all disease-causing genes (i) have no effect on family size,
(ii) have very low frequencies, and (iii) have penetrances that satisfy
certain constraints. Despite the unverifiability of these assumptions,
linkage studies often invoke them to estimate p, using the admixture
likelihood introduced by Smith and discussed by Ott. We argue against this
common practice, because (1) it also requires the stronger assumption of
equal penetrances for all etiologically relevant genes; (2) even if all
assumptions are met, estimates of p are sensitive to misspecification of the
unknown phenocopy rate; (3) even if all the necessary assumptions are met
and the phenocopy rate is correctly specified, estimates of p that are
obtained by linkage programs such as HOMOG and GENEHUNTER are based on the
wrong likelihood and therefore are biased in the presence of phenocopies. We
show how to correct these estimates; but, nevertheless, we do not recommend
the use of parametric heterogeneity models in linkage analysis, even merely
as a tool for increasing the statistical power to detect linkage. This is
because the assumptions required by these models cannot be verified, and
their violation could actually decrease power. Instead, we suggest that
estimation of p be postponed until the relevant genes have been identified.
Then their frequencies and penetrances can be estimated on the basis of
population-based samples and can be used to obtain more-robust estimates of
p for specific populations.
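
For readers unfamiliar with the admixture likelihood mentioned in the abstract, the following is a minimal sketch of its commonly cited form, in which each family contributes a mixture of a "linked" and an "unlinked" term weighted by the proportion of linked families. The abstract does not state the formula, so the expression used here, the names admixture_loglik and estimate_alpha, and the per-family likelihood ratios are all illustrative assumptions. It does not include the phenocopy correction the authors describe, and it is not the implementation used by HOMOG or GENEHUNTER.

```python
import math

def admixture_loglik(alpha, lrs):
    """Log admixture likelihood ratio at a fixed recombination fraction.

    Assumed form: sum over families of log(alpha * LR_i + (1 - alpha)),
    where LR_i is family i's likelihood ratio for linkage versus no linkage
    and alpha is the proportion of linked families.
    """
    return sum(math.log(alpha * lr + (1.0 - alpha)) for lr in lrs)

def estimate_alpha(lrs, steps=1000):
    """Grid-search estimate of alpha on [0, 1] (hypothetical helper)."""
    grid = [k / steps for k in range(steps + 1)]
    return max(grid, key=lambda a: admixture_loglik(a, lrs))

# Made-up per-family likelihood ratios, for illustration only.
lrs = [8.0, 0.5, 12.0, 0.4, 0.6, 5.0]
alpha_hat = estimate_alpha(lrs)
print(f"estimated proportion of linked families: {alpha_hat:.3f}")
```

The grid search is chosen only for transparency. As the abstract argues, an estimate obtained this way depends on assumptions that cannot be verified from linkage data alone.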