F.M. Rodriguez and J.A.F. Diniz, Hierarchical structure of genetic distances: effects of matrix size, spatial distribution and correlation structure among gene frequencies, Genetics and Molecular Biology, 21(2), 1998, pp. 233-240
The geographic structure of genetic distances among local populations
within species, based on allozyme data, has usually been evaluated by
clustering estimated genetic distances with hierarchical algorithms, such
as the unweighted pair-group method by arithmetic averages (UPGMA). The
distortion produced in the clustering process is estimated by the
cophenetic correlation coefficient. This hierarchical approach, however,
can fail to produce an accurate representation of genetic distances
among populations in a low-dimensional space, especially when continuous
(clinal) or reticulate patterns of variation exist. In the present
study, we analyzed 50 genetic distance matrices from the literature, for
animal taxa ranging from Platyhelminthes to Mammalia, in order to
determine in which situations UPGMA is useful for understanding patterns
of genetic variation among populations. The cophenetic correlation
coefficients, derived from UPGMA based on three types of genetic distance
coefficients, were correlated with other parameters of each matrix,
including the numbers of populations, loci and alleles, the maximum
geographic distance among populations, the relative magnitude of the
first eigenvalue of the covariance matrix among alleles, and the
logarithm of body size. Most cophenetic correlations were higher than
0.80, and the highest values appeared for Nei's and Rogers' genetic
distances. The relationship between
cophenetic correlation coefficients and the other parameters analyzed
was defined by an "envelope space", forming triangles in which higher
values of cophenetic correlations are found for higher values in the
parameters, though low values do not necessarily correspond to high
cophenetic correlations. We conclude that UPGMA is useful for describing
genetic distances based on large distance matrices (in terms of a large
number of either populations or alleles), when the dimensionality of the
system is low (matrices with large first eigenvalues) or when local
populations are separated by large geographical distances.
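
The two matrix-level quantities discussed above, the cophenetic correlation of a UPGMA tree and the relative magnitude of the first eigenvalue, can be illustrated with a minimal Python sketch. It assumes NumPy and SciPy are available. The function names and the two toy distance matrices are hypothetical and are not taken from the study. The sketch also does not compute Nei's or Rogers' distances, and the eigenvalue ratio shown is only one plausible reading of "relative magnitude".

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import squareform


def upgma_cophenetic_correlation(dist_matrix):
    """Cluster a square, symmetric distance matrix with UPGMA and return
    the cophenetic correlation coefficient of the resulting tree."""
    d = np.asarray(dist_matrix, dtype=float)
    condensed = squareform(d)                    # upper triangle as a vector
    tree = linkage(condensed, method="average")  # "average" linkage is UPGMA
    coph_corr, _ = cophenet(tree, condensed)
    return float(coph_corr)


def relative_first_eigenvalue(freqs):
    """Share of the total variance carried by the largest eigenvalue of the
    covariance matrix among allele-frequency columns (rows = populations).
    Assumption: this ratio stands in for the 'relative magnitude'."""
    cov = np.cov(np.asarray(freqs, dtype=float), rowvar=False)
    eigvals = np.linalg.eigvalsh(cov)            # returned in ascending order
    return float(eigvals[-1] / eigvals.sum())


# Hypothetical example 1: two well-separated pairs of populations.
# The distances are ultrametric, so a tree represents them without distortion.
ultrametric = np.array([
    [0.0, 0.1, 0.5, 0.5],
    [0.1, 0.0, 0.5, 0.5],
    [0.5, 0.5, 0.0, 0.1],
    [0.5, 0.5, 0.1, 0.0],
])

# Hypothetical example 2: five populations along a cline, with distance
# proportional to their separation along the gradient.
positions = np.arange(5, dtype=float)
clinal = 0.1 * np.abs(np.subtract.outer(positions, positions))

r_ultra = upgma_cophenetic_correlation(ultrametric)
r_clinal = upgma_cophenetic_correlation(clinal)

# Hypothetical allele frequencies at one two-allele locus: the two columns
# are perfectly (negatively) correlated, so the system is one-dimensional.
freqs = np.array([[0.1, 0.9], [0.3, 0.7], [0.5, 0.5]])
ratio = relative_first_eigenvalue(freqs)

print("ultrametric:", r_ultra)
print("clinal:", r_clinal)
print("first eigenvalue share:", ratio)
```

In this toy setting the hierarchical example is reproduced without distortion, while the clinal example necessarily yields a lower cophenetic correlation, which is the kind of contrast described above for continuous patterns of variation.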