Conventional statistical methods based upon single restriction fragment
length polymorphisms often prove inadequate in studies of genetic
variation. Cladistic analysis has been suggested as an alternative, but it
requires basic assumptions that usually cannot be met. We tested whether
applying the genetic algorithm, an artificial intelligence method, to
haplotype data could be a workable approach. The genetic algorithm creates
artificial 'individuals' in the computer, each carrying 'genes' that code
for a solution to a problem. The individuals are allowed to compete and
'mate', with those whose genes code for better solutions mating more often.
Genes coding for good solutions survive through generations of the genetic
algorithm. At the end of the run, the best solutions can be extracted. We
applied the genetic algorithm to data consisting of cholesterol values and
haplotypes made up of seven restriction sites at the LDL receptor locus.
The subjects were 114 patients with familial hypercholesterolemia (FH) and
61 normal subjects. The genetic algorithm found that restriction sites 1
(SphI in intron 6), 2 (StuI in exon 8), and 7 (ApaLI in the 3' flanking
region) were associated with high cholesterol levels. As a validity check,
we ran the genetic algorithm on 'artificial patients', i.e. artificially
generated haplotypes linked to artificially generated cholesterol values.
These runs showed that the genetic algorithm consistently found the
appropriate haplotype. We conclude that the genetic algorithm may be a
useful tool for studying genetic variation.
(C) 1999 Elsevier Science Ireland Ltd. All rights reserved.
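
The procedure summarized above (individuals carrying genes, fitter individuals mating more often, and a validity check on artificial patients) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the gene encoding, the fitness measure, or the genetic operators, so everything here is an assumption made for illustration. In particular, the 7-bit site mask, the mean-difference `fitness`, tournament selection, one-point crossover, the mutation rate, and all numeric values in `make_artificial_patients` are hypothetical choices.

```python
import random
from statistics import mean

N_SITES = 7  # seven restriction sites per haplotype, as in the abstract


def make_artificial_patients(n, causal_sites, rng):
    """Return (haplotype, cholesterol) pairs with a planted association.

    All numeric values are arbitrary assumptions, not data from the study.
    """
    patients = []
    for _ in range(n):
        hap = tuple(rng.randint(0, 1) for _ in range(N_SITES))
        chol = rng.gauss(5.0, 0.3)  # baseline level, arbitrary units
        if all(hap[s] for s in causal_sites):
            chol += 4.0  # planted effect for carriers of every causal site
        patients.append((hap, chol))
    return patients


def fitness(mask, patients, min_group=5):
    """Mean cholesterol of carriers of all masked sites minus that of the rest."""
    sites = [i for i in range(N_SITES) if mask[i]]
    if not sites:
        return 0.0
    carriers, others = [], []
    for hap, chol in patients:
        (carriers if all(hap[i] for i in sites) else others).append(chol)
    if len(carriers) < min_group or len(others) < min_group:
        return 0.0  # too few people in one group to compare means
    return mean(carriers) - mean(others)


def tournament(population, scores, rng, k=3):
    """Pick the fittest of k randomly drawn individuals."""
    picks = rng.sample(range(len(population)), k)
    return population[max(picks, key=lambda i: scores[i])]


def run_ga(patients, rng, pop_size=60, generations=80, p_mut=0.05):
    """Evolve 7-bit site masks and return the fittest one found."""
    population = [tuple(rng.randint(0, 1) for _ in range(N_SITES))
                  for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(ind, patients) for ind in population]
        best = population[max(range(pop_size), key=lambda i: scores[i])]
        next_gen = [best]  # elitism: the best individual always survives
        while len(next_gen) < pop_size:
            mum = tournament(population, scores, rng)
            dad = tournament(population, scores, rng)
            cut = rng.randint(1, N_SITES - 1)  # one-point crossover
            child = mum[:cut] + dad[cut:]
            # flip each gene independently with probability p_mut
            child = tuple(g ^ 1 if rng.random() < p_mut else g for g in child)
            next_gen.append(child)
        population = next_gen
    return max(population, key=lambda ind: fitness(ind, patients))


rng = random.Random(0)
causal = (0, 1, 6)  # zero-based indices of sites 1, 2 and 7
patients = make_artificial_patients(1000, causal, rng)
best = run_ga(patients, rng)
found_sites = [i + 1 for i in range(N_SITES) if best[i]]
print("sites selected:", found_sites)
```

With only seven binary sites there are just 128 possible masks, so an exhaustive search would also work in this toy setting. The evolutionary loop is kept to mirror the compete-and-mate process described in the abstract, and the planted association plays the role of the 'artificial patients' validity check.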