Preliminary implementation of new data mining techniques for the analysis of simulation data from genetic analysis workshop 12: Problem 2

Citation
P. Flodman et al., Preliminary implementation of new data mining techniques for the analysis of simulation data from genetic analysis workshop 12: Problem 2, GENET EPID, 21, 2001, pp. S390-S395
Number of citations
2
Subject Categories
Molecular Biology & Genetics
Journal title
GENETIC EPIDEMIOLOGY
Journal ISSN
0741-0395
Volume
21
Year of publication
2001
Supplement
1
Pages
S390 - S395
Database
ISI
SICI code
0741-0395(2001)21:<S390:PIONDM>2.0.ZU;2-U
Abstract
We introduce a new data mining method applicable to complex disease genetics. Our approach is suited to a broad spectrum of diseases, identifying the noteworthy sharing of combinations of alleles in unrelated affected individuals. Furthermore, this approach may be extended to comprise the common types of genotype data, including single-nucleotide polymorphisms, candidate-gene sequences, etc. Using a method derived from data-mining computer algorithms, we analyze a data set of unrelated affected individuals chosen from the simulated pedigrees of problem 2 of the Genetics Analysis Workshop 12. We observe that most marker subsets containing a flanking marker for each of six or seven of the disease-gene loci yield significant numbers of individuals manifesting substantially similar genotypes. However, initial attempts (blind to the generating model) to identify the predisposing loci have not been successful. Refining our methods so that such loci may routinely be found and validated is underway. © 2001 Wiley-Liss, Inc.