Simulated data for a complex genetic trait (problem 2 for GAW11): How the model was developed, and why

Citation
D.A. Greenberg et al., Simulated data for a complex genetic trait (problem 2 for GAW11): How the model was developed, and why, GENET EPID, 17, 1999, pp. S449-S459
Number of citations
8
Subject Categories
Molecular Biology & Genetics
Journal title
GENETIC EPIDEMIOLOGY
Journal ISSN
0741-0395
Volume
17
Year of publication
1999
Supplement
1
Pages
S449 - S459
Database
ISI
SICI code
0741-0395(1999)17:<S449:SDFACG>2.0.ZU;2-F
Abstract
This paper describes a simulated data set created as Problem 2 for GAW11. The generating model for Problem 2 involved two different genetic diseases, or "types," in three separate populations. The two-locus (2L) type results from the epistatic interaction of two genetic loci, and the three-allele type, from a single locus with two disease-causing alleles and one normal allele. Each type has two phenotypic forms: Mild and Severe. Both forms are subject to both genetic and environmental influences. The disease occurs in three different hypothetical populations, each with different disease allele frequencies and penetrances. In two populations there is also a fourth locus with an allele that is associated with the 2L type. Misdiagnosis can occur, but only after a family has already been ascertained through ≥2 "genetically" affected offspring. Finally, the three different populations are studied by four different hypothetical research groups. These groups each have their own ideas about how the disease is inherited and have therefore devised different ascertainment schemes based on those beliefs. Each research group collected 100-family data sets, including data on 300 markers on six chromosomes and measurements on disease status and on the proposed two environmental factors. GAW participants were supplied with 25 random replicates of each data set. © 1999 Wiley-Liss, Inc.
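
Illustrative sketch (not part of the original record): the abstract describes a two-locus epistatic disease type and ascertainment of families through at least two affected offspring. The Python below is a minimal sketch of that idea only. Every numeric value (allele frequencies, penetrance, sibship size), the dominant-by-dominant form of the epistasis, the assumption that the two loci are unlinked, and all function names are hypothetical placeholders; none of them are taken from the GAW11 generating model, and the Mild/Severe forms, environmental factors, misdiagnosis, and marker data are left out.

```python
import random

# All values below are hypothetical placeholders, NOT the GAW11 parameters.
FREQ_A = 0.2       # assumed disease-allele frequency at locus A
FREQ_B = 0.3       # assumed disease-allele frequency at locus B
PENETRANCE = 0.8   # assumed penetrance when both loci carry a disease allele
N_OFFSPRING = 4    # assumed sibship size


def random_genotype(rng, freq):
    """Draw a two-allele genotype; True marks a disease allele."""
    return (rng.random() < freq, rng.random() < freq)


def transmit(rng, father, mother):
    """Each parent passes one of its two alleles at random."""
    return (rng.choice(father), rng.choice(mother))


def is_affected(rng, geno_a, geno_b):
    """Assumed epistasis: risk requires a disease allele at BOTH loci."""
    at_risk = any(geno_a) and any(geno_b)
    return at_risk and rng.random() < PENETRANCE


def simulate_family(rng):
    """Return the affection status of each offspring in one nuclear family."""
    # The two loci are drawn independently, i.e. assumed unlinked.
    fa, ma = random_genotype(rng, FREQ_A), random_genotype(rng, FREQ_A)
    fb, mb = random_genotype(rng, FREQ_B), random_genotype(rng, FREQ_B)
    return [
        is_affected(rng, transmit(rng, fa, ma), transmit(rng, fb, mb))
        for _ in range(N_OFFSPRING)
    ]


def ascertain(n_families, seed=0):
    """Simulate families until n_families have >= 2 affected offspring."""
    rng = random.Random(seed)
    kept = []
    while len(kept) < n_families:
        family = simulate_family(rng)
        if sum(family) >= 2:
            kept.append(family)
    return kept
```

Calling `ascertain(100)` yields one hypothetical 100-family data set; varying `seed` would give independent replicates in the spirit of the 25 replicates mentioned in the abstract.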