On the assessment of statistical significance in disease-gene discovery

Citation
Lp. Zhao et al., On the assessment of statistical significance in disease-gene discovery, AM J HU GEN, 64(6), 1999, pp. 1739-1753
Number of citations
14
Subject categories
Research/Laboratory Medicine & Medical Technology; Molecular Biology & Genetics
Journal title
AMERICAN JOURNAL OF HUMAN GENETICS
ISSN journal
0002-9297
Volume
64
Issue
6
Year of publication
1999
Pages
1739 - 1753
Database
ISI
SICI code
0002-9297(199906)64:6<1739:OTAOSS>2.0.ZU;2-0
Abstract
One of the major challenges facing genome-scan studies to discover disease genes is the assessment of the genomewide significance. The assessment becomes particularly challenging if the scan involves a large number of markers collected from a relatively small number of meioses. Typically, this assessment has two objectives: to assess genomewide significance under the null hypothesis of no linkage and to evaluate true-positive and false-positive prediction error rates under alternative hypotheses. The distinction between these goals allows one to formulate the problem in the well-established paradigm of statistical hypothesis testing. Within this paradigm, we evaluate the traditional criterion of LOD score 3.0 and a recent suggestion of LOD score 3.6, using the Monte Carlo simulation method. The Monte Carlo experiments show that the type I error varies with the chromosome length, with the number of markers, and also with sample sizes. For a typical setup with 50 informative meioses on 50 markers uniformly distributed on a chromosome of average length (i.e., 150 cM), the use of LOD score 3.0 entails an estimated chromosomewide type I error rate of .00574, leading to a genomewide significance level >.05. In contrast, the corresponding type I error for LOD score 3.6 is .00191, giving a genomewide significance level of slightly <.05. However, with a larger sample size and a shorter chromosome, a LOD score between 3.0 and 3.6 may be preferred, on the basis of proximity to the targeted type I error. In terms of reliability, these two LOD-score criteria appear not to have appreciable differences. These simulation experiments also identified factors that influence power and reliability, shedding light on the design of genome-scan studies.
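
The step from a chromosomewide to a genomewide error rate can be illustrated with a rough calculation. The abstract does not state how the authors made this conversion, so the sketch below rests on assumptions that are not from the source: a genome of 22 autosomes, each of the average length used in the typical setup, with false positives arising independently on each chromosome. The function name `genomewide_error` and the parameter `n_chromosomes` are hypothetical.

```python
def genomewide_error(chromosomewide_alpha: float, n_chromosomes: int = 22) -> float:
    """Chance of at least one false positive across the whole genome.

    Assumption (not from the abstract): n_chromosomes independent
    chromosomes that all share the same chromosomewide type I error rate.
    """
    return 1.0 - (1.0 - chromosomewide_alpha) ** n_chromosomes


# Chromosomewide type I error rates quoted in the abstract for the
# typical setup (50 informative meioses, 50 markers, 150 cM).
reported_rates = {3.0: 0.00574, 3.6: 0.00191}

for lod, alpha in reported_rates.items():
    approx = genomewide_error(alpha)
    side = "above" if approx > 0.05 else "below"
    print(f"LOD {lod}: chromosomewide {alpha}, genomewide about {approx:.3f} ({side} .05)")
```

Under these assumptions the LOD 3.0 rate lands above .05 and the LOD 3.6 rate lands below it, which agrees in direction with the abstract; the exact genomewide values in the paper may differ because the authors' genome model is not given here.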