M.A. Province, A single, sequential, genome-wide test to identify simultaneously all promising areas in a linkage scan, GENET EPID, 19(4), 2000, pp. 301-322
Inflation of type I error occurs when conducting a large number of statistical
tests in genome-wide linkage scans. Stringent alpha-levels protect against the
high numbers of expected false positives but at the cost of more false
negatives. A more balanced tradeoff is provided by the theory of sequential
analysis, which can be used in a genome scan even when the data are collected
using a fixed-sample design. Sequential tests allow complete, simultaneous
control of both the type I and II errors of each individual test while using
the smallest possible sample size for analysis. For fixed samples, the excess
N "saved" can be used in a confirmatory, replication phase of the original
findings. Using the theory of sequential multiple decision procedures
[Bechhofer et al., 1968], we can replace the series of individual marker tests
with a new single, simultaneous genome-wide test that has multiple possible
outcomes and partitions all markers into two subsets: the "signal" versus the
"noise," with an a priori specifiable genome-wide error rate. These tests are
demonstrated for the Haseman-Elston approach, are applied to real data, and
are contrasted with traditional fixed-sampling tests in Monte Carlo
simulations of repeated genome-wide scans. The method allows efficient
identification of the true signals in a genome scan, uses the smallest
possible sample sizes, saves the excess to confirm those findings, controls
both types of error, and provides one elegant solution to the debate over the
best way to balance between false positives and negatives in genome scans.
Genet. Epidemiol. 19:301-322, 2000. (C) 2000 Wiley-Liss, Inc.
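
As a rough illustration of how a single sequential test can control both error types at once, the sketch below implements Wald's classical sequential probability ratio test for a normal mean with known variance. It is not the Haseman-Elston procedure or the multiple decision procedure of the paper; the function name `sprt_normal_mean`, its parameters, and the use of Wald's approximate stopping boundaries are all assumptions made for this example.

```python
import math


def sprt_normal_mean(xs, mu0, mu1, sigma, alpha=0.05, beta=0.05):
    """Wald SPRT of H0: mean = mu0 versus H1: mean = mu1 (known sigma).

    Hypothetical helper for illustration only. Returns (decision, n),
    where n is the number of observations consumed before stopping.
    """
    # Wald's approximate boundaries on the log-likelihood-ratio scale.
    upper = math.log((1 - beta) / alpha)   # crossing it rejects H0
    lower = math.log(beta / (1 - alpha))   # crossing it accepts H0

    llr = 0.0
    n = 0
    for n, x in enumerate(xs, start=1):
        # Log-likelihood-ratio contribution of one normal observation.
        llr += (mu1 - mu0) / sigma ** 2 * (x - (mu0 + mu1) / 2)
        if llr >= upper:
            return "reject H0", n
        if llr <= lower:
            return "accept H0", n
    # Data ran out before either boundary was crossed.
    return "continue", n


if __name__ == "__main__":
    # Fixed sample of 10 observations analysed sequentially.
    decision, used = sprt_normal_mean([1.0] * 10, mu0=0.0, mu1=1.0, sigma=1.0)
    print(decision, used)
```

When the data come from a fixed sample, as in the usage lines above, the observations after the stopping index are never examined; in the abstract's terms these are the excess N "saved" for a replication phase.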