Single-nucleotide polymorphisms (SNPs) are the most frequently found DNA
sequence variations in the human genome. It has been argued that a dense set
of SNP markers can be used to identify genetic factors associated with
complex disease traits. Because all high-throughput genotyping methods require
precise sequence knowledge of the SNPs, any SNP discovery approach must
determine both the DNA sequence and the allele frequencies. Furthermore,
high-throughput genotyping also requires a genomic DNA amplification
step, making it necessary to develop sequence-tagged sites (STSs) that
amplify only the DNA fragment containing the SNP and nothing else from the
rest of the genome. In this report, we demonstrate the utility of a
SNP-screening approach that yields the DNA sequence and allele frequency
information while screening out duplications with minimal cost and effort.
Our approach
is based on the use of a homozygous complete hydatidiform mole (CHM) as the
reference. With this homozygous reference, one can identify and estimate
the allele frequencies of common SNPs with a pooled DNA-sequencing approach
(rather than having to sequence numerous individuals as is commonly done).
More importantly, the CHM reference is preferable to a single-individual
reference because it readily reveals any duplicated regions of the genome
amplified by the PCR assay before the duplicated sequences are found in
GenBank. This approach reduces the cost of SNP discovery by 60% and
eliminates the costly development of SNP markers that cannot be amplified
uniquely from the genome.
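
The two checks described above can be illustrated with a minimal sketch. It
is not the procedure used in this report; it assumes that the sequencing
trace at a site has already been reduced to one peak height per base, that
relative peak height is an acceptable proxy for allele proportion in the
pool, and that a homozygous reference should show a single base at every
position. The function names and the 0.2 threshold are hypothetical.

```python
from typing import Dict

Peaks = Dict[str, float]  # peak height per base at one trace position


def estimate_allele_frequency(pool: Peaks, ref_base: str, alt_base: str) -> float:
    """Estimate the frequency of alt_base in a pooled DNA sample.

    ref_base is the base seen in the homozygous reference; alt_base is the
    second base seen only in the pool. The estimate is the alternate peak's
    share of the combined signal (an assumption; real traces need
    base-specific signal correction).
    """
    total = pool[ref_base] + pool[alt_base]
    if total <= 0:
        raise ValueError("no signal at this position")
    return pool[alt_base] / total


def looks_duplicated(reference: Peaks, min_secondary_fraction: float = 0.2) -> bool:
    """Flag a position where the homozygous reference shows two bases.

    A second substantial peak in a homozygous sample cannot be
    heterozygosity, so it suggests the assay amplified more than one
    genomic copy. The threshold is a hypothetical choice.
    """
    heights = sorted(reference.values(), reverse=True)
    total = sum(heights)
    if total <= 0:
        raise ValueError("no signal at this position")
    return heights[1] / total >= min_secondary_fraction
```

For example, a pooled position with peak heights A = 700 and G = 300 against
a reference base of A gives an estimated G frequency of 0.3 under these
assumptions, whereas a reference position with two peaks of comparable
height would be flagged as a likely duplication.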