Genome-wide analysis of single-nucleotide polymorphisms in human expressed sequences

Citation
K. Irizarry et al., Genome-wide analysis of single-nucleotide polymorphisms in human expressed sequences, NAT GENET, 26(2), 2000, pp. 233-236
Number of citations
22
Subject Categories
Molecular Biology & Genetics
Journal title
NATURE GENETICS
Journal ISSN
1061-4036
Volume
26
Issue
2
Year of publication
2000
Pages
233 - 236
Database
ISI
SICI code
1061-4036(200010)26:2<233:GAOSPI>2.0.ZU;2-N
Abstract
Single-nucleotide polymorphisms (SNPs) have been explored as a high-resolution marker set for accelerating the mapping of disease genes (1-11). Here we report 48,196 candidate SNPs detected by statistical analysis of human expressed sequence tags (ESTs), associated primarily with coding regions of genes. We used Bayesian inference to weigh evidence for true polymorphism versus sequencing error, misalignment or ambiguity, misclustering or chimaeric EST sequences, assessing data such as raw chromatogram height, sharpness, overlap and spacing, sequencing error rates, context-sensitivity and cDNA library origin. Three separate validations (comparison with 54 genes screened for SNPs independently, verification of HLA-A polymorphisms, and restriction fragment length polymorphism (RFLP) testing) verified 70%, 89% and 71% of our predicted SNPs, respectively. Our method detects tenfold more true HLA-A SNPs than previous analyses of the EST data. We found SNPs in a large fraction of known disease genes, including some disease-causing mutations (for example, the HbS sickle-cell mutation). Our comprehensive analysis of human coding region polymorphism provides a public resource for mapping of disease genes (available at http://www.bioinformatics.ucla.edu/snp).
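
The general idea of weighing polymorphism against sequencing error with Bayesian inference can be illustrated with a deliberately simplified sketch. This is not the authors' model: it compares only two hypotheses at a single alignment column and uses only per-read error rates, whereas the abstract also mentions chromatogram features, context-sensitivity, misclustering and library origin. The function name `snp_posterior`, the prior, the assumed minor-allele frequency, and the uniform split of errors across the three other bases are all assumptions made for illustration.

```python
from math import prod


def snp_posterior(bases, error_rates, major, minor,
                  prior=0.001, minor_freq=0.2):
    """Posterior probability that an alignment column is polymorphic.

    Hypothetical two-hypothesis model (an assumption, not the paper's):
      H0: every read truly carries `major`; other bases are errors.
      H1: each read carries `minor` with probability `minor_freq`,
          otherwise `major`; observations may still be errors.
    `error_rates[i]` is the assumed error probability of read i.
    """
    def p_obs(observed, true_base, err):
        # An error is assumed equally likely to yield any of the 3 other bases.
        return 1.0 - err if observed == true_base else err / 3.0

    pairs = list(zip(bases, error_rates))

    # Likelihood of the column if there is no polymorphism.
    like_h0 = prod(p_obs(b, major, e) for b, e in pairs)

    # Likelihood if two alleles are present, mixing over the allele of each read.
    like_h1 = prod(
        (1.0 - minor_freq) * p_obs(b, major, e) + minor_freq * p_obs(b, minor, e)
        for b, e in pairs
    )

    numerator = prior * like_h1
    denominator = numerator + (1.0 - prior) * like_h0
    if denominator == 0.0:
        raise ValueError("observations are impossible under both hypotheses")
    return numerator / denominator


# Eight reads show A and two show G; only the quality of the G reads differs.
column = "AAAAAAAAGG"
confident = snp_posterior(column, [0.01] * 10, "A", "G")
doubtful = snp_posterior(column, [0.01] * 8 + [0.2] * 2, "A", "G")
print(confident > doubtful)
```

Under these assumptions, the same two minor-allele reads count as much weaker evidence for a SNP when their error rates are high, which is the kind of trade-off the abstract describes in qualitative terms.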