Identification and characterization of the potential promoter regions of 1031 kinds of human genes

Citation
Y. Suzuki et al., Identification and characterization of the potential promoter regions of 1031 kinds of human genes, GENOME RES, 11(5), 2001, pp. 677-684
Number of citations
30
Subject categories
Molecular Biology & Genetics
Journal title
GENOME RESEARCH
Journal ISSN
1088-9051
Volume
11
Issue
5
Year of publication
2001
Pages
677 - 684
Database
ISI
SICI code
1088-9051(200105)11:5<677:IACOTP>2.0.ZU;2-N
Abstract
To understand the mechanism of transcriptional regulation, it is essential to identify and characterize the promoter, which is located proximal to the mRNA start site. To identify promoters in large volumes of genomic sequence, we used mRNA start sites determined by large-scale sequencing of cDNA libraries constructed by the "oligo-capping" method. We aligned the mRNA start sites with the genomic sequences and retrieved the adjacent sequences as potential promoter regions (PPRs) for 1031 genes. The PPR sequences were searched to determine the frequencies of major promoter elements. Among the 1031 PPRs, 329 (32%) contained TATA boxes, 872 (85%) contained initiators, 999 (97%) contained a GC box, and 663 (64%) contained a CAAT box. Furthermore, 493 (48%) PPRs were located in CpG islands. This frequency of CpG islands was reduced in TATA(+)/Inr(+) PPRs and in the PPRs of ubiquitously expressed genes. In the PPRs of the CGM2 gene, the DRA gene, and the TM30pl genes, which showed highly colon-specific expression patterns, the consensus sequences of E boxes were commonly observed. The PPRs were also useful for exploring promoter SNPs.
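The percentages quoted in the abstract follow from dividing each count by the 1031 PPRs and rounding to a whole number. A minimal sketch of that arithmetic is given here; the dictionary keys and the helper name `percent_of_total` are illustrative labels only, not anything taken from the paper.

```python
# Counts as reported in the abstract; 1031 PPRs is the common denominator.
TOTAL_PPRS = 1031

counts = {
    "TATA box": 329,
    "initiator (Inr)": 872,
    "GC box": 999,
    "CAAT box": 663,
    "CpG island": 493,
}


def percent_of_total(count, total=TOTAL_PPRS):
    """Return count/total as a percentage rounded to a whole number."""
    return round(100 * count / total)


# Recompute every reported percentage from its raw count.
percentages = {name: percent_of_total(n) for name, n in counts.items()}

for name, pct in percentages.items():
    print(f"{name}: {counts[name]} of {TOTAL_PPRS} ({pct}%)")
```

The recomputed values agree with the reported figures of 32%, 85%, 97%, 64%, and 48%.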