Automatic discovery of regulatory patterns in promoter regions based on whole cell expression data and functional annotation

Citation
L.J. Jensen and S. Knudsen, Automatic discovery of regulatory patterns in promoter regions based on whole cell expression data and functional annotation, Bioinformatics, 16(4), 2000, pp. 326-333
Number of citations
26
Subject Categories
Multidisciplinary
Journal title
BIOINFORMATICS
ISSN journal
1367-4803
Volume
16
Issue
4
Year of publication
2000
Pages
326 - 333
Database
ISI
SICI code
1367-4803(200004)16:4<326:ADORPI>2.0.ZU;2-Q
Abstract
Motivation: The whole genomes submitted to GenBank contain valuable information about the function of genes as well as the upstream sequences, and whole cell expression provides valuable information on gene regulation. To utilize these large amounts of data for a biological understanding of the regulation of gene expression, new automatic methods for pattern finding are needed.
Results: Two word-analysis algorithms for automatic discovery of regulatory sequence elements have been developed. We show that sequence patterns correlated to whole cell expression data can be found using Kolmogorov-Smirnov tests on the raw data, thereby eliminating the need for clustering co-regulated genes. Regulatory elements have also been identified by systematic calculations of the significance of correlations between words found in the functional annotation of genes and DNA words occurring in their promoter regions. Application of these algorithms to the Saccharomyces cerevisiae genome and publicly available DNA array data sets revealed a highly conserved 9-mer occurring in the upstream regions of genes coding for proteasomal subunits. Several other putative and known regulatory elements were also found.
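Illustrative sketch (not the authors' implementation): the abstract only states that Kolmogorov-Smirnov tests are applied to the raw expression data. One plausible reading is that, for each DNA word, the expression values of genes whose upstream region contains the word are compared with those of genes lacking it. The Python below follows that reading. The function names, the word length `k`, the `min_genes` cutoff, the asymptotic p-value approximation, and the toy sequences and expression values are all assumptions made for illustration.

```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic D: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_pvalue(d, n, m):
    """Asymptotic p-value from the Kolmogorov distribution (rough for small samples)."""
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.3:
        return 1.0  # the alternating series is unreliable here; p is essentially 1
    total = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
                for j in range(1, 101))
    return min(1.0, max(0.0, 2.0 * total))


def rank_words(upstream, expression, k=4, min_genes=2):
    """Rank k-mers by KS p-value: genes containing the word vs. genes lacking it."""
    kmers = {seq[i:i + k] for seq in upstream.values()
             for i in range(len(seq) - k + 1)}
    results = []
    for word in kmers:
        with_word = [expression[g] for g, s in upstream.items() if word in s]
        without = [expression[g] for g, s in upstream.items() if word not in s]
        if len(with_word) < min_genes or len(without) < min_genes:
            continue  # too few genes on one side for a meaningful comparison
        d = ks_statistic(with_word, without)
        results.append((ks_pvalue(d, len(with_word), len(without)), d, word))
    return sorted(results)


# Hypothetical toy data: upstream sequences and one expression value per gene.
upstream = {
    "g1": "TTACGTTT", "g2": "CCACGTCC", "g3": "GGACGTGG",
    "g4": "TTTTTTTT", "g5": "CCCCCCCC", "g6": "GGGGGGGG",
}
expression = {"g1": 2.1, "g2": 1.8, "g3": 2.5, "g4": -0.1, "g5": 0.2, "g6": 0.0}

ranked = rank_words(upstream, expression)
for p, d, word in ranked:
    print(word, round(d, 3), round(p, 4))
```

Because the test compares the full distributions of raw values, no prior grouping of genes into co-regulated clusters is needed, which is the advantage the abstract highlights. A real analysis would also need a correction for testing many words at once, which this sketch omits.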