Predicting gene regulatory elements in silico on a genomic scale

Citation
A. Brazma et al., Predicting gene regulatory elements in silico on a genomic scale, GENOME RES, 8(11), 1998, pp. 1202-1215
Number of citations
35
Subject categories
Molecular Biology & Genetics
Journal title
GENOME RESEARCH
Journal ISSN
10889051
Volume
8
Issue
11
Year of publication
1998
Pages
1202 - 1215
Database
ISI
SICI code
1054-9803(199811)8:11<1202:PGREIS>2.0.ZU;2-K
Abstract
We performed a systematic analysis of gene upstream regions in the yeast genome for occurrences of regular expression-type patterns with the goal of identifying potential regulatory elements. To achieve this goal, we have developed a new sequence pattern discovery algorithm that searches exhaustively for a priori unknown regular expression-type patterns that are over-represented in a given set of sequences. We applied the algorithm in two cases: (1) discovery of patterns in the complete set of >6000 sequences taken upstream of the putative yeast genes and (2) discovery of patterns in the regions upstream of the genes with similar expression profiles. In the first case, we looked for patterns that occur more frequently in the gene upstream regions than in the genome overall. In the second case, we first clustered the upstream regions of all the genes by similarity of their expression profiles on the basis of publicly available gene expression data and then looked for sequence patterns that are over-represented in each cluster. In both cases we considered each pattern that occurred in at least some minimum number of sequences, and rated the patterns on the basis of their over-representation. Among the highest-rated patterns, most have matches to substrings in known yeast transcription factor-binding sites. Moreover, several of them are known to be relevant to the expression of the genes from the respective clusters. Experiments on simulated data show that the majority of the discovered patterns are not expected to occur by chance.
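
The rating step described in the abstract (keep patterns that occur in at least some minimum number of sequences, then rank them by over-representation) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it enumerates only fixed-length substrings rather than regular expression-type patterns, compares against a background set of sequences rather than the genome overall, and scores each pattern by a simple frequency ratio with add-one smoothing. The function names, the parameters k and min_seqs, and the scoring formula are all assumptions made for illustration.

```python
from collections import Counter


def count_sequences_with(seqs, k):
    """For each length-k substring, count how many sequences contain it."""
    counts = Counter()
    for s in seqs:
        # A set, so each substring is counted at most once per sequence.
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return counts


def rate_patterns(target, background, k=4, min_seqs=2):
    """Rank substrings by how over-represented they are in `target`.

    Hypothetical score: fraction of target sequences containing the
    substring, divided by a smoothed fraction of background sequences
    containing it (add-one smoothing avoids division by zero).
    """
    in_target = count_sequences_with(target, k)
    in_background = count_sequences_with(background, k)
    rated = []
    for pattern, n in in_target.items():
        if n < min_seqs:
            continue  # pattern is too rare in the target set to consider
        f_target = n / len(target)
        f_background = (in_background[pattern] + 1) / (len(background) + 1)
        rated.append((f_target / f_background, pattern, n))
    # Highest score first; ties broken alphabetically for a stable order.
    rated.sort(key=lambda item: (-item[0], item[1]))
    return rated


if __name__ == "__main__":
    # Toy sequences, invented for this example only.
    upstream = ["ACGTAA", "TTACGT", "GACGTC"]
    other = ["TTTTTT", "GGGGGG", "ACCCCC", "TTTGGG"]
    for score, pattern, n in rate_patterns(upstream, other):
        print(pattern, n, round(score, 2))
```

In the clustered case from the abstract, the target set would hold the upstream regions of one expression cluster and the background set would hold the remaining upstream regions. A realistic analysis would also need a statistical significance measure rather than a raw ratio.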