J. Vanhelden et al., EXTRACTING REGULATORY SITES FROM THE UPSTREAM REGION OF YEAST GENES BY COMPUTATIONAL ANALYSIS OF OLIGONUCLEOTIDE FREQUENCIES, Journal of Molecular Biology, 281(5), 1998, pp. 827-842
We present here a simple and fast method allowing the isolation of DNA
binding sites for transcription factors from families of coregulated
genes, with results illustrated in Saccharomyces cerevisiae. Although
conceptually simple, the algorithm proved efficient for extracting,
from most of the yeast regulatory families analyzed, the upstream
regulatory sequences which had been previously found by experimental
analysis. Furthermore, putative new regulatory sites are predicted
within upstream regions of several regulons. The method is based on
the detection of over-represented oligonucleotides. A distinctive
feature of this approach is that it defines the statistical
significance of a site based on tables of oligonucleotide frequencies
observed in all non-coding sequences from the yeast genome. In
contrast with heuristic methods, this oligonucleotide analysis is
rigorous and exhaustive. Its range of detection is
however limited to relatively simple patterns: short motifs with a
highly conserved core. These features seem to be shared by a good
number of regulatory sites in yeast. This, and similar methods, should
be increasingly required to identify unknown regulatory elements
within the
numerous new coregulated families resulting from measurements of gene
expression levels at the genomic scale. All tools described here are
available on the web at the site
http://pan.cifn.unam.mx/Computational_Biology/yeast-tools
(C) 1998 Academic Press.
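
The detection of over-represented oligonucleotides described above can
be illustrated with a minimal sketch. It is not the authors'
implementation: the abstract does not name the statistical test, so a
binomial upper-tail probability is assumed here, and the function
names, the uniform background table, and the toy sequences are all
hypothetical. A real analysis would use frequencies measured in the
non-coding sequences of the yeast genome and would correct for the
number of oligonucleotides tested.

```python
from collections import Counter
from itertools import product
from math import exp, lgamma, log, log1p


def count_oligos(sequences, k):
    """Count every overlapping k-letter word (A/C/G/T only) in the sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            word = seq[i:i + k]
            if set(word) <= set("ACGT"):
                counts[word] += 1
    return counts


def binomial_tail(n, k_obs, p):
    """P(X >= k_obs) for X ~ Binomial(n, p), summed in log space; needs 0 < p < 1."""
    total = 0.0
    for i in range(k_obs, n + 1):
        log_term = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                    + i * log(p) + (n - i) * log1p(-p))
        total += exp(log_term)
    return total


def over_represented(sequences, background_freq, k, threshold):
    """Return (word, occurrences, p_value) tuples, most significant first.

    background_freq maps each word to its expected relative frequency
    (assumption: taken from a precomputed table of non-coding sequences).
    """
    counts = count_oligos(sequences, k)
    n_positions = sum(counts.values())
    hits = []
    for word, occ in counts.items():
        p = background_freq.get(word)
        if not p or p >= 1.0:
            continue  # no usable background estimate for this word
        p_value = binomial_tail(n_positions, occ, p)
        if p_value <= threshold:
            hits.append((word, occ, p_value))
    hits.sort(key=lambda hit: hit[2])
    return hits


# Toy data (hypothetical): three short "upstream regions" sharing one hexamer.
family = ["TTCACGTGAA", "GGCACGTGCC", "ATCACGTGTA"]
# Placeholder background: every hexamer equally likely (an assumption).
background = {"".join(w): 1 / 4 ** 6 for w in product("ACGT", repeat=6)}
hits = over_represented(family, background, k=6, threshold=1e-4)
for word, occ, p_value in hits:
    print(word, occ, p_value)
```

The sketch counts one strand only and treats overlapping occurrences
as independent, both of which are simplifications made for brevity.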