S. M. Kielbasa et al., Combining frequency and positional information to predict transcription factor binding sites, Bioinformatics, 17(11), 2001, pp. 1019-1026
Motivation: Even though a number of genome projects have been finished on
the sequence level, still only a small proportion of DNA regulatory elements
have been identified. Growing amounts of gene expression data provide the
possibility of finding coregulated genes by clustering methods. By analysis
of the promoter regions of those genes, rather weak signals of
transcription factor binding sites may be detected.
Results: We introduce the new algorithm ITB, an Integrated Tool for Box
finding, which combines frequency and positional information to predict
transcription factor binding sites in upstream regions of coregulated genes.
Motifs are extracted by exhaustive analysis of regular expression-like
patterns and by estimating probabilities of positional clusters of motifs.
ITB detects consensus sequences of experimentally verified transcription
factor binding sites of the yeast Saccharomyces cerevisiae. Moreover, a
number of new binding site candidates with significant scores are predicted.
Besides applying ITB on yeast upstream regions, the program is run on human
promoter sequences.
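
The idea of combining frequency and positional information can be illustrated with a minimal sketch. This is not the ITB implementation, whose scoring details the abstract does not give. It is an illustrative simplification under stated assumptions: all upstream sequences have equal length and are aligned at the gene start, the "regular expression-like" patterns are limited to k-mers with at most one inner wildcard `N`, and the positional score is a plain binomial tail probability under uniform placement, with no correction for picking the best window. All function names are hypothetical.

```python
from collections import defaultdict
from math import comb


def pattern_variants(kmer):
    """Yield the k-mer plus variants with one inner position set to wildcard 'N'."""
    yield kmer
    for i in range(1, len(kmer) - 1):
        yield kmer[:i] + "N" + kmer[i + 1:]


def collect_hits(seqs, k):
    """Exhaustively map every pattern to its (sequence index, start position) hits."""
    hits = defaultdict(list)
    for s_idx, seq in enumerate(seqs):
        for pos in range(len(seq) - k + 1):
            for pat in pattern_variants(seq[pos:pos + k]):
                hits[pat].append((s_idx, pos))
    return hits


def binomial_tail(n, x, p):
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(x, n + 1))


def best_window(positions, window):
    """Return (start, count) of the width-`window` interval holding the most hits."""
    pts = sorted(positions)
    best = (0, 0)
    lo = 0
    for hi in range(len(pts)):
        # Shrink from the left until all points fit in [pts[lo], pts[lo] + window).
        while pts[hi] - pts[lo] >= window:
            lo += 1
        if hi - lo + 1 > best[1]:
            best = (pts[lo], hi - lo + 1)
    return best


def score_pattern(pattern_hits, n_positions, window):
    """Return (sequences hit, hits in best window, positional tail probability).

    Assumption: under the null model each hit falls uniformly on one of
    `n_positions` start positions, so a fixed window is hit with
    probability window / n_positions.
    """
    n_seqs_hit = len({s for s, _ in pattern_hits})
    positions = [p for _, p in pattern_hits]
    _, in_window = best_window(positions, window)
    p_window = min(1.0, window / n_positions)
    return n_seqs_hit, in_window, binomial_tail(len(positions), in_window, p_window)
```

A pattern that occurs in many of the coregulated upstream regions (frequency) and whose occurrences cluster at similar distances from the gene start (position) receives a small tail probability. A realistic scorer would also need a background model for the frequency term and a multiple-testing correction over all patterns and windows.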