Motivation: Despite the growing volume of data on primary nucleotide sequences, the function of regulatory regions remains a major puzzle. Numerous recognition programs that consider diverse properties of regulatory regions have been developed. The system proposed here reveals the specific contextual, conformational and physico-chemical properties of these regions through analysis of extended DNA regions.
Results: The Internet-accessible computer system RegScan, designed to analyse the extended regulatory regions of eukaryotic genes, has been developed. The computer system comprises the following software: (i) classification programs that divide a set of promoters into TATA-containing and TATA-less promoters and into promoters with and without CpG islands; (ii) programs for constructing (a) nucleotide frequency profiles, (b) sequence complexity profiles and (c) profiles of conformational and physico-chemical properties; (iii) a program for constructing sets of degenerate oligonucleotide motifs of a specified length; and (iv) a program for searching for and visualising repeats in nucleotide sequences. The system has allowed us to demonstrate the following characteristic patterns of vertebrate promoter regions: the TATA box region is flanked by regions with an increased G+C content and increased bending stiffness, the TATA box content is asymmetric and promoter regions are saturated with both direct and inverted repeats.
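As an illustration of the kind of profile listed under (ii), the following minimal sketch computes a sliding-window G+C content profile and averages it over a set of aligned, equal-length sequences. It is not the RegScan implementation: the function names gc_profile and mean_profile, the window size and the toy sequences are assumptions made for this example only.

```python
def gc_profile(seq, window=5):
    """Return the G+C fraction of every sliding window along seq.

    Hypothetical helper for illustration; not part of RegScan.
    """
    seq = seq.upper()
    if window <= 0 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    profile = []
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        gc_count = sum(1 for base in chunk if base in "GC")
        profile.append(gc_count / window)
    return profile


def mean_profile(seqs, window=5):
    """Average per-sequence G+C profiles position by position.

    Assumes the sequences are already aligned (for example, on the
    transcription start site) and have equal length.
    """
    if not seqs:
        raise ValueError("at least one sequence is required")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must have equal length")
    profiles = [gc_profile(s, window) for s in seqs]
    return [sum(column) / len(column) for column in zip(*profiles)]


if __name__ == "__main__":
    # Toy sequences invented for this example, not real promoters.
    toy = ["GCGCGTATAAAGGCGC", "CCGGCTATATAGCGCC"]
    for position, value in enumerate(mean_profile(toy, window=4)):
        print(position, round(value, 2))
```

In such a profile, a G+C-rich flank around an A+T-rich core would appear as high values on either side of a dip, which is the type of pattern reported above for the TATA box region.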
Availability: The computer system RegScan is available via the Internet at http://www.mgs.bionet.nsc.ru/Systems/RegScan, http://www.cbil.upenn.edu/mgs/systems/regscan/
Contact: bob@bionet.nsc.ru.