We use a three-dimensional lattice model of proteins to investigate
systematically the global properties of the polypeptide chains that
determine the folding to the native conformation starting from an ensemble
of denatured conformations. In the coarse-grained description, the
polypeptide chain is modeled as a heteropolymer consisting of N beads
confined to the vertices of a simple cubic lattice. The interactions
between the beads are drawn at random from a Gaussian distribution of
energies, with a mean value B_0 < 0 that corresponds to the overall average
hydrophobic interaction energy. We studied 56 sequences, all with a
unique ground state (native conformation), covering two values of N
(15 and 27) and two values of B_0. The smaller value of |B_0| was
chosen so that the average fraction of hydrophobic residues
corresponds to that found in natural proteins. The larger value of
|B_0| was selected with
the expectation that only the fully compact conformations would
contribute to the thermodynamic behavior. For N = 15 the entire
conformation space (compact as well as noncompact structures) can be
exhaustively
enumerated so that the thermodynamic properties can be exactly
computed at all temperatures. The thermodynamic properties for the
27-mer chain were calculated using the slow cooling technique together
with standard Monte Carlo simulations. The kinetics of approach to the
native
state for all the sequences was obtained using Monte Carlo
simulations. For all sequences we find that there are two intrinsic
characteristic temperatures, namely, T_theta and T_f. At the
temperature T_theta the polypeptide chain makes a transition to a
collapsed structure, while at T_f the chain undergoes a transition to
the native conformation. We show that foldability of sequences can be
characterized entirely in terms of these two temperatures. It is shown
that fast folding sequences have small values of
sigma = (T_theta - T_f)/T_theta, whereas slow folders have larger
values of sigma (the range of sigma is 0 < sigma < 1). The calculated
values
of the folding times correlate extremely well with sigma. An increase
in sigma from 0.1 to 0.7 can result in an increase of 5-6 orders of
magnitude in folding times. In contrast, we demonstrate that there is
no useful correlation between folding times and the energy gap between
the native conformation and the first excited state at any N for any
value of
B_0. In particular, in the parameter space of the model, many
sequences with varying energy gaps, all with roughly the same folding
time, can be easily engineered. Folding sequences in this model,
therefore, can be classified based solely on the value of sigma. Fast
folders have
small values of sigma (typically less than about 0.1), moderate
folders have values of sigma in the approximate range between 0.1 and
0.6, while for slow folders sigma exceeds 0.6. The precise boundary
between
these categories depends crucially on N and on the model. The range of
sigma for fast folders decreases with the length of the chain. At
temperatures close to T_f, fast folders reach the native conformation
via
a native conformation nucleation collapse mechanism without forming
any detectable intermediates, whereas for moderate folders only a
fraction of molecules, Phi(T), reaches the native conformation by this
process. The remaining fraction reaches the native state via a
three-stage multipathway process. For slow folders Phi(T) is close to
zero at all temperatures. The simultaneous requirement of native state
stability and
kinetic accessibility can be achieved at high enough temperatures for
those sequences with small values of sigma. The utility of these
results for de novo design of proteins is briefly discussed.
(C) 1996 Wiley-Liss, Inc.
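
The foldability criterion stated in the abstract can be illustrated
with a minimal Python sketch. It assumes that the two characteristic
temperatures (the collapse temperature and the folding temperature)
have already been obtained for a sequence, and it uses the approximate
category boundaries of 0.1 and 0.6 quoted above, which the abstract
notes depend on N and on the model. The function names, the default
thresholds as parameters, and the example temperatures in the usage
note are hypothetical and chosen only for illustration.

```python
def foldability_sigma(t_collapse: float, t_fold: float) -> float:
    """Return sigma = (T_collapse - T_fold) / T_collapse.

    t_collapse: temperature of the transition to a collapsed structure.
    t_fold:     temperature of the transition to the native conformation.
    The abstract quotes the range 0 < sigma < 1, so this sketch assumes
    0 < t_fold < t_collapse and rejects anything else.
    """
    if not (0.0 < t_fold < t_collapse):
        raise ValueError("expected 0 < t_fold < t_collapse")
    return (t_collapse - t_fold) / t_collapse


def classify_folder(sigma: float,
                    fast_max: float = 0.1,
                    slow_min: float = 0.6) -> str:
    """Classify a sequence as a fast, moderate, or slow folder.

    The default boundaries are the approximate values quoted in the
    abstract; they are assumptions here, not universal constants.
    """
    if not (0.0 < sigma < 1.0):
        raise ValueError("sigma is expected to lie in (0, 1)")
    if sigma < fast_max:
        return "fast"
    if sigma <= slow_min:
        return "moderate"
    return "slow"
```

For example, with made-up temperatures in arbitrary units,
classify_folder(foldability_sigma(2.0, 1.0)) evaluates sigma = 0.5 and
places the sequence in the moderate category.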