We report a simple method for measuring the accessible conformational space
explored by an ensemble of protein structures. The method is useful for
diverse ensembles derived from molecular dynamics trajectories, molecular
modeling, and molecular structure determinations. It can be used to examine a
wide range of time scales. The central tactic we use, which has been
previously employed, is to replace the true mechanical degrees of freedom of
a molecular system with the conformationally effective degrees of freedom as
measured by the root-mean-square Cartesian distances among all pairs of
conformations. Each protein conformation is treated as a point in a
high-dimensional Euclidean space. In this article, we model this space in a
novel way
by representing it as an N-dimensional hypercube, describable with only two
parameters: the number of dimensions and the edge length. To validate this
approach, we provide a number of elementary test cases and then use the
N-cube method to measure the size and shape of the conformational space
covered by molecular dynamics trajectories spanning 10 orders of magnitude
in time. These calculations were performed on a small protein, the villin
headpiece subdomain, exploring both the native state and the
misfolded/folding regime. Distinct features include single, vibrationally
averaged substate minima on the 0.1-1 ps time scale; thermally averaged
conformational states that persist for 1-100 ps; and transitions between
these local minima on nanosecond time scales. Large-scale refolding modes
appear to become uncorrelated on the microsecond time scale. Associated
length scales for these events
are 0.2 Å for the vibrational minima; 0.5 Å for the conformational minima;
and 1-2 Å for the nanosecond events. We find that the
conformational space that is dynamically accessible during folding of villin
has enough volume for ~10^9 minima of the variety that persist for
picoseconds. Molecular dynamics trajectories of the native protein and
experimentally derived solution ensembles suggest the native state to be
composed of ~10^2 of these thermally accessible minima. Thus, based on
random exploration of accessible folding space alone, protein folding
for a small protein is predicted to be a millisecond time scale event. This
time can be compared with the experimental folding time for villin of
10-100 μs. One possible explanation for the 10-100-fold discrepancy is that
the slope of the "folding funnel" increases the rate 1-2 orders of magnitude
above random exploration of substates. © 2001 Wiley-Liss, Inc.
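
To illustrate the all-pairs distance measure the abstract is built on, here is a minimal sketch in Python. It is not the authors' implementation. It assumes every conformation has already been superimposed on a common reference frame and is given as a list of (x, y, z) atom coordinates in the same atom order. The function names `rmsd` and `pairwise_rmsd` are hypothetical.

```python
import math


def rmsd(conf_a, conf_b):
    """Root-mean-square Cartesian distance between two conformations.

    Assumption: both conformations are pre-aligned and list the same atoms
    in the same order; no superposition is performed here.
    """
    if len(conf_a) != len(conf_b):
        raise ValueError("conformations must have the same number of atoms")
    squared_sum = 0.0
    for atom_a, atom_b in zip(conf_a, conf_b):
        # Add the squared distance between corresponding atoms.
        squared_sum += sum((p - q) ** 2 for p, q in zip(atom_a, atom_b))
    return math.sqrt(squared_sum / len(conf_a))


def pairwise_rmsd(ensemble):
    """Symmetric matrix of RMSD values among all pairs of conformations."""
    n = len(ensemble)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            distance = rmsd(ensemble[i], ensemble[j])
            matrix[i][j] = distance
            matrix[j][i] = distance
    return matrix
```

A matrix of this kind is the all-pairs distance data the abstract refers to. How the number of dimensions and the edge length of the N-cube are fit to these distances is not stated in the abstract and is not attempted here.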