Background: It has been observed that single-domain proteins and domains in multidomain proteins favor a chain length in the range of 100-150 amino acids. To understand the origin of this favored size, we construct an empirical function for the free energy of unfolding versus the chain length. The parameters in the function are derived by fitting to the energy of hydration, entropy and enthalpy of unfolding of nine proteins. Our energy function cannot be used to calculate the energetics accurately for individual proteins because the energetics also depend on other factors, such as the composition and the conformation of the protein. Nevertheless, the energy function statistically characterizes the general relationship between the free energy of unfolding and the size of the protein.

Results: The predicted optimal number of residues, which corresponds to the maximum free energy of unfolding, is 100. This is in agreement with a statistical analysis of protein domains derived from their experimental structures. When a chain is too short, our energy function indicates that the change in enthalpy of internal interactions is not favorable enough for folding because of the limited number of inter-residue contacts. A long chain is also unfavorable for a single domain because the cost of configurational entropy increases quadratically as a function of the chain length, whereas the favorable change in enthalpy of internal interactions increases only linearly.

Conclusions: Our study shows that the energetic balance is the dominant factor governing protein sizes and that it forces a large protein to break into several domains during folding.
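
The scaling argument above (a favorable term that grows linearly with chain length against an entropic cost that grows quadratically) can be illustrated with a toy model. The sketch below is not the fitted empirical function described in this abstract, whose form and parameters are not given here. The function names, the constant offset used to mimic the contact deficit of short chains, and all coefficient values are assumptions, chosen only so that the maximum falls at 100 residues.

```python
def unfolding_free_energy(n, a, b, c):
    """Toy free energy of unfolding for a chain of n residues (arbitrary units).

    a * n     : favorable contribution that grows linearly with chain length
    b * n**2  : configurational-entropy cost that grows quadratically
    c         : fixed penalty, an assumed stand-in for the shortage of
                inter-residue contacts in short chains
    """
    return a * n - b * n**2 - c


def optimal_length(a, b):
    """Chain length that maximizes the toy function.

    Setting d/dn (a*n - b*n**2 - c) = a - 2*b*n to zero gives n = a / (2*b).
    """
    return a / (2.0 * b)


# Hypothetical coefficients, picked only so that the optimum is 100 residues.
A, B, C = 2.0, 0.01, 20.0

n_opt = optimal_length(A, B)
stability = {n: unfolding_free_energy(n, A, B, C) for n in (10, 50, 100, 150, 250)}
```

With these assumed coefficients the toy free energy of unfolding is negative for the shortest and the longest chains in the sample and peaks at the optimum, which mirrors the qualitative picture in the Results: short chains lack enough stabilizing interactions, and long chains pay too high an entropic cost to remain a single domain.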