Since 1929, the concept that proteins are built from subunits of a
certain standard size (Svedberg 1929) has been revisited several times,
each
time with a new demonstration that there are indeed certain preferred
protein sizes. According to recent estimates, the overrepresented sizes
are close to multiples of 125 amino acid (aa) residues for eukaryotes
and 150 residues for prokaryotes. To explain these preferences, a
hypothesis on the recombinational nature of this regularity is
proposed and developed quantitatively. The protein-coding sequences are
assumed to have evolved at some early stage via recombinational events:
insertions of DNA circles of a certain optimal size. The contour
lengths of the protein-coding DNA circles had to be divisible both by
three and, to minimize torsional constraint, by the DNA helical repeat.
With these two conditions satisfied, the calculated contour lengths of
the DNA circles, 250-500 base pairs (bp), turn out to correspond well
to known optimal DNA circularization sizes and to the predicted range
of the protein sequence subunit sizes: 80-170 aa residues, which covers
the experimentally observed values. The subunit size is found to be
strongly influenced by the helical repeat of DNA. The sizes 125 and
150 aa are derived when the corresponding helical repeats of DNA are
set to within fractions of a per mille of the 10.54 bp/turn value.
This agrees with
the experimentally estimated mean for natural mixed DNA sequences,
10.53-10.57 bp/turn. The suggested recombinational mechanism thus not
only gives a qualitative explanation for the observed underlying order
in the protein sequences but also quantitatively links the observed
protein sequence sizes with the optimal DNA circularization size and
with
the helical repeat of DNA. It also offers a versatile molecular model
of early protein evolution by fusion and insertion of preexisting
proteins of standard subunit sizes.
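
The correspondence between DNA circle lengths and protein subunit sizes
quoted above rests on the three-bases-per-residue rule. The following
minimal sketch only illustrates that unit conversion; it does not
reproduce the quantitative model of the paper, and in particular it
ignores the helical-repeat condition. The function and variable names
are illustrative assumptions, not taken from the source.

```python
# Illustrative sketch only: converts between DNA length (bp) and protein
# length (aa) using the codon size. All names here are hypothetical.
BASES_PER_CODON = 3


def residues_from_bp(length_bp: int) -> int:
    """Number of whole codons (amino acid residues) encoded by length_bp."""
    return length_bp // BASES_PER_CODON


def bp_from_residues(n_residues: int) -> int:
    """Coding length in base pairs for a protein of n_residues residues."""
    return n_residues * BASES_PER_CODON


# Range of DNA circle contour lengths quoted in the text.
circle_range_bp = (250, 500)
subunit_range_aa = tuple(residues_from_bp(bp) for bp in circle_range_bp)

# Preferred subunit sizes quoted in the text (eukaryotes, prokaryotes).
preferred_bp = {aa: bp_from_residues(aa) for aa in (125, 150)}

print(subunit_range_aa)  # (83, 166)
print(preferred_bp)      # {125: 375, 150: 450}
```

Under this conversion, circles of 250-500 bp encode roughly 83-166
residues, consistent with the rounded 80-170 aa range, and the
preferred sizes of 125 and 150 aa correspond to 375 and 450 bp, both
within the quoted range of circle lengths.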