Al. Berman et al., UNDERLYING ORDER IN PROTEIN-SEQUENCE ORGANIZATION, Proceedings of the National Academy of Sciences of the United States of America, 91(9), 1994, pp. 4044-4047
The idea of a possible standard modular structure of proteins has been
known since 1929 when it was introduced by Svedberg. It still remains
an idea with no quantitative confirmation of universality of such
hypothetical organization. From a large collection of nonredundant
protein sequences representing >100 eukaryotic and prokaryotic species,
we have obtained the protein sequence length distributions. Mere
inspection of these distributions, as well as spectral analysis, shows
that 15-30% of proteins, depending on species and sequence types,
indeed appear to be made of sequence units with characteristic lengths
of approximately 125 aa for eukaryotes and approximately 150 aa for
prokaryotes. This underlying order in protein sequence organization is
shown to be universal; that is, the weak regularity observed is not
caused by a particular dominant species or protein group. Possible
mechanisms are discussed that may be responsible for the observed
regularity, including a hypothesis about the recombinational nature of
such protein sequence organization.
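
The abstract does not describe how the spectral analysis was carried out. As a rough illustration of the general idea only, the sketch below builds a histogram of sequence lengths, subtracts a moving-average envelope, and scans the Fourier power over candidate periods. Everything in it is an assumption made for illustration: the function names, the 251-bin detrending window, and the synthetic data (a smooth gamma-distributed background plus 25% of lengths clustered near multiples of 125 aa). It is not the authors' procedure and uses none of their data.

```python
import math
import random


def length_histogram(lengths, max_len):
    """Count sequences at each length 0..max_len (longer ones are ignored)."""
    counts = [0] * (max_len + 1)
    for n in lengths:
        if 0 <= n <= max_len:
            counts[n] += 1
    return counts


def detrend(counts, window):
    """Subtract a centered moving average to remove the smooth envelope."""
    half = window // 2
    residual = []
    for i in range(len(counts)):
        lo = max(0, i - half)
        hi = min(len(counts), i + half + 1)
        residual.append(counts[i] - sum(counts[lo:hi]) / (hi - lo))
    return residual


def power_at_period(signal, period):
    """Power of the Fourier component whose period is `period` residues."""
    w = 2.0 * math.pi / period
    re = sum(x * math.cos(w * i) for i, x in enumerate(signal))
    im = sum(x * math.sin(w * i) for i, x in enumerate(signal))
    return (re * re + im * im) / len(signal)


def dominant_period(signal, periods):
    """Return the candidate period with the largest Fourier power."""
    return max(periods, key=lambda p: power_at_period(signal, p))


def synthetic_lengths(seed=0, n_background=30000, n_modular=10000, unit=125):
    """Hypothetical test data, NOT real proteins: a smooth background plus
    a minority of lengths scattered around small multiples of `unit`."""
    rng = random.Random(seed)
    lengths = [int(rng.gammavariate(2.0, 150.0)) for _ in range(n_background)]
    for _ in range(n_modular):
        k = rng.randint(1, 6)  # number of units in this sequence
        lengths.append(round(k * unit + rng.gauss(0.0, 10.0)))
    return lengths


if __name__ == "__main__":
    counts = length_histogram(synthetic_lengths(), max_len=1500)
    residual = detrend(counts, window=251)  # window size is an assumption
    best = dominant_period(residual, range(100, 201))
    print("strongest period between 100 and 200 aa:", best)
```

With real data, the same scan could be run separately on eukaryotic and prokaryotic length lists to compare the strongest periods. A direct sum over candidate periods is used instead of an FFT so that integer periods in residues can be tested directly and no third-party library is needed.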