In a non-redundant set of 571 proteins from the Brookhaven Protein Data
Bank, a total of 43 non-proline cis peptide bonds were identified. Average
geometrical parameters of the well-defined cis peptide bonds in proteins
determined at high resolution show that some parameters, most notably the
bond angle at the amide bond nitrogen, deviate significantly from the
corresponding values in the trans conformation. Since the same feature was
observed in cis amide bonds in small molecule structures found in the
Cambridge Structural Database, a new set of parameters for the refinement
of protein structures containing non-Pro cis peptide bonds is proposed.
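As an illustration of how a peptide bond is classified as cis or trans, the following minimal sketch computes the omega dihedral angle (CA(i), C(i), N(i+1), CA(i+1)) from atomic coordinates. The function names and the 30 degree tolerance are assumptions made for this example; the criterion actually used in the study is not stated here.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    d0 = _dot(b0, b1)
    d2 = _dot(b2, b1)
    v = (b0[0] - d0 * b1[0], b0[1] - d0 * b1[1], b0[2] - d0 * b1[2])
    w = (b2[0] - d2 * b1[0], b2[1] - d2 * b1[1], b2[2] - d2 * b1[2])
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

def is_cis(omega_deg, tolerance=30.0):
    """True if omega is near 0 degrees. The tolerance is an assumed value."""
    return abs(omega_deg) <= tolerance

# Idealised planar arrangements (arbitrary units, not real bond lengths).
ca_i, c_i, n_next = (0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
omega_cis = dihedral(ca_i, c_i, n_next, (1.0, 1.0, 0.0))
omega_trans = dihedral(ca_i, c_i, n_next, (1.0, -1.0, 0.0))
```

In the cis arrangement both alpha carbons lie on the same side of the C-N bond, giving omega near 0 degrees; in the trans arrangement they lie on opposite sides, giving omega near 180 degrees.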
A striking preference was observed for main-chain dihedral angles of the
residues involved in cis peptide bonds. All residues N-terminal and most
residues C-terminal to a non-Pro cis peptide bond (except Gly) are located
in the beta-region of a phi/psi plot. Also, all of the few C-terminal
residues (except Gly) located in the alpha-region of the phi/psi plot
constitute the start of an alpha-helix in the respective structure.
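To make the phi/psi assignment concrete, the sketch below sorts a residue into a coarse region of the phi/psi plot. The rectangular boundaries and the function name are illustrative assumptions only; they are not the region definitions used in the study, and published definitions differ in detail.

```python
def phi_psi_region(phi, psi):
    """Assign a coarse phi/psi region from angles in degrees.

    The boundaries below are rough, assumed rectangles chosen for
    illustration, not the definitions used in the original work.
    """
    if not (-180.0 <= phi <= 180.0 and -180.0 <= psi <= 180.0):
        raise ValueError("phi and psi must lie in [-180, 180] degrees")
    # Assumed beta-region: extended backbone, psi wrapping around 180.
    if -180.0 <= phi <= -45.0 and (psi >= 90.0 or psi <= -150.0):
        return "beta"
    # Assumed alpha-region: right-handed helical backbone.
    if -160.0 <= phi <= -20.0 and -120.0 <= psi <= 50.0:
        return "alpha"
    return "other"

region_extended = phi_psi_region(-120.0, 130.0)
region_helical = phi_psi_region(-60.0, -45.0)
region_left_handed = phi_psi_region(60.0, 45.0)
```

With these assumed boundaries, a typical extended conformation falls in the beta-region and a typical helical conformation in the alpha-region, while positive-phi conformations, which are common for Gly, fall outside both.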
In the majority of cases, an intimate side-chain/side-chain interaction was
observed between the flanking residues, often involving aromatic
side-chains. Interestingly, most of the cases found occur in functionally
important regions such as close to the active site of proteins. It is
intriguing that many of the proteins containing non-proline cis peptide
bonds are carbohydrate-binding or processing proteins.
The occurrence of these unusual peptide bonds is significantly more
frequent in structures determined at high resolution than in structures
determined at medium and low resolution, suggesting that these bonds may be
more abundant than previously thought. On the basis of our experience with
the structure determination of coagulation factor XIII, we developed an
algorithm for the identification of possibly overlooked cis peptide bonds
that exploits the deviations of geometrical parameters from ideality. A few
likely candidates based on our algorithm have been identified and are
discussed. (C) 1999 Academic Press.