J.F. Doreleijers et al., Validation of nuclear magnetic resonance structures of proteins and nucleic acids: Hydrogen geometry and nomenclature, PROTEINS, 37(3), 1999, pp. 404-416
A statistical analysis is reported of 1,200 of the 1,404 nuclear magnetic
resonance (NMR)-derived protein and nucleic acid structures deposited in the
Protein Data Bank (PDB) before 1999. Excluded from this analysis were the
entries not yet fully validated by the PDB and the more than 100 entries
that contained <95% of the expected hydrogens. The aim was to assess the
geometry of the hydrogens in the remaining structures and to provide a check
on their nomenclature. Deviations in bond lengths, bond angles, improper
dihedral angles, and planarity with respect to estimated values were checked.
More than 100 entries showed anomalous protonation states for some of their
amino acids. Approximately 250,000 (1.7%) atom names differed from the
consensus PDB nomenclature. Most of the inconsistencies are due to swapped
prochiral labeling. Large deviations from the expected geometry exist for a
considerable number of entries, many of which are average structures. The
most common causes for these deviations seem to be poor minimization of
average structures and an improper balance between force-field constraints
for experimental and holonomic data. Some specific geometric outliers are
related to the refinement programs used. A number of recommendations for
biomolecular databases, modeling programs, and authors submitting
biomolecular structures are given. Proteins 1999;37:404-416. © 1999
Wiley-Liss, Inc.