A primary and a secondary neural network are applied to secondary structure
and structural class prediction for a database of 681 non-homologous
protein chains. A new method of decoding the outputs of the secondary structure
prediction network is used to produce an estimate of the probability of
finding each type of secondary structure at every position in the sequence.
In addition to providing a reliable estimate of the accuracy of the
predictions, this method gives a higher Q(3) (74.6%) than the commonly used
cutoff method. Use of these predictions in jury methods improves the Q(3)
to 74.8%, the best available at present. On a database of 126 proteins
commonly used for comparison of prediction methods, the jury predictions
are 76.6% accurate. An estimate of the overall Q(3) for a given sequence
is made by averaging the estimated accuracy of the prediction over all
residues in the sequence. As an example, the analysis is applied to the
target beta-cryptogein, which was a difficult target for ab initio
predictions in the CASP2 study; it shows that the prediction made with the
present method (62% of residues correct) is close to the expected accuracy
(66%) for this protein. The larger database and use of a new network
training protocol also
improve structural class prediction accuracy to 86%, relative to 80%
obtained previously. Secondary structure content is predicted with accuracy
comparable to that obtained with spectroscopic methods, such as vibrational
or electronic circular dichroism and Fourier transform infrared
spectroscopy.
(C) 1999 Wiley-Liss, Inc.
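
The relationship between per-residue probabilities, the expected Q(3), and
the observed Q(3) described in the abstract can be illustrated with a minimal
Python sketch. It rests on assumptions that the abstract does not state: a
three-state alphabet H/E/C, a prediction taken as the most probable state at
each residue, the winning probability used as the estimated accuracy at that
residue, and a jury formed by a plain average of member probabilities. All
function names and the toy numbers are hypothetical and are not the authors'
implementation or data.

```python
from typing import List, Sequence, Tuple

# Assumed three-state alphabet: helix (H), strand (E), coil (C).
STATES = "HEC"
Probs = Sequence[Tuple[float, float, float]]


def predict_states(probs: Probs) -> str:
    """Assumption: predict the most probable state at each residue."""
    return "".join(STATES[max(range(3), key=lambda k: p[k])] for p in probs)


def expected_q3(probs: Probs) -> float:
    """Average the estimated per-residue accuracy over the sequence.

    Assumption: the estimated accuracy at a residue is the probability
    of the state that was predicted there.
    """
    return 100.0 * sum(max(p) for p in probs) / len(probs)


def observed_q3(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    correct = sum(a == b for a, b in zip(predicted, observed))
    return 100.0 * correct / len(observed)


def jury_average(members: Sequence[Probs]) -> List[Tuple[float, ...]]:
    """Assumption: a jury is the unweighted mean of member probabilities."""
    n = len(members)
    length = len(members[0])
    return [
        tuple(sum(m[i][k] for m in members) / n for k in range(3))
        for i in range(length)
    ]


# Toy probabilities for a four-residue sequence (illustrative values only).
toy_probs = [(0.8, 0.1, 0.1), (0.6, 0.3, 0.1), (0.2, 0.5, 0.3), (0.1, 0.2, 0.7)]
toy_prediction = predict_states(toy_probs)
toy_expected = expected_q3(toy_probs)
toy_observed = observed_q3(toy_prediction, "HHCC")
```

In this toy case the prediction is correct at three of four residues, so the
observed Q(3) is 75%, while the expected Q(3) is the mean of the winning
probabilities, about 65%. The comparison of 62% observed with 66% expected
for beta-cryptogein is the same kind of check.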