We have developed an algorithm to analyze the circular dichroism (CD)
spectra of proteins for secondary structure. Its hallmark is tremendous
flexibility in creating the basis set, and it combines the ideas of many
previous workers.
We also present a new basis set containing the CD spectra of 22 proteins
with secondary structures from high-quality X-ray diffraction data. High
flexibility is obtained by doing the analysis with a variable selection
basis set of only eight proteins. Many variable selection basis sets fail
to give
a good analysis, but good analyses can be selected without any a priori
knowledge by using the following criteria: (1) the sum of secondary
structure fractions should be close to 1.0, (2) no fraction of secondary
structure should be less than -0.03, (3) the reconstructed CD spectrum
should fit the original CD spectrum with only a small error, and (4) the
fraction of alpha-helix should
be similar to that obtained using all the proteins in the basis set. For
the proteins in the basis set, this algorithm gives a root mean square
error in the predicted secondary structure, compared with the X-ray
structure, of 3.3% for alpha-helix, 2.6% for 3(10)-helix, 4.2% for
beta-strand, 4.2% for beta-turn, 2.7% for poly(L-proline) II type
3(1)-helix, and 5.1% for other structures. (C) 1999 Wiley-Liss, Inc.
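
The four selection criteria listed above can be illustrated with a minimal Python sketch. It is not the published implementation. The `Candidate` container, the function names, and the tolerances `SUM_TOL`, `MAX_FIT_RMSD`, and `HELIX_TOL` are hypothetical, since the abstract only says "close to", "small", and "similar". Only the lower bound of -0.03 on any fraction is taken from the text.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    """Result of one analysis with one variable-selection basis set (hypothetical)."""
    fractions: Dict[str, float]  # secondary-structure fractions, keyed by class name
    fit_rmsd: float              # RMS difference, reconstructed vs. measured CD spectrum


# Only MIN_FRACTION comes from the abstract; the other values are assumed
# placeholders chosen for illustration.
MIN_FRACTION = -0.03   # criterion (2), from the text
SUM_TOL = 0.05         # criterion (1), assumed meaning of "close to 1.0"
MAX_FIT_RMSD = 0.25    # criterion (3), assumed meaning of "small error"
HELIX_TOL = 0.05       # criterion (4), assumed meaning of "similar"


def is_acceptable(c: Candidate, helix_all_proteins: float) -> bool:
    """Apply criteria (1)-(4) to a single candidate analysis."""
    total = sum(c.fractions.values())
    return (
        abs(total - 1.0) <= SUM_TOL                       # (1) fractions sum to ~1.0
        and min(c.fractions.values()) >= MIN_FRACTION     # (2) no fraction below -0.03
        and c.fit_rmsd <= MAX_FIT_RMSD                    # (3) spectrum is reproduced
        and abs(c.fractions["alpha"] - helix_all_proteins) <= HELIX_TOL  # (4)
    )


def select(candidates: List[Candidate], helix_all_proteins: float) -> List[Candidate]:
    """Keep only the candidate analyses that satisfy all four criteria."""
    return [c for c in candidates if is_acceptable(c, helix_all_proteins)]
```

Here `helix_all_proteins` stands for the alpha-helix fraction obtained when all proteins in the basis set are used, and each `Candidate` would come from one eight-protein basis set. How the accepted analyses are then combined is not stated in the abstract, so the sketch stops at the filtering step.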