Motivation: A method for recognizing the three-dimensional fold of a protein
from its amino acid sequence, based on a combination of hidden Markov models
(HMMs) and secondary structure prediction, was recently developed for
proteins in the Mainly-Alpha structural class. Here, this methodology is
extended to Mainly-Beta and Alpha-Beta class proteins. Compared to other
fold recognition methods based on HMMs, this approach is novel in that only
secondary structure information is used. Each HMM is trained from known
secondary structure sequences of proteins having a similar fold. Secondary
structure prediction is performed for the amino acid sequence of a query
protein. The predicted fold of a query protein is the fold described by the
model that best fits the predicted secondary structure sequence.
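The model-selection step described above can be illustrated with a minimal
sketch. It assumes a three-letter secondary structure alphabet (H, E, C),
scores a predicted secondary structure string against each fold model with
the standard forward algorithm, and returns the best-scoring fold. The two
models, their names and all their parameters are hypothetical toy values set
by hand for illustration. The topology, training procedure and scoring
actually used by the method are not specified here and may differ.

```python
import math

# Assumed alphabet: H = helix, E = strand, C = coil.
ALPHABET = "HEC"


def _log(p):
    """log(p) that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(model, seq):
    """Forward algorithm in log space: log P(seq | model) for a discrete HMM."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    start, trans, emit = model["start"], model["trans"], model["emit"]
    n = len(start)
    # Initialisation: start in state i and emit the first symbol.
    alpha = [_log(start[i]) + _log(emit[i][seq[0]]) for i in range(n)]
    # Recursion: sum over all predecessor states, then emit the next symbol.
    for sym in seq[1:]:
        alpha = [
            log_sum_exp([alpha[i] + _log(trans[i][j]) for i in range(n)])
            + _log(emit[j][sym])
            for j in range(n)
        ]
    # Termination: sum over all possible final states.
    return log_sum_exp(alpha)


def predict_fold(models, seq):
    """Return the name of the model under which seq is most likely."""
    return max(models, key=lambda name: log_likelihood(models[name], seq))


# Hypothetical two-state toy models (state 0: structured segment, state 1: coil).
_TRANS = [[0.9, 0.1], [0.2, 0.8]]
_COIL = {"H": 0.1, "E": 0.1, "C": 0.8}
MODELS = {
    "helical_toy": {
        "start": [0.5, 0.5],
        "trans": _TRANS,
        "emit": [{"H": 0.9, "E": 0.05, "C": 0.05}, _COIL],
    },
    "strand_toy": {
        "start": [0.5, 0.5],
        "trans": _TRANS,
        "emit": [{"H": 0.05, "E": 0.9, "C": 0.05}, _COIL],
    },
}

if __name__ == "__main__":
    predicted = "CCHHHHHHHHCCCHHHHHHHCC"  # made-up predicted secondary structure
    print(predict_fold(MODELS, predicted))
```

In a sketch like this, segment lengths enter only through the self-transition
probabilities, so a model meant to capture the length and arrangement of
helices, strands and coils would need more states than the two used here.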
Results: After model cross-validation, the success rate on 44 test proteins
covering the three structural classes was found to be 59%. On seven fold
predictions performed prior to the publication of the experimental
structures, the success rate was 71%. In conclusion, this approach manages
to capture important information about the fold of a protein embedded in the
length and arrangement of the predicted helices, strands and coils along the
polypeptide chain. When a more extensive library of HMMs representing the
universe of known structural families is available (work in progress), the
program will allow rapid screening of genomic databases and sequence
annotation when fold similarity is not detectable from the amino acid
sequence.