A strategy is presented for protein fold recognition from secondary
structure assignments (alpha-helix and beta-strand). The method can
detect similarities between protein folds in the absence of sequence
similarity. Secondary structure mapping first identifies all possible
matches (maps) between a query string of secondary structures and the
secondary structures of protein domains of known three-dimensional
structure. The maps are then passed through a series of structural
filters to remove those that do not obey simple rules of protein
structure. The surviving maps are ranked by scores from the alignment
of predicted and experimental accessibilities. Searches made with
secondary structure assignments for a test set of 11 fold-families put
the correct sequence-dissimilar fold in the first rank 8/11 times. With
cross-validated predictions of secondary structure this drops to 4/11,
which compares favourably with the widely used THREADER program (1/11).
The structural class is correctly predicted 10/11 times by the method,
in contrast to 5/11 for THREADER. The new technique obtains comparable
accuracy in the alignment of amino acid residues and secondary
structure elements. Searches are also performed with published
secondary structure predictions for the von Willebrand factor type A
domain, the proteasome 20S alpha subunit and the phosphotyrosine
interaction domain. These searches demonstrate how the method can find
the correct fold for a protein from a carefully constructed secondary
structure prediction, multiple sequence alignment and distance
restraints. Scans with experimentally determined secondary structures
and accessibility recognise the correct fold with high alignment
accuracy (86% on secondary structures). This suggests that the accuracy
of mapping will improve alongside any improvements in the prediction of
secondary structure or accessibility. Application to NMR structure
determination is also discussed. (C) 1996 Academic Press Limited
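
The mapping and filtering steps described above can be illustrated with a minimal Python sketch. It is not the published implementation. It assumes that a map assigns each query element to a target element of the same type ('H' for alpha-helix, 'E' for beta-strand), that element order is preserved, and that target elements may be skipped. The names `enumerate_maps`, `max_skipped` and `filter_maps` are hypothetical, as is the single filter shown (a limit on consecutive skipped target elements), which merely stands in for the structural filters. Ranking by accessibility alignment is not shown.

```python
from itertools import combinations


def enumerate_maps(query, target):
    """Yield every order-preserving map of query elements onto target elements.

    A map is a tuple of strictly increasing target indices, one per query
    element, where each mapped pair has the same type ('H' or 'E').
    Assumption: target elements may be skipped, query elements may not.
    """
    for idx in combinations(range(len(target)), len(query)):
        if all(query[i] == target[j] for i, j in enumerate(idx)):
            yield idx


def max_skipped(idx):
    """Largest number of target elements skipped between two mapped neighbours."""
    return max((b - a - 1 for a, b in zip(idx, idx[1:])), default=0)


def filter_maps(maps, max_gap=1):
    """Hypothetical filter: drop maps that skip more than max_gap elements in a row."""
    return [m for m in maps if max_skipped(m) <= max_gap]


query = "EHE"     # strand, helix, strand
target = "EHEHE"  # secondary structure string of a domain of known structure

maps = list(enumerate_maps(query, target))
kept = filter_maps(maps, max_gap=1)

print(maps)  # [(0, 1, 2), (0, 1, 4), (0, 3, 4), (2, 3, 4)]
print(kept)  # [(0, 1, 2), (2, 3, 4)]
```

Exhaustive enumeration grows combinatorially with the number of elements, which is one reason filters that discard implausible maps early are useful before any scoring is attempted.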