We present a secondary structure prediction method based on finding similarities between segments of the target sequence and segments contained in a database of proteins with known structures. The similarity definition is optimized using a genetic algorithm and is based on a 21 x 40 similarity matrix that compares the target sequence with the sequence and burial status of the proteins in the database. The three-state secondary structure prediction accuracy reaches 72.4% on a nonhomologous (maximum sequence identity <25%) data set derived from the PDB and is reproduced on two independent testing sets: the set of CASP2 prediction targets and a group of newly solved PDB structures. The prediction method was developed with simplicity and an open architecture in mind, allowing easy extension to other types of predictions and to the analysis of the contributions to local structure formation. For instance, the design of the prediction procedure allows us to trace back the database segments that contributed to a prediction. It can be shown that those segments came from various structural classes and that even complete exclusion of related folds from the database does not result in a significant decrease in prediction accuracy.
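The segment-matching idea described above can be illustrated with a minimal Python sketch. It is not the published implementation. Several details are assumptions made only for illustration: the 21 rows are read as the 20 amino acids plus an "unknown" symbol, the 40 columns as 20 amino acids times a binary buried/exposed flag, the window length is 5, and the prediction is a simple vote among the best-scoring database segments. The matrix values are random placeholders (with matches set to 1.0), whereas the method itself uses a matrix optimized by a genetic algorithm. All names (`toy_matrix`, `segment_score`, `predict`) are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumption: 21 rows = 20 residue types plus an "unknown" symbol X.
TARGET_ALPHABET = AMINO_ACIDS + "X"
ROW = {aa: i for i, aa in enumerate(TARGET_ALPHABET)}
# Assumption: 40 columns = 20 residue types x binary burial flag (0 or 1).
COL = {(aa, b): 2 * i + b for i, aa in enumerate(AMINO_ACIDS) for b in (0, 1)}


def toy_matrix(seed=0):
    """Placeholder 21 x 40 matrix; the real one would be optimized, not random."""
    rng = random.Random(seed)
    matrix = [[rng.uniform(-1.0, 0.0) for _ in range(40)] for _ in range(21)]
    for aa in AMINO_ACIDS:
        for b in (0, 1):
            matrix[ROW[aa]][COL[(aa, b)]] = 1.0  # identical residues score highest
    return matrix


def segment_score(matrix, target_seg, db_seq, db_burial, start):
    """Sum matrix entries over aligned positions of two equal-length segments."""
    return sum(
        matrix[ROW.get(t, ROW["X"])][COL[(db_seq[start + k], db_burial[start + k])]]
        for k, t in enumerate(target_seg)
    )


def predict(matrix, target, database, window=5, top_n=3):
    """Predict H/E/C per residue and record which database segments were used."""
    half = window // 2
    states, sources = [], []
    for center in range(len(target)):
        lo, hi = center - half, center + half + 1
        if lo < 0 or hi > len(target):
            # Assumption: termini without a full window default to coil.
            states.append("C")
            sources.append([])
            continue
        seg = target[lo:hi]
        hits = []
        for name, (seq, burial, _ss) in database.items():
            for start in range(len(seq) - window + 1):
                hits.append((segment_score(matrix, seg, seq, burial, start), name, start))
        hits.sort(reverse=True)
        best = hits[:top_n]
        votes = {"H": 0, "E": 0, "C": 0}
        for _score, name, start in best:
            votes[database[name][2][start + half]] += 1
        states.append(max("HEC", key=lambda s: votes[s]))
        # Keeping (protein, offset) pairs makes each prediction traceable.
        sources.append([(name, start) for _score, name, start in best])
    return "".join(states), sources


# Toy database: name -> (sequence, burial flags, known secondary structure).
database = {
    "helix": ("AEELLKKAEELLKK", [1, 0, 0, 1, 1, 0, 0] * 2, "H" * 14),
    "strand": ("VTVTVYVTVTVY", [1, 0] * 6, "E" * 12),
}
pred, sources = predict(toy_matrix(), "VTVTVYV", database, top_n=1)
print(pred)
print(sources[3])
```

Returning the contributing `(protein, offset)` pairs alongside the predicted states mirrors the traceability property mentioned in the text: one can inspect which database entries, and hence which structural classes, supported each residue's prediction.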