We examine how effectively simple potential functions previously developed
can identify compatibilities between sequences and structures of proteins
for database searches. The potential function consists of pairwise contact
energies, repulsive packing potentials that penalize overly dense
arrangements of residues, and short-range potentials for secondary
structures, all of which were
estimated from statistical preferences observed in known protein
structures. Each potential energy term was modified to represent
compatibilities between sequences and structures for globular proteins.
Pairwise contact interactions in a sequence-structure alignment are
evaluated in a mean-field approximation on the basis of the probabilities
that site pairs are aligned. Gap penalties are assumed to be proportional
to the number of contacts at each residue position, and as a result gaps
are placed more frequently on protein surfaces than in cores. In addition
to minimum energy alignments, we use probability alignments made by
successively aligning site pairs in order
of their pairwise alignment probabilities. The results show that the
present energy function and alignment method can effectively detect both
folds compatible with
a given sequence and, inversely, sequences compatible with a given fold,
and yield mostly similar alignments for these two types of
sequence-structure pairs. Probability alignments consisting of the most
reliable site pairs
only can yield extremely small root mean square deviations, whereas
including less reliable pairs increases the deviations. Secondary
structure potentials are also observed to be usefully complementary,
yielding improved alignments with this method. Remarkably, this method
detects some individual sequence-structure pairs that have only 5-20%
sequence identity.
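
As a rough illustration of two ideas summarized above, the following Python sketch shows a gap penalty that scales with the number of contacts at a residue position, and a probability alignment built by accepting site pairs in decreasing order of alignment probability. It is a minimal sketch under stated assumptions, not the procedure used in this work: the function names, the `scale` and `threshold` parameters, the toy probability matrix, and the rule that an accepted pair must preserve the sequential order of previously accepted pairs are all hypothetical choices made for illustration.

```python
def contact_gap_penalty(n_contacts, scale=1.0):
    """Gap penalty proportional to the contact count at a residue position.

    `scale` is a hypothetical proportionality constant.
    """
    return scale * n_contacts


def probability_alignment(prob, threshold=0.0):
    """Greedily align site pairs in decreasing order of probability.

    prob[i][j] is an assumed probability that sequence site i is aligned
    with structure site j. A candidate pair is accepted only if it keeps
    the sequential order of all pairs accepted so far (an assumption).
    """
    candidates = [
        (p, i, j)
        for i, row in enumerate(prob)
        for j, p in enumerate(row)
        if p >= threshold
    ]
    # Highest probability first; ties broken by indices for determinism.
    candidates.sort(key=lambda t: (-t[0], t[1], t[2]))

    accepted = []
    for _, i, j in candidates:
        consistent = all(
            (i < a and j < b) or (i > a and j > b) for a, b in accepted
        )
        if consistent:
            accepted.append((i, j))
    return sorted(accepted)


# Toy probability matrix (illustrative values only).
toy_prob = [
    [0.9, 0.1, 0.0],
    [0.0, 0.4, 0.1],
    [0.0, 0.2, 0.8],
]

reliable_only = probability_alignment(toy_prob, threshold=0.5)
all_pairs = probability_alignment(toy_prob, threshold=0.0)
```

With the toy matrix, `reliable_only` contains only the two high-probability pairs, while lowering the threshold admits the less reliable pair (1, 1) as well. Likewise, `contact_gap_penalty` returns a larger penalty for a buried position with many contacts than for a surface position with few, which is why gaps tend to fall on the surface.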