Using a large database of protein structure-structure alignments, we test a
new method for distinguishing homologous and "analogous" structural
neighbors. The homologous neighbors included in the test set show no
detectable sequence similarity, but they may be well superimposed and show
functional similarity or other evidence of evolutionary relationship.
Analogous neighbors also show no sequence similarity and may be well
superimposed, but they have different functions, and their structural
similarity may be the result of convergent evolution. Confirming results of
other analyses, we find that remote homologs and analogs are not well
distinguished by measures of pairwise structural similarity, including the
percentage of identical residues and root-mean-square (RMS) superposition
residual. We show, however, that structure-structure alignments of analogous
neighbors rarely superimpose the particular substructure that is shared
among homologous neighbors. We call this characteristic substructure the
homologous core structure (HCS), and we show that a cross-validated test for
presence of the HCS correctly identifies 75% of remote homologs with a
false-positive rate of 16% among analogs, significantly better than
discrimination by RMS or other measures of pairwise similarity. The HCS
describes conservation of spatial structure within a protein family in much
the way that a sequence motif describes sequence conservation. We suggest
that it may be used in the same way, to identify homologous neighbors at
greater evolutionary distance than is possible by pairwise comparison.
Proteins 1999;35:70-79. Published 1999 Wiley-Liss, Inc.
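
As an illustration only, the idea of testing an alignment for presence of the HCS can be sketched in Python. The abstract does not specify how the test is computed, so everything here is an assumption: the HCS is represented as a set of residue positions, an alignment as the positions it superimposes, and the names `hcs_coverage`, `has_hcs`, `rates`, and the 0.8 coverage threshold are hypothetical and not taken from the paper.

```python
from typing import Iterable, List, Set, Tuple


def hcs_coverage(hcs_positions: Set[int], aligned_positions: Iterable[int]) -> float:
    """Fraction of HCS positions that the alignment superimposes."""
    if not hcs_positions:
        raise ValueError("HCS must contain at least one position")
    return len(hcs_positions & set(aligned_positions)) / len(hcs_positions)


def has_hcs(hcs_positions: Set[int], aligned_positions: Iterable[int],
            min_coverage: float = 0.8) -> bool:
    """Call a neighbor 'homologous' if enough of the HCS is superimposed.

    The 0.8 threshold is an arbitrary illustrative choice.
    """
    return hcs_coverage(hcs_positions, aligned_positions) >= min_coverage


def rates(calls: List[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Return (sensitivity, false-positive rate).

    Each item in `calls` is (is_true_homolog, predicted_homolog).
    """
    tp = sum(1 for truth, pred in calls if truth and pred)
    fn = sum(1 for truth, pred in calls if truth and not pred)
    fp = sum(1 for truth, pred in calls if not truth and pred)
    tn = sum(1 for truth, pred in calls if not truth and not pred)
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one homolog and one analog")
    return tp / (tp + fn), fp / (fp + tn)


# Toy data (made up): an HCS spanning positions 10..19 of a family member.
hcs = set(range(10, 20))
homolog_alignment = range(8, 31)   # superimposes all of the HCS
analog_alignment = range(15, 41)   # superimposes only half of the HCS

calls = [
    (True, has_hcs(hcs, homolog_alignment)),
    (False, has_hcs(hcs, analog_alignment)),
]
sensitivity, false_positive_rate = rates(calls)
```

In this toy setup the sensitivity and false-positive rate play the roles of the 75% and 16% figures quoted above; the reported values come from a cross-validated test on the alignment database, which this sketch does not reproduce.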