The positions of a given fold always occupied by strong hydrophobic amino
acids (V, I, L, F, M, Y, W), which we call "topohydrophobic positions", were
detected and their properties demonstrated within 153 non-redundant
families of homologous domains, through 3D structural alignments. Sets of
divergent sequences possessing at least four to five members appear to be as
informative as larger sets, provided that their mean pairwise sequence identity
is low. Amino acids in topohydrophobic positions exhibit several interesting
features: they are much more buried than their equivalents in
non-topohydrophobic positions, their side chains are far less dispersed, and
they often constitute a lattice of close contacts in the inner core of globular
domains. In most cases, each regular secondary structure possesses one to three
topohydrophobic positions, which cluster in the domain core. Moreover, using
sensitive alignment processes such as hydrophobic cluster analysis (HCA), it is
possible to identify topohydrophobic positions from only a small set of
divergent sequences. Amino acids in topohydrophobic positions, which can be
identified directly from sequences, constitute key markers of protein
folds and define long-range structural constraints which, together with
secondary structure predictions, limit the number of possible conformations for
a given fold.
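
As an illustration of the definition above, the following minimal Python sketch
marks the columns of an already-computed multiple alignment in which every
sequence carries one of V, I, L, F, M, Y or W, and reports the mean pairwise
sequence identity of the set. It is not the procedure used in this work: the
function names, the toy alignment, and the identity convention (identical
residues divided by columns where neither sequence has a gap) are assumptions
made for the example, and the alignment itself is taken as given.

```python
from itertools import combinations

# Strong hydrophobic residues as listed in the text.
STRONG_HYDROPHOBIC = set("VILFMYW")


def topohydrophobic_positions(alignment):
    """Return 0-based columns where every sequence has V, I, L, F, M, Y or W.

    `alignment` is a list of equal-length aligned sequences ('-' marks a gap).
    A gap in a column disqualifies that column.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [
        col
        for col in range(length)
        if all(seq[col].upper() in STRONG_HYDROPHOBIC for seq in alignment)
    ]


def pairwise_identity(a, b):
    """Fraction of identical residues over columns where neither has a gap.

    This is one common convention, assumed here for illustration only.
    """
    aligned = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not aligned:
        return 0.0
    return sum(x == y for x, y in aligned) / len(aligned)


def mean_pairwise_identity(alignment):
    """Average pairwise_identity over all unordered pairs of sequences."""
    pairs = list(combinations(alignment, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy alignment of four divergent sequences (not real data).
toy = [
    "MKVLA-GF",
    "AKILSDGW",
    "MRLLT-AY",
    "SKVFAEGF",
]

print(topohydrophobic_positions(toy))  # [2, 3, 7]
print(round(mean_pairwise_identity(toy), 2))
```

In this toy set, columns 2, 3 and 7 are always occupied by a strong
hydrophobic residue even though the residue itself varies, which is why a
low mean pairwise identity makes such columns more informative.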