A method is described to objectively identify hydrophobic clusters in
proteins of known structure. Clusters are found by examining a protein
for compact groupings of side chains. Compact clusters contain seven
or more residues, have an average of 65% hydrophobic residues, and
usually occur in protein interiors. Although smaller clusters contain
only side-chain moieties, larger clusters enclose significant portions
of the peptide backbone in regular secondary structure. These clusters
agree well with hydrophobic regions assigned by more intuitive methods,
and many larger clusters correlate with protein domains. These results
are in striking contrast with the clustering algorithm of J. Heringa
and P. Argos (1991, J Mol Biol 220: 151-171). That method finds that
clusters located on a protein's surface are not especially hydrophobic
and average only 3-4 residues in size. Hydrophobic clusters can be
correlated with experimental evidence on early folding intermediates.
This correlation is optimized when clusters with fewer than nine
hydrophobic residues are removed from the data set. This suggests that
hydrophobic clusters are important in the folding process only if they
have enough hydrophobic residues.
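
The size and composition criteria mentioned above (seven or more
residues per compact cluster, and removal of clusters with fewer than
nine hydrophobic residues) can be illustrated with a minimal sketch.
It assumes clusters have already been identified and are given as
strings of one-letter residue codes. The HYDROPHOBIC set and all
function names are hypothetical: the abstract does not state which
residues are counted as hydrophobic, and the cluster-finding step
itself is not shown.

```python
# Assumed hydrophobic residue set (one-letter codes). This is an
# illustrative choice, not the definition used in the paper.
HYDROPHOBIC = frozenset("AVLIMFWC")


def count_hydrophobic(cluster):
    """Number of residues in the cluster that fall in HYDROPHOBIC."""
    return sum(1 for res in cluster if res in HYDROPHOBIC)


def hydrophobic_fraction(cluster):
    """Fraction of hydrophobic residues; 0.0 for an empty cluster."""
    return count_hydrophobic(cluster) / len(cluster) if cluster else 0.0


def filter_clusters(clusters, min_size=7, min_hydrophobic=9):
    """Keep clusters meeting both thresholds quoted in the abstract:
    at least min_size residues and at least min_hydrophobic
    hydrophobic residues."""
    return [
        c for c in clusters
        if len(c) >= min_size and count_hydrophobic(c) >= min_hydrophobic
    ]


if __name__ == "__main__":
    # Made-up example clusters, not data from the paper.
    clusters = ["AVLIMFWCAVKD", "AVLKDESTG", "AVL"]
    for c in filter_clusters(clusters):
        print(c, round(hydrophobic_fraction(c), 2))
```

Both thresholds are exposed as parameters so that the cutoff for
hydrophobic residues can be varied, as one would when checking how the
correlation with folding intermediates depends on it.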