T. Ojasoo and Jc. Dore, TAXONOMY OF NUCLEAR RECEPTORS AND SERPINS BY MULTIVARIATE-ANALYSIS OF AMINO-ACID-COMPOSITION, Journal of steroid biochemistry and molecular biology, 58(2), 1996, pp. 167-181
The global amino-acid composition of a protein, although a cruder
variable than sequence, is nevertheless informative and has been
correlated with protein structural class. In the present study, we have
applied complementary multivariate methods based on chi(2)-metrics
(correspondence factor analysis (CFA), minimum spanning tree (MST),
ascending hierarchical classification (AHC)) to the analysis of the
amino-acid frequency patterns of the C-terminal domain of 39 members of
the nuclear receptor superfamily. The correlations we observed among
receptors by this simple approach were, with few exceptions, in line
with published phylogenetic dendrograms derived by sequence alignment.
Further multivariate analyses were performed on the receptor population
combined with 26 serine protease inhibitors (SERPINS) in view of the
analogies detected between these superfamilies by hydrophobic cluster
analysis (HCA), which were at the origin of the choice of alpha
1-antitrypsin as a 3-dimensional (3D) model for the receptor
hormone-binding domain. Both the MST and AHC identified two distinct
protein populations which in the principal phi(1) phi(2) CFA plot showed
virtually no overlap, thus suggesting that receptors and SERPINS have
different overall folding patterns, although the lower-order phi(3)
phi(4) plot did reveal some similarities, essentially in the use of
hydrophobic amino acids, that might account for analogies in HCA
patterns. Receptors had a preference for those amino acids that are more
frequent in alpha-helices and SERPINS for those in beta-strands and also
tended to use different amino acids in turns. We therefore propose that
multivariate analysis of amino-acid composition may prove helpful in
identifying proteins for subsequent HCA. Copyright (C) 1996 Elsevier
Science Ltd.
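
As a rough illustration of the kind of analysis the abstract describes, the sketch below counts amino-acid frequencies for a few sequences, computes chi(2) distances between their composition profiles (the metric underlying correspondence analysis), and links the sequences with a minimum spanning tree. It is a minimal sketch under stated assumptions, not the authors' procedure: the three sequences are invented toy strings rather than real receptor or SERPIN domains, the function names are hypothetical, and Prim's algorithm is used here only as one common way to build an MST.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return the count of each standard amino acid in a sequence."""
    counts = Counter(seq.upper())
    return [counts.get(a, 0) for a in AMINO_ACIDS]


def chi2_distances(table):
    """Pairwise chi(2) distances between the row profiles of a count table.

    Each row is converted to relative frequencies, and each squared
    difference is weighted by the inverse of the column's overall mass.
    """
    total = sum(sum(row) for row in table)
    n_cols = len(table[0])
    col_mass = [sum(row[j] for row in table) / total for j in range(n_cols)]
    profiles = [[x / sum(row) for x in row] for row in table]
    n = len(table)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(i + 1, n):
            d2 = sum(
                (profiles[i][j] - profiles[k][j]) ** 2 / col_mass[j]
                for j in range(n_cols)
                if col_mass[j] > 0  # skip residues absent from every sequence
            )
            dist[i][k] = dist[k][i] = d2 ** 0.5
    return dist


def minimum_spanning_tree(dist):
    """Prim's algorithm on a full distance matrix; returns (i, k, d) edges."""
    n = len(dist)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        i, k = min(
            ((i, k) for i in in_tree for k in range(n) if k not in in_tree),
            key=lambda e: dist[e[0]][e[1]],
        )
        edges.append((i, k, dist[i][k]))
        in_tree.add(k)
    return edges


# Hypothetical toy sequences (not real proteins). toy_b is toy_a reversed,
# so the two share exactly the same composition.
toy_a = "AELLKKLAEELLKRAQEM"
sequences = {"toy_a": toy_a, "toy_b": toy_a[::-1], "toy_c": "VTVYVSGTIVKVTYNGSV"}

names = list(sequences)
table = [composition(s) for s in sequences.values()]
dist = chi2_distances(table)
for i, k, d in minimum_spanning_tree(dist):
    print(f"{names[i]} -- {names[k]}: {d:.3f}")
```

Because composition ignores residue order, toy_a and toy_b fall at distance zero, which illustrates why the abstract calls composition a cruder variable than sequence.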