A GLOBAL TAXONOMY OF LOOPS IN GLOBULAR-PROTEINS

Citation
Jm. Kwasigroch et al., A GLOBAL TAXONOMY OF LOOPS IN GLOBULAR-PROTEINS, Journal of Molecular Biology, 259(4), 1996, pp. 855-872
Number of citations
60
Subject Categories
Biology
ISSN journal
00222836
Volume
259
Issue
4
Year of publication
1996
Pages
855 - 872
Database
ISI
SICI code
0022-2836(1996)259:4<855:AGTOLI>2.0.ZU;2-8
Abstract
A bank of loops from three to eight amino acid residues long has been constituted. On the basis of statistical analysis of occurrences of conformations and residue, loops could be divided into two parts: the side residues directly bonded to the secondary structure flanking element, and the inner part. The conformations of the side residues are correlated to the nature of their neighboring flanks, while the inner residues adopt conformations uncorrelated from one residue to the next; thus they are unrelated to the flanks. Two zones in the Ramachandran plot are important: alpha(L) and beta(p). In particular, the high occurrence of alpha(L), mainly occupied by glycine residues, is necessary to induce flexibility and thus allow loops to comply with the geometrical constraints of the flanks. An algorithm of clustering has been used to aggregate loops of the same length within families of similar 3D structures. At each position in each cluster, sequence and conformational signatures have been deduced if the occurrence of a residue (or a conformation) is higher than an equiprobable distribution over all clusters. The result is that some positions favor particular amino acids and conformations, which are typical of a cluster although not unique. This is an indication of a relation between structure and sequence in loops. A taxonomy is proposed that classifies the various clusters. It relies on two terms: the mean distance between the first and last C-alpha in one cluster and, perpendicular to this line, the distance to the center of gravity of the cluster. It is noteworthy that the differently populated clusters represented in such 2D plots can be separated. Thus, although the conformations of loops in globular proteins could cover a continuum, it has been possible to cluster them into a limited number of well populated families and superfamilies. This basic feature of protein architecture could be further exploited to better predict their geometry.
(C) 1996 Academic Press Limited
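
Illustrative note: the two taxonomy terms described in the abstract can be sketched geometrically for a single loop. The following is a minimal sketch and not the authors' implementation. It assumes that the center of gravity is the unweighted mean of the loop's C-alpha positions, and the function name `taxonomy_coordinates` is hypothetical. The abstract defines both terms per cluster, so averaging these per-loop values over the members of a cluster is a further assumption.

```python
import math


def taxonomy_coordinates(ca_coords):
    """Return (end_to_end, height) for one loop.

    ca_coords: sequence of (x, y, z) C-alpha positions, in chain order.
    end_to_end: distance between the first and last C-alpha.
    height: perpendicular distance from the first-to-last line to the
            centroid of the C-alpha positions (assumed definition of
            "center of gravity"; the abstract does not specify weights).
    """
    first, last = ca_coords[0], ca_coords[-1]
    axis = [b - a for a, b in zip(first, last)]
    end_to_end = math.sqrt(sum(c * c for c in axis))
    if end_to_end == 0.0:
        raise ValueError("first and last C-alpha coincide; line is undefined")

    # Unweighted centroid of all C-alpha atoms in the loop.
    n = len(ca_coords)
    centroid = [sum(p[i] for p in ca_coords) / n for i in range(3)]

    # Split the vector first -> centroid into a component along the
    # first-to-last line and a component perpendicular to it.
    offset = [c - a for a, c in zip(first, centroid)]
    along = sum(o * a for o, a in zip(offset, axis)) / end_to_end
    perp_sq = sum(o * o for o in offset) - along * along
    height = math.sqrt(max(perp_sq, 0.0))  # clamp tiny negative round-off
    return end_to_end, height


# Toy three-residue loop with made-up coordinates (not data from the paper).
loop = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0), (2.0, 0.0, 0.0)]
d, h = taxonomy_coordinates(loop)
```

Plotting the per-cluster averages of `d` against `h` would give a 2D representation of the kind the abstract refers to, under the assumptions stated above.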