A tracked approach for automated NMR assignments in proteins (TATAPRO)

Citation
Hs. Atreya et al., A tracked approach for automated NMR assignments in proteins (TATAPRO), J BIOM NMR, 17(2), 2000, pp. 125-136
Number of citations
36
Subject categories
Biochemistry & Biophysics
Journal title
JOURNAL OF BIOMOLECULAR NMR
ISSN journal
0925-2738
Volume
17
Issue
2
Year of publication
2000
Pages
125 - 136
Database
ISI
SICI code
0925-2738(200006)17:2<125:ATAFAN>2.0.ZU;2-C
Abstract
A novel automated approach for the sequence specific NMR assignments of H-1(N), C-13(alpha), C-13(beta), C-13'/H-1(alpha) and N-15 spins in proteins, using triple resonance experimental data, is presented. The algorithm, TATAPRO (Tracked AuTomated Assignments in Proteins), utilizes the protein primary sequence and peak lists from a set of triple resonance spectra which correlate H-1(N) and N-15 chemical shifts with those of C-13(alpha), C-13(beta) and C-13'/H-1(alpha). The information derived from such correlations is used to create a 'master_list' consisting of all possible sets of H-1(i)N, N-15(i), C-13(i)alpha, C-13(i)beta, C-13'(i)/H-1(i)alpha, C-13(i-1)alpha, C-13(i-1)beta and C-13(i-1)'/H-1(i-1)alpha chemical shifts. On the basis of an extensive statistical analysis of C-13(alpha) and C-13(beta) chemical shift data of proteins derived from the BioMagResBank (BMRB), it is shown that the 20 amino acid residues can be grouped into eight distinct categories, each of which is assigned a unique two-digit code. Such a code is used to tag individual sets of chemical shifts in the master_list and also to translate the protein primary sequence into an array called pps_array. The program then uses the master_list to search for neighbouring partners of a given amino acid residue along the polypeptide chain and sequentially assigns a maximum possible stretch of residues on either side. While doing so, each assigned residue is tracked in an array called assig_array, with the two-digit code assigned earlier. The assig_array is then mapped onto the pps_array for sequence specific resonance assignment. The program has been tested using experimental data on a calcium binding protein from Entamoeba histolytica (Eh-CaBP, 15 kDa) having substantial internal sequence homology and using published data on four other proteins in the molecular weight range of 18-42 kDa. In all the cases, nearly complete sequence specific resonance assignments (> 95%) are obtained. Furthermore, the reliability of the program has been tested by deleting sets of chemical shifts randomly from the master_list created for the test proteins.
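Illustrative note (not part of the published abstract): the mapping step described above, in which a tracked stretch of category codes (assig_array) is located on the coded primary sequence (pps_array), can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not list the eight residue categories, so the grouping and the two-digit codes in PLACEHOLDER_GROUPS are invented purely for illustration, and the function names build_pps_array and map_stretch are hypothetical.

```python
from typing import Dict, List

# PLACEHOLDER grouping: the abstract states that the 20 residues fall into
# eight categories with two-digit codes, but does not list them. These
# memberships and codes are invented for illustration only.
PLACEHOLDER_GROUPS: Dict[str, str] = {
    "10": "G",
    "20": "A",
    "30": "ST",
    "40": "VI",
    "50": "DNFLY",
    "60": "CW",
    "70": "EQKRMH",
    "80": "P",
}


def residue_codes(groups: Dict[str, str]) -> Dict[str, str]:
    """Invert {code: residues} into {one-letter residue: code}."""
    return {res: code for code, members in groups.items() for res in members}


def build_pps_array(sequence: str,
                    groups: Dict[str, str] = PLACEHOLDER_GROUPS) -> List[str]:
    """Translate a one-letter primary sequence into a list of category codes."""
    table = residue_codes(groups)
    return [table[res] for res in sequence.upper()]


def map_stretch(assig_array: List[str], pps_array: List[str]) -> List[int]:
    """Return every 0-based start index where the stretch of codes fits.

    A single returned index means the stretch maps uniquely onto the
    sequence; several indices mean the stretch is still ambiguous.
    """
    n = len(assig_array)
    return [
        start
        for start in range(len(pps_array) - n + 1)
        if pps_array[start:start + n] == assig_array
    ]


if __name__ == "__main__":
    pps = build_pps_array("GASTVAGP")
    print(pps)
    print(map_stretch(["30", "30", "40"], pps))  # one position: unique
    print(map_stretch(["20"], pps))              # two positions: ambiguous
```

In this toy example a short stretch can match several positions, while a longer one narrows to a single position. That is consistent with the abstract's emphasis on assigning the maximum possible stretch of residues before mapping.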