Simple knowledge-based descriptors to predict protein-ligand interactions. Methodology and validation

Citation
J.W.M. Nissink et al., Simple knowledge-based descriptors to predict protein-ligand interactions. Methodology and validation, J COMPUT A, 14(8), 2000, pp. 787-803
Number of citations
24
Subject categories
Chemistry & Analysis
Journal title
JOURNAL OF COMPUTER-AIDED MOLECULAR DESIGN
ISSN journal
0920654X
Volume
14
Issue
8
Year of publication
2000
Pages
787 - 803
Database
ISI
SICI code
0920-654X(200011)14:8<787:SKDTPP>2.0.ZU;2-S
Abstract
A new type of shape descriptor is proposed to describe the spatial orientation for non-covalent interactions. It is built from simple, anisotropic Gaussian contributions that are parameterised by 10 adjustable values. The descriptors have been used to fit propensity distributions derived from scatter data stored in the IsoStar database. This database holds composite pictures of possible interaction geometries between a common central group and various interacting moieties, as extracted from small-molecule crystal structures. These distributions can be related to probabilities for the occurrence of certain interaction geometries among different functional groups. A fitting procedure is described that generates the descriptors in a fully automated way. For this purpose, we apply a similarity index that is tailored to the problem, the Split Hodgkin Index. It accounts separately for the similarity in regions of high propensity and in regions of low propensity. Although dependent on the division into these two subregions, the index is robust and performs better than the regular Hodgkin index. The reliability and coverage of the fitted descriptors were assessed using SuperStar. SuperStar usually operates on the raw IsoStar data to calculate propensity distributions, e.g., for a binding site in a protein. For our purpose we modified the code to have it operate on our descriptors instead. This resulted in a substantial reduction in calculation time (factor of five to eight) compared to the original implementation. A validation procedure was performed on a set of 130 protein-ligand complexes, using four representative interacting probes to map the properties of the various binding sites: ammonium nitrogen, alcohol oxygen, carbonyl oxygen, and methyl carbon. The predicted 'hot spots' for the binding of these probes were compared to the actual arrangement of ligand atoms in experimentally determined protein-ligand complexes.
Results indicate that the version of SuperStar that operates on our descriptors is capable of predicting the above-mentioned atom types in ligands correctly, with success rates of 59% for all ligand atoms (regardless of their solvent accessibility) and 74% for the subset of solvent-inaccessible ones. If not only exact atom-type matches are counted, but also those that identify ligand atoms of similar physicochemical properties, the prediction rates rise to 75% and 89%. These rates are close to those obtained by the original SuperStar method (67% and 82%, respectively, for the prediction of exactly matching atom types, and 81% and 91% for the prediction of similar atom types).
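
Illustrative note on the similarity index: the abstract names the regular Hodgkin index and a Split Hodgkin Index but does not give the formula of the latter. The Python sketch below shows the standard Hodgkin index for two propensity maps sampled on the same grid, followed by one possible reading of a "split" variant, in which the index is evaluated separately on the high-propensity and low-propensity regions of the reference map and the two values are averaged. The function names, the threshold argument and the equal-weight averaging are assumptions made for illustration only; they are not taken from the paper and may differ from the authors' definition.

```python
import numpy as np


def hodgkin_index(a, b):
    """Standard Hodgkin similarity index: 2*sum(a*b) / (sum(a^2) + sum(b^2))."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    denom = np.dot(a, a) + np.dot(b, b)
    if denom == 0.0:
        return 1.0  # convention chosen here: two all-zero maps count as identical
    return 2.0 * np.dot(a, b) / denom


def split_hodgkin_index(reference, fitted, threshold):
    """Hypothetical split variant (assumption, not the published definition).

    Grid points are divided by comparing the reference map with `threshold`;
    the Hodgkin index is computed on each non-empty subregion and averaged.
    """
    reference = np.asarray(reference, dtype=float).ravel()
    fitted = np.asarray(fitted, dtype=float).ravel()
    if reference.shape != fitted.shape:
        raise ValueError("maps must be sampled on the same grid")
    high = reference >= threshold
    parts = [
        hodgkin_index(reference[mask], fitted[mask])
        for mask in (high, ~high)
        if mask.any()
    ]
    if not parts:
        raise ValueError("maps must contain at least one grid point")
    return sum(parts) / len(parts)


# Toy maps: the fit is exact in the low region and off in the high region.
reference_map = [0.1, 0.2, 3.0, 4.0]
fitted_map = [0.1, 0.2, 3.0, 2.0]
plain = hodgkin_index(reference_map, fitted_map)
split = split_hodgkin_index(reference_map, fitted_map, threshold=1.0)
```

In this toy case the plain index is dominated by the large values in the high-propensity region, whereas the split variant gives the low-propensity region equal weight. That is consistent with the abstract's remark that the two regions are treated separately, and with its caveat that the result depends on how the division is made.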