USING A GENETIC ALGORITHM TO SUGGEST COMBINATORIAL LIBRARIES

Citation
R.P. Sheridan and S.K. Kearsley, USING A GENETIC ALGORITHM TO SUGGEST COMBINATORIAL LIBRARIES, Journal of Chemical Information and Computer Sciences, 35(2), 1995, pp. 310-320
Number of citations
13
Subject Categories
Information Science & Library Science; Computer Application, Chemistry & Engineering; Computer Science Interdisciplinary Applications; Chemistry; Computer Science Information Systems
ISSN journal
0095-2338
Volume
35
Issue
2
Year of publication
1995
Pages
310 - 320
Database
ISI
SICI code
0095-2338(1995)35:2<310:UAGATS>2.0.ZU;2-I
Abstract
In combinatorial synthesis, molecules are assembled by linking chemically similar fragments. Since the number of available chemical fragments often greatly exceeds the number of distinct fragments that can be used in one synthetic experiment, choosing a subset of fragments becomes problematical. For example, only a few dozen distinct primary and secondary amines have ever been reported to have been used in constructing a library of peptoids (oligomers of N-substituted glycine), while there are several thousand suitable primary and secondary amines that are commercially available. If a combinatorial library is to be constructed with a particular biological activity in mind, computer-based structure-activity methods can be used to rationally select a subset of fragments. In principle one would computationally generate every possible molecule as a combination of fragments, score each molecule by the likelihood of its being active, and select those fragments that occur in high-scoring molecules. For many cases there are too many combinations to take this exhaustive approach, but genetic algorithms can be used to quickly find high-scoring molecules by sampling a small subset of the total combinatorial space. In this paper we demonstrate how a genetic algorithm is used to select a subset of amines for the construction of a tripeptoid library. We show three examples. In the first example, the scoring is based on the similarity of the tripeptoids to a specific tripeptoid target. Since the target itself can be generated in this example, we have an opportunity to experiment with the protocol of our genetic algorithm. In the second example, scoring is based on the similarity to two tetrapeptide CCK antagonists. In the third, scoring is done by a trend vector derived from activity data on ACE inhibitors. In all cases we show that the genetic algorithm can find, in a modest amount of computer time, high-scoring peptoids that resemble the targets.
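Illustrative sketch (not from the paper): the abstract does not specify the protocol of the genetic algorithm, so the Python below only illustrates the general idea under stated assumptions. A tripeptoid is encoded as a tuple of three amine indices, a genetic algorithm samples the combinatorial space, and the amines that occur most often in the high-scoring tripeptoids are proposed for the library. The names run_ga, select_fragments, and toy_score are hypothetical, as are the operators chosen (truncation selection, one-point crossover, random-reset mutation). The score is a toy stand-in that treats closeness of amine indices to an assumed target as "similarity"; a real application would substitute a structure-based similarity or a trend-vector score.

```python
import random
from collections import Counter


def toy_score(mol, target=(12, 345, 678)):
    """Toy stand-in for a similarity score: higher is better, 0 is a perfect match.

    Assumption: amine indices act as a one-dimensional descriptor.
    """
    return -sum(abs(a - t) for a, t in zip(mol, target))


def run_ga(n_amines, score, pop_size=60, generations=40,
           mutation_rate=0.1, seed=0):
    """Evolve tripeptoids, each a 3-tuple of amine indices in [0, n_amines).

    Returns the final population sorted from best to worst score.
    """
    rng = random.Random(seed)
    pop = [tuple(rng.randrange(n_amines) for _ in range(3))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[: pop_size // 2]       # truncation selection; best half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, 3)        # one-point crossover at position 1 or 2
            child = list(a[:cut] + b[cut:])
            for i in range(3):
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(n_amines)   # random-reset mutation
            children.append(tuple(child))
        pop = parents + children
    pop.sort(key=score, reverse=True)
    return pop


def select_fragments(population, top_n, n_fragments):
    """Pick the amines that occur most often in the top_n best molecules."""
    counts = Counter(idx for mol in population[:top_n] for idx in mol)
    return [idx for idx, _ in counts.most_common(n_fragments)]


if __name__ == "__main__":
    final = run_ga(n_amines=1000, score=toy_score)
    print("best tripeptoid:", final[0], "score:", toy_score(final[0]))
    print("suggested amines:", select_fragments(final, top_n=20, n_fragments=10))
```

With 1000 candidate amines there are 10^9 possible tripeptoids, while this run scores only a few thousand of them. That is the sense in which a genetic algorithm samples a small subset of the total combinatorial space instead of enumerating it.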