We present a novel algorithm, called Ftrees-FS, for similarity searching in
large chemistry spaces based on dynamic programming. Given a query compound,
the algorithm generates sets of compounds from a given chemistry space that
are similar to the query. The similarity search is based on the feature
tree similarity measure, which represents molecules as tree structures. This
descriptor allows combinatorial chemistry spaces to be handled as a whole
instead of looking at subsets of enumerated compounds. Within a few minutes
of computing time, the algorithm is able to find the most similar compound
in very large spaces as well as sets of compounds at an arbitrary similarity
level.
In addition, the diversity among the generated compounds can be controlled.
A set of 17 000 fragments of known drugs, generated by the RECAP procedure
from the World Drug Index, was used as the search chemistry space. These
fragments can be combined into more than 10^18 compounds of reasonable size.
For validation, known antagonists/inhibitors of several targets, including
dopamine D4, histamine H1, and COX2, were used as queries. Comparison of the
compounds created by Ftrees-FS to other known actives demonstrates the
ability of the method to jump between structurally unrelated molecule
classes.
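
To illustrate why dynamic programming makes it possible to search a fragment space without enumerating its products, the following is a minimal toy sketch. It is not Ftrees-FS: molecules are reduced to flat sequences of numeric feature values rather than feature trees, any fragment is assumed to link to any other, and the names `FRAGMENTS`, `position_similarity`, `best_product`, and `count_products` are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy fragment space: each fragment is a short sequence of
# positive numeric "feature" values. Real feature trees are trees, not lists.
FRAGMENTS: Dict[str, List[int]] = {
    "A": [1],
    "B": [2, 2],
    "C": [3, 1],
    "D": [2, 3, 1],
}


def position_similarity(a: int, b: int) -> float:
    """Toy similarity of two positive feature values, in (0, 1]."""
    return min(a, b) / max(a, b)


def fragment_score(query_part: List[int], fragment: List[int]) -> float:
    """Sum of position-wise similarities between equally long sequences."""
    return sum(position_similarity(q, f) for q, f in zip(query_part, fragment))


def best_product(query: List[int]) -> Tuple[float, List[str]]:
    """Find the fragment chain most similar to the query, without enumeration.

    best[i] holds the best (score, fragment names) for matching query[:i];
    each entry is built only from shorter, already solved prefixes.
    """
    n = len(query)
    best: List[Optional[Tuple[float, List[str]]]] = [None] * (n + 1)
    best[0] = (0.0, [])
    for i in range(1, n + 1):
        for name, frag in FRAGMENTS.items():
            k = len(frag)
            prev = best[i - k] if k <= i else None
            if prev is None:
                continue
            score = prev[0] + fragment_score(query[i - k:i], frag)
            if best[i] is None or score > best[i][0]:
                best[i] = (score, prev[1] + [name])
    if best[n] is None:
        raise ValueError("no fragment chain matches the query length")
    # Normalize so that a perfect match has similarity 1.0.
    return best[n][0] / n, best[n][1]


def count_products(length: int) -> int:
    """Count all fragment chains of the given total length (same recurrence)."""
    count = [0] * (length + 1)
    count[0] = 1
    for i in range(1, length + 1):
        count[i] = sum(
            count[i - len(f)] for f in FRAGMENTS.values() if len(f) <= i
        )
    return count[length]


if __name__ == "__main__":
    similarity, chain = best_product([2, 3, 1, 1])
    print(similarity, chain)
    print(count_products(4))
```

In this toy setting the number of products grows quickly with product length, while the search fills in only one table entry per query prefix and fragment. The actual algorithm operates on tree-shaped descriptors and fragment link compatibility, which this sketch deliberately leaves out.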