R. D. Clark and W. J. Langton, Balancing representativeness against diversity using optimizable K-dissimilarity and hierarchical clustering, Journal of Chemical Information and Computer Sciences, 38(6), 1998, pp. 1079-1086
Number of citations
19
Subject Categories
Computer Science, Interdisciplinary Applications; Computer Science, Information Systems; Chemistry
When assessing the pharmacological potential of large libraries of compounds, it is often useful to start by determining the biochemical activities of some subset thereof. This is so whether the compounds in question have in fact already been synthesized or exist solely as virtual libraries. A suitable subset for this task must be structurally diverse, so as to minimize redundant testing, but must also be representative, so that valuable subgroups do not get overlooked. These two needs are intrinsically in conflict, with gains in one necessarily coming at the expense of the other. Results obtained using optimizable K-dissimilarity selection and clustering are described and compared with those obtained using more traditional agglomerative hierarchical clustering techniques.