ITA
ENG

A FAST ALGORITHM FOR SELECTING SETS OF DISSIMILAR MOLECULES FROM LARGE CHEMICAL DATABASES

Authors

HOLLIDAY JD RANADE SS WILLETT P

Citation

Jd. Holliday et al., A FAST ALGORITHM FOR SELECTING SETS OF DISSIMILAR MOLECULES FROM LARGE CHEMICAL DATABASES, Quantitative structure-activity relationships, 14(6), 1995, pp. 501-506

Citations number

Categorie Soggetti

Pharmacology & Pharmacy

Journal title

Quantitative structure-activity relationships → ACNP

ISSN journal

09318771

Volume

Issue

Year of publication

1995

Pages

501 - 506

Database

ISI

SICI code

0931-8771(1995)14:6<501:AFAFSS>2.0.ZU;2-J

Abstract

Current algorithms for the selection of a set of n dissimilar molecule s from a dataset of N molecules have an expected time complexity of O( n(2)N). This paper describes an improved algorithm that has an expecte d time complexity of O(nN) and that will identify exactly the same set of molecules as the normal algorithm if the cosine coefficient is use d for the calculation of the inter-molecular (dis)similarities. The al gorithm is applicable to any type of representation that characterises a molecule by a set of attribute values and to any procedure that inv olves calculating a sum of inter-molecular similarities. It is also bo th more effective and more efficient than our implementation of a gene tic algorithm for the selection of maximally-dissimilar sets of molecu les.