We have recently developed a novel strategy for the rational design of
compounds. This 'in silico screening' approach is based on the design and
screening of virtual combinatorial libraries. Screening is performed using
defined rules derived from a comprehensive description of the active and
inactive molecules in a relevant learning set. This strategy allows
potential ligands to be developed without any knowledge of the 3D structure
of the target receptor. Key to the success of such methods is the quality
of the information being processed, in particular the diversity of the data
in the context of the molecular population of the libraries concerned.
Here, we review the problem of data diversity, its definition, and its
analysis using a new software tool named Diverser.