We describe an approach to isolate molecular scaffolds and R-groups from
known chemical compounds in order to generate scaffold and R-group databases
from two large compound collections, Optiverse(TM) and Maybridge(TM). The
distributions of molecular scaffolds and R-groups in the parent databases
were analysed and compared. We find that a limited number of scaffolds and
R-groups account for the majority of database compounds and that most of the
scaffolds occur only once or twice in the compound databases. Diversity
analysis suggests that the compound and scaffold databases have similar
molecular diversity. Implications for library design are discussed.
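
The kind of distribution statistic summarised above can be illustrated with a minimal sketch. It assumes each compound has already been reduced to a scaffold identifier; the function name `scaffold_coverage`, the 50% coverage threshold, and the toy labels are hypothetical and are not taken from the study or from either database.

```python
from collections import Counter

def scaffold_coverage(scaffold_per_compound, fraction=0.5):
    """Summarise a scaffold frequency distribution (illustrative only).

    scaffold_per_compound: one scaffold identifier per compound (assumed input).
    Returns (n_top, n_rare, n_total):
      n_top   - how many of the most frequent scaffolds are needed to cover
                at least `fraction` of all compounds
      n_rare  - how many scaffolds occur only once or twice
      n_total - number of distinct scaffolds
    """
    counts = Counter(scaffold_per_compound)
    total = sum(counts.values())

    covered = 0
    n_top = 0
    # Walk scaffolds from most to least frequent until the target is reached.
    for _, n in counts.most_common():
        if covered >= fraction * total:
            break
        covered += n
        n_top += 1

    n_rare = sum(1 for n in counts.values() if n <= 2)
    return n_top, n_rare, len(counts)

# Toy data: 12 compounds spread over 5 hypothetical scaffolds.
toy = ["A"] * 6 + ["B"] * 3 + ["C", "D", "E"]
n_top, n_rare, n_total = scaffold_coverage(toy)
print(n_top, n_rare, n_total)
```

In this toy set a single scaffold covers half of the compounds while three of the five scaffolds occur only once, which mirrors the skewed shape reported above without reproducing any actual figures.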