In order to better understand the common features present in drug
molecules, we use shape description methods to analyze a database of
commercially available drugs and prepare a list of common drug shapes.
A useful way of organizing this structural data is to group the atoms
of each drug molecule into ring, linker, framework, and side chain
atoms.
On the basis of the two-dimensional molecular structures (without
regard to atom type, hybridization, and bond order), there are 1179
different frameworks among the 5120 compounds analyzed. However, the
shapes of half of the drugs in the database are described by the 32
most frequently occurring frameworks. This suggests that the diversity
of shapes in the set of known drugs is extremely low. In our second
method of analysis, in which atom type, hybridization, and bond order
are considered, more diversity is seen; there are 2506 different
frameworks among the 5120 compounds in the database, and the 42 most
frequently occurring frameworks account for only one-fourth of the
drugs. We discuss the possible interpretations of these findings and
the way they may be used to guide future drug discovery research.
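
As an illustration only, and not necessarily the procedure used in this work, the graph-level idea can be sketched in a few lines of Python. The sketch assumes that a framework is what remains of the molecular graph after side chain atoms are stripped away, that is, the ring atoms plus the linker atoms joining them. The function names graph_framework and frameworks_needed, the bond-list input format, and the toy framework labels are all hypothetical. Comparing frameworks across molecules would also require a canonical graph representation, which is omitted here.

```python
from collections import Counter


def graph_framework(bonds):
    """Return the atoms left after repeatedly removing terminal atoms.

    `bonds` is an iterable of (atom_a, atom_b) index pairs. Atom type,
    hybridization, and bond order are ignored (graph-only view).
    Assumption: framework = ring atoms + linker atoms, i.e. the graph
    with all side chain atoms pruned away.
    """
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Peel off atoms with at most one neighbor until none remain.
    terminal = [atom for atom, nbrs in neighbors.items() if len(nbrs) <= 1]
    while terminal:
        atom = terminal.pop()
        if atom not in neighbors:  # already removed via a duplicate entry
            continue
        for nbr in neighbors.pop(atom):
            neighbors[nbr].discard(atom)
            if len(neighbors[nbr]) <= 1:
                terminal.append(nbr)
    return set(neighbors)


def frameworks_needed(framework_ids, fraction):
    """Smallest number of most-frequent frameworks covering `fraction`
    of the compounds, given one framework identifier per compound."""
    counts = Counter(framework_ids)
    target = fraction * len(framework_ids)
    covered = 0
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered >= target:
            return rank
    return 0


# Hypothetical toy molecule: two three-membered rings (atoms 0-2 and 4-6)
# joined through linker atom 3, with a two-atom side chain (7, 8) on atom 3.
toy_bonds = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4),
             (4, 5), (5, 6), (6, 4), (3, 7), (7, 8)]
toy_framework = graph_framework(toy_bonds)

# Hypothetical toy "database": one framework label per compound.
toy_labels = ["A"] * 5 + ["B"] * 3 + ["C", "D"]
top_for_half = frameworks_needed(toy_labels, 0.5)
```

Applied to a real collection, frameworks_needed(labels, 0.5) is the kind of count behind a statement such as "half of the drugs are described by the N most frequently occurring frameworks"; the toy data above carry no chemical meaning.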