Literature data on compounds both well- and poorly-absorbed in humans were
used to build a statistical pattern recognition model of passive intestinal
absorption. Robust outlier detection was utilized to analyze the well-absorbed
compounds, some of which were intermingled with the poorly-absorbed
compounds in the model space. Outliers were identified as being actively
transported. The descriptors chosen for inclusion in the model were PSA and
AlogP98, based on consideration of the physical processes involved in membrane
permeability and the interrelationships and redundancies between available
descriptors. These descriptors are quite straightforward for a medicinal
chemist to interpret, enhancing the utility of the model. Molecular weight,
while often used in passive absorption models, was shown to be superfluous,
as it is already a component of both PSA and AlogP98. Extensive validation
of the model on hundreds of known orally delivered drugs, "drug-like"
molecules, and Pharmacopeia, Inc. compounds, which had been assayed for Caco-2
cell permeability, demonstrated a good rate of successful predictions
(74-92%, depending on the dataset and exact criterion used).
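
As an illustration of the outlier-detection idea in a two-descriptor space, the following is a minimal sketch only. The abstract does not state which robust method was used, so a simple median/MAD (median absolute deviation) score is assumed here in its place. The function names, the 3.5 cutoff, the compound labels, and all descriptor values are hypothetical and are not taken from the study.

```python
from statistics import median


def robust_z(values):
    """Robust z-scores based on the median and the scaled MAD.

    The factor 1.4826 makes the MAD comparable to a standard deviation
    for normally distributed data.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median: nothing can be scored as an outlier.
        return [0.0 for _ in values]
    scale = 1.4826 * mad
    return [(v - med) / scale for v in values]


def flag_outliers(compounds, threshold=3.5):
    """Return names of compounds whose PSA or AlogP98 lies far from the bulk.

    `threshold` is an assumed cutoff, not a value from the source.
    """
    psa_z = robust_z([c["psa"] for c in compounds])
    logp_z = robust_z([c["alogp98"] for c in compounds])
    return [
        c["name"]
        for c, zp, zl in zip(compounds, psa_z, logp_z)
        if max(abs(zp), abs(zl)) > threshold
    ]


# Invented descriptor values for six hypothetical well-absorbed compounds.
compounds = [
    {"name": "cpd_A", "psa": 60.0, "alogp98": 2.1},
    {"name": "cpd_B", "psa": 75.0, "alogp98": 1.5},
    {"name": "cpd_C", "psa": 82.0, "alogp98": 3.0},
    {"name": "cpd_D", "psa": 55.0, "alogp98": 2.6},
    {"name": "cpd_E", "psa": 68.0, "alogp98": 1.9},
    {"name": "cpd_F", "psa": 210.0, "alogp98": 2.2},
]

flagged = flag_outliers(compounds)
```

In this toy set only cpd_F, whose PSA lies far from the rest, is flagged. A compound flagged this way among the well-absorbed set would be a candidate for closer inspection (for example, for active transport), not a confirmed case.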