The QSAR method, which uses multivariate statistics, was developed by
Hansch and Fujita and has been applied successfully to many drug and
agrochemical design problems. In addition to speed and simplicity, QSAR has
the advantage of being able to account for some of the transport and
metabolic processes that occur once the compound is administered.
Until recently, QSAR analyses have used relatively simple molecular
descriptors based on substituent constants (e.g., Hammett constants, pi, or
molar refractivities), physicochemical properties (e.g., partition
coefficients), or topological indices (e.g., Randic and Wiener indices).
Recently, several new representations have been devised: atomistic;
molecular eigenvalues and BCUT indices derived therefrom; E-state fields;
topological autocorrelation vectors; and various molecular fragment-based
hash codes. Compared with traditional representations, these have
advantages in speed of computation, in representing more accurately the
molecular properties most relevant to activity, or in being more generally
applicable to diverse chemical classes acting at a common receptor.
Historically, linear regression methods such as MLR (multiple linear
regression) and PLS (partial least squares) have been used to develop QSAR
models. Regression is an "ill-posed" problem in statistics, which sometimes
results in QSAR models that are unstable when trained with noisy data. In
addition, traditional regression techniques often require the investigator
to make subjective decisions about the likely non-linear relationship
between structure and activity, and about whether cross-terms are present.
Regression methods based on neural networks offer some advantages over MLR
methods: they can account for non-linear SARs and can deal with the linear
dependencies which sometimes appear in real SAR problems. However, some
problems remain in the development of SAR models using conventional
backpropagation neural networks.
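The instability mentioned above can be made concrete with a small numerical
sketch. It is an illustration only, not part of the work described here: the
two nearly collinear "descriptors", the noise level, and the ridge penalty
`lam` are all assumed values, and a ridge penalty is used simply as the most
familiar form of regularization.

```python
import numpy as np

# Synthetic data (assumed for illustration): two almost identical descriptors.
rng = np.random.default_rng(0)
n = 30
x1 = rng.normal(size=n)
x2 = x1 + 1e-3 * rng.normal(size=n)      # nearly a copy of x1
X = np.column_stack([x1, x2])
y = x1 + x2 + 0.1 * rng.normal(size=n)   # noisy "activity"


def fit_ols(X, y):
    """Ordinary least squares (MLR without an intercept)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def fit_ridge(X, y, lam):
    """Least squares with an L2 penalty lam * ||w||^2 on the coefficients."""
    n_params = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_params), X.T @ y)


w_ols = fit_ols(X, y)
w_ridge = fit_ridge(X, y, lam=1.0)

print("OLS coefficients:  ", w_ols)
print("ridge coefficients:", w_ridge)
```

With collinear inputs the unpenalized coefficients are poorly determined
along the direction that separates the two descriptors, so noise in the
activities can push them far from the underlying values. The penalized
coefficients are pulled back toward a shared, smaller solution.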
We have used a specific type of neural network, the Bayesian Regularized
Artificial Neural Network (BRANN), in the development of SAR models. The
advantage of BRANN is that the models are robust and the validation
process, which scales as O(N^2) in normal regression methods, is
unnecessary. These networks have the potential to solve a number of
problems which arise in QSAR modelling, such as choice of model, robustness
of model, choice of validation set, size of validation effort, and
optimization of network architecture. The application of the methods to
QSAR of compounds active at the benzodiazepine and muscarinic receptors
will be illustrated.
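To give a feel for how Bayesian regularization can set the strength of the
penalty from the training data alone, the sketch below applies
evidence-style hyperparameter updates to a linear model. This is a
simplified stand-in, not the BRANN itself: the function name
`evidence_ridge`, the synthetic data, and the fixed iteration count are
assumptions made for illustration, and a neural network would replace the
linear map `X @ w`.

```python
import numpy as np


def evidence_ridge(X, y, n_iter=50):
    """Evidence-style updates for a linear model y ~ X @ w (illustrative).

    alpha: precision of a zero-mean Gaussian prior on the weights.
    beta:  precision of the Gaussian noise on the targets.
    gamma: effective number of well-determined parameters.
    """
    n_samples, n_params = X.shape
    alpha, beta = 1.0, 1.0
    # Eigenvalues of X^T X; clipped to guard against tiny negative round-off.
    eig = np.clip(np.linalg.eigvalsh(X.T @ X), 0.0, None)
    for _ in range(n_iter):
        # Most probable weights for the current alpha and beta.
        A = alpha * np.eye(n_params) + beta * (X.T @ X)
        w = beta * np.linalg.solve(A, X.T @ y)
        # Effective number of parameters determined by the data.
        lam = beta * eig
        gamma = float(np.sum(lam / (lam + alpha)))
        # Re-estimate the hyperparameters from the weight and data errors.
        e_w = 0.5 * float(w @ w)
        e_d = 0.5 * float(np.sum((y - X @ w) ** 2))
        alpha = gamma / (2.0 * e_w)
        beta = (n_samples - gamma) / (2.0 * e_d)
    return w, alpha, beta, gamma


# Synthetic example (assumed values): 40 compounds, 5 descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
true_w = np.array([1.5, -2.0, 0.0, 0.0, 0.5])
y = X @ true_w + 0.3 * rng.normal(size=40)

w, alpha, beta, gamma = evidence_ridge(X, y)
print("weights:", w)
print("alpha:", alpha, "beta:", beta, "gamma:", gamma)
```

No separate validation set is used to choose the penalty in this sketch;
the ratio `alpha / beta` plays the role of the regularization constant and
is re-estimated from the training data on every pass.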