D. Malacarne et al., RELATIONSHIP BETWEEN MOLECULAR CONNECTIVITY AND CARCINOGENIC ACTIVITY: A CONFIRMATION WITH A NEW SOFTWARE PROGRAM BASED ON GRAPH THEORY, Environmental Health Perspectives, 101(4), 1993, pp. 332-342
For a database of 826 chemicals tested for carcinogenicity, we fragmented
the structural formula of the chemicals into all possible contiguous-atom
fragments with size between two and eight (nonhydrogen) atoms.
The fragmentation was performed with a new software program based on
graph theory. We used 80% of the chemicals as a training set and 20% as
a test set; the two sets were obtained by random sorting. From the
training sets, an average (over 8 computer runs with independently
sorted chemicals) of 315 different fragments were significantly
(p<0.125) associated with carcinogenicity or lack thereof. Even at this
relatively low level of statistical significance, 23% of the molecules
in the test sets lacked significant fragments. For the remaining 77% of
the test-set molecules, we used the presence of significant fragments to
predict carcinogenicity. The average accuracy of the predictions in the
test sets was 67.5%. Chemicals containing only positive fragments were
predicted with an accuracy of 78.7%. Accuracy was around 60% for
chemicals characterized by contradictory fragments or only negative
fragments. In parallel, we performed eight paired runs in which
carcinogenicity was attributed randomly to the molecules of the training
sets. The fragments generated by these pseudotraining sets were devoid
of any predictivity in the corresponding test sets. Using an independent
software program, we thus confirmed (for the complex biological endpoint
of carcinogenicity) the validity of a structure-activity relationship
approach of the type proposed by Klopman and Rosenkranz with their CASE
program.
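
The workflow summarized in the abstract (enumerate contiguous-atom fragments, flag those associated with activity or inactivity in the training set, then sort test molecules by the fragments they contain) can be sketched roughly as follows. This is an illustrative sketch, not the authors' program. Several points are assumptions not stated in the abstract: a fragment is taken to be any connected subset of 2 to 8 nonhydrogen atoms; the fragment key is a simplified label (sorted atom symbols plus sorted bonded element pairs), not a true canonical graph form, so distinct fragments can collide; the significance test is a one-sided binomial tail against the training-set base rate; and all function names are hypothetical. Only the 2 to 8 atom range and the p<0.125 threshold come from the text.

```python
from math import comb


def connected_subsets(adj, min_size=2, max_size=8):
    """Return all connected atom subsets with min_size..max_size atoms.

    adj maps each atom index to the set of its bonded neighbours.
    Subsets are grown one neighbouring atom at a time.
    """
    current = {frozenset([a]) for a in adj}
    found = set()
    for _ in range(1, max_size):
        grown = set()
        for sub in current:
            for a in sub:
                for b in adj[a]:
                    if b not in sub:
                        grown.add(sub | {b})
        current = grown
        found |= {s for s in current if len(s) >= min_size}
    return found


def fragment_keys(atoms, bonds, min_size=2, max_size=8):
    """Map a molecule (atom symbols + bonded index pairs) to fragment keys.

    ASSUMPTION: the key is a simplified descriptor, not a canonical form.
    """
    adj = {i: set() for i in range(len(atoms))}
    for i, j in bonds:
        adj[i].add(j)
        adj[j].add(i)
    keys = set()
    for sub in connected_subsets(adj, min_size, max_size):
        labels = tuple(sorted(atoms[i] for i in sub))
        edges = tuple(sorted(
            tuple(sorted((atoms[i], atoms[j])))
            for i, j in bonds if i in sub and j in sub
        ))
        keys.add((labels, edges))
    return keys


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k, n + 1))


def significant_fragments(training, alpha=0.125):
    """Split fragments into positive/negative sets.

    training is a list of (fragment_key_set, is_carcinogen) pairs.
    ASSUMPTION: one-sided binomial test against the training base rate.
    """
    base = sum(label for _, label in training) / len(training)
    counts = {}
    for keys, label in training:
        for key in keys:
            n, k = counts.get(key, (0, 0))
            counts[key] = (n + 1, k + int(label))
    positive, negative = set(), set()
    for key, (n, k) in counts.items():
        if upper_tail(k, n, base) < alpha:
            positive.add(key)
        elif upper_tail(n - k, n, 1 - base) < alpha:
            negative.add(key)
    return positive, negative


def classify(keys, positive, negative):
    """Place a test molecule in one of the four groups named in the abstract."""
    has_pos = bool(keys & positive)
    has_neg = bool(keys & negative)
    if has_pos and has_neg:
        return "contradictory"
    if has_pos:
        return "positive only"
    if has_neg:
        return "negative only"
    return "no significant fragments"
```

For example, `fragment_keys(["C", "C", "O"], [(0, 1), (1, 2)])` yields the keys of a three-atom chain, which can be paired with an activity label to form one training entry. The sketch deliberately stops at grouping molecules: the abstract does not say how a prediction is derived for molecules with contradictory fragments, so that step is left open.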