Comparison between physicochemical and calculated molecular descriptors

Citation
Pm. Andersson et al., Comparison between physicochemical and calculated molecular descriptors, J CHEMOMETR, 14(5-6), 2000, pp. 629-642
Number of citations
29
Subject categories
Spectroscopy/Instrumentation/Analytical Sciences
Journal title
JOURNAL OF CHEMOMETRICS
ISSN journal
0886-9383
Volume
14
Issue
5-6
Year of publication
2000
Pages
629 - 642
Database
ISI
SICI code
0886-9383(200009/12)14:5-6<629:CBPACM>2.0.ZU;2-I
Abstract
It has earlier been proven that measured physicochemical properties are useful in the selection of building blocks for combinatorial chemistry as well as for investigation of the scope and limitations of organic reactions. However, measured physicochemical properties are only available for small subsets of reagents, starting materials or building blocks; therefore it is necessary to use calculated descriptors, and it is essential that the descriptors are relevant. The objective was to investigate whether three different descriptor data sets contained similar information about the chemical structure, with the major aim of investigating whether calculated descriptors contain information similar to that in experimental data. A total of 205 heterogeneous primary amines were characterized using three different data sets of molecular descriptor variables. The first set consisted of four physicochemical variables compiled from the literature and from chemical catalogues of commercially available chemicals. From these four descriptors together with molecular weight, three additional descriptors could be calculated, resulting in a total of eight descriptor variables in the first data set. The second data set consisted of 81 calculated molecular descriptor variables relating to size, connectivity, atom count, topology and electrotopology indices. The third data set consisted of 10 semi-empirical variables (AM1). All the calculated variables were generated using the software Tsar 3.11. The descriptor variable sets were compared using principal component analysis (PCA) and partial least squares projections to latent structures (PLS). The results show that the different descriptor sets do contain similar latent information and that the different types of calculated variables correlate well with the experimental data, making them suitable for use in, e.g., combinatorial library design. Copyright (C) 2000 John Wiley & Sons, Ltd.
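The kind of comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the data are synthetic (random numbers generated from a hypothetical two-dimensional latent structure), only the block sizes (205 compounds, 8 and 81 variables) are taken from the abstract, and canonical correlations between PCA score spaces stand in for the PLS step. All function and variable names are illustrative assumptions.

```python
import numpy as np


def autoscale(X):
    """Mean-center each column and scale it to unit variance."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


def pca_scores(X, n_components):
    """PCA scores of the autoscaled matrix, computed via SVD (T = U * S)."""
    U, S, _ = np.linalg.svd(autoscale(X), full_matrices=False)
    return U[:, :n_components] * S[:n_components]


def canonical_correlations(T_a, T_b):
    """Canonical correlations between the spaces spanned by two score matrices."""
    Qa, _ = np.linalg.qr(T_a - T_a.mean(axis=0))
    Qb, _ = np.linalg.qr(T_b - T_b.mean(axis=0))
    return np.linalg.svd(Qa.T @ Qb, compute_uv=False)


# Synthetic stand-in data: both blocks are noisy linear mixtures of the
# same hypothetical latent structure. No real descriptor values are used.
rng = np.random.default_rng(0)
n_compounds = 205
latent = rng.normal(size=(n_compounds, 2))

measured = latent @ rng.normal(size=(2, 8)) \
    + 0.1 * rng.normal(size=(n_compounds, 8))
calculated = latent @ rng.normal(size=(2, 81)) \
    + 0.1 * rng.normal(size=(n_compounds, 81))

# Project each block onto its first two principal components, then ask how
# closely the two score spaces agree.
t_measured = pca_scores(measured, 2)
t_calculated = pca_scores(calculated, 2)
corr = canonical_correlations(t_measured, t_calculated)
print(corr)
```

Canonical correlations close to 1 indicate that the two descriptor blocks span nearly the same latent space, which is the sense in which descriptor sets can be said to carry similar information; values near 0 would indicate unrelated information.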