N. Tolstrup et al., Neural network model of the genetic code is strongly correlated to the GES scale of amino acid transfer free energies, Journal of Molecular Biology, 243(5), 1994, pp. 816-820
A neural network trained to classify the 61 nucleotide triplets of the
genetic code into 20 amino acid categories develops in its internal
representation a pattern matching the relative cost of transferring
amino acids with satisfied backbone hydrogen bonds from water to an
environment with a dielectric constant of roughly 2.0. Such environments
are typically found in lipid membranes or in the interior of proteins.
In learning the mapping between the codons and the categories, the
network groups the amino acids according to the scale of transfer free
energies developed by Engelman, Goldman and Steitz. Several other scales
based on internal preference statistics also agree reasonably well with
the network grouping. The network is able to relate the structure of the
genetic code to quantifications of amino acid
hydrophobicity-hydrophilicity more systematically than the numerous
earlier attempts. Due to its inherent non-linearity, the code is also
shown to impose decisive constraints on algorithmic analysis of the
protein coding potential of DNA.
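
The classification task described above (61 sense codons mapped to 20 amino acid categories) can be illustrated with a minimal Python sketch that builds the training pairs from the standard genetic code. The sparse one-hot encoding of each nucleotide, and all names such as `encode_codon` and `DATASET`, are assumptions made for illustration; the abstract does not specify the input encoding, the network architecture, or the training procedure, and none of those are reproduced here.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (first base varies slowest). '*' marks the three stop codons.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# All 64 triplets mapped to a one-letter amino acid code or '*'.
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Drop the stop codons to obtain the 61 sense codons.
SENSE_CODONS = {c: aa for c, aa in CODON_TABLE.items() if aa != "*"}

# The 20 output categories, in a fixed (alphabetical) order.
CLASSES = sorted(set(SENSE_CODONS.values()))


def encode_codon(codon):
    """One-hot encode a triplet as 12 bits, 4 per nucleotide (assumed encoding)."""
    bits = []
    for base in codon:
        bits.extend(1 if base == b else 0 for b in BASES)
    return bits


def encode_label(aa):
    """One-hot encode an amino acid as a 20-element target vector."""
    return [1 if aa == c else 0 for c in CLASSES]


# (input, target) pairs a classifier could be trained on.
DATASET = [(encode_codon(c), encode_label(aa)) for c, aa in SENSE_CODONS.items()]

if __name__ == "__main__":
    print(len(SENSE_CODONS), "sense codons,", len(CLASSES), "amino acid classes")
```

Any classifier with a hidden layer could be fitted to `DATASET`; comparing its learned hidden-unit activations per amino acid against a hydrophobicity scale would be the analogue of the analysis the abstract reports, although the scale values themselves are not included here.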