Characterization of DNA primary sequences based on the average distances between bases

Citation
M. Randic and S.C. Basak, Characterization of DNA primary sequences based on the average distances between bases, J CHEM INF, 41(3), 2001, pp. 561-568
Number of citations
74
Subject categories
Chemistry
Journal title
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES
Journal ISSN
0095-2338
Volume
41
Issue
3
Year of publication
2001
Pages
561 - 568
Database
ISI
SICI code
0095-2338(200105/06)41:3<561:CODPSB>2.0.ZU;2-S
Abstract
We outline a numerical characterization of DNA primary sequences based on calculation of the average distance between pairs of nucleic acid bases. This leads to a representation of DNA by a condensed 4 x 4 symmetrical matrix, the elements of which give the average separation between a pair of bases X, Y in DNA (X, Y = A, C, G, T). As an invariant of choice we consider the leading eigenvalue of the derived 4 x 4 matrix. Additional structurally related invariants were obtained by constructing "higher order" 4 x 4 matrices derived from the initial 4 x 4 matrix by raising its elements to higher powers. Suitably normalized leading eigenvalues of these matrices offer a novel characterization of DNA primary sequences, referred to as "DNA profiles". The approach is illustrated on exon 1 of the human beta-globin gene.
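
The construction described in the abstract can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. The abstract does not give the exact definition of "average distance", so the sketch assumes it is the mean of |i - j| over all position pairs carrying bases X and Y (with i < j when X = Y, and 0 when no such pair exists). It also assumes "raising its elements to higher powers" means element-wise powers. The normalization of the eigenvalues is not specified in the abstract, so the sketch returns unnormalized values. All function names (`average_distance_matrix`, `elementwise_power`, `leading_eigenvalue`, `dna_profile`) are hypothetical, and the sequence in the usage lines is an arbitrary toy string, not the beta-globin exon.

```python
from itertools import combinations, product

BASES = "ACGT"  # row/column order of the 4 x 4 matrix


def average_distance_matrix(seq):
    """Return a 4 x 4 symmetric matrix of average base-to-base distances.

    Assumption: the distance between two positions is |i - j|, averaged over
    all pairs of positions holding bases X and Y; 0.0 if no pair exists.
    """
    seq = seq.upper()
    positions = {b: [i for i, ch in enumerate(seq) if ch == b] for b in BASES}
    matrix = [[0.0] * len(BASES) for _ in BASES]
    for r, x in enumerate(BASES):
        for c, y in enumerate(BASES):
            if x == y:
                # Positions are ascending, so j - i is always positive here.
                gaps = [j - i for i, j in combinations(positions[x], 2)]
            else:
                gaps = [abs(i - j) for i, j in product(positions[x], positions[y])]
            matrix[r][c] = sum(gaps) / len(gaps) if gaps else 0.0
    return matrix


def elementwise_power(matrix, k):
    """Raise every element of the matrix to the power k (assumed reading)."""
    return [[value ** k for value in row] for row in matrix]


def leading_eigenvalue(matrix, iterations=2000, tol=1e-12):
    """Estimate the largest eigenvalue of a symmetric non-negative matrix.

    Uses power iteration on (matrix + I); the unit shift avoids oscillation
    when the diagonal is zero. The Rayleigh quotient is taken with respect
    to the original, unshifted matrix.
    """
    n = len(matrix)
    vec = [1.0] * n
    estimate = 0.0
    for _ in range(iterations):
        new = [sum(matrix[r][c] * vec[c] for c in range(n)) + vec[r] for r in range(n)]
        norm = sum(v * v for v in new) ** 0.5
        new = [v / norm for v in new]
        mv = [sum(matrix[r][c] * new[c] for c in range(n)) for r in range(n)]
        rayleigh = sum(new[r] * mv[r] for r in range(n))
        if abs(rayleigh - estimate) <= tol * max(1.0, abs(rayleigh)):
            return rayleigh
        estimate, vec = rayleigh, new
    return estimate


def dna_profile(seq, max_power=5):
    """Unnormalized leading eigenvalues for element-wise powers 1..max_power."""
    base = average_distance_matrix(seq)
    return [leading_eigenvalue(elementwise_power(base, k)) for k in range(1, max_power + 1)]


if __name__ == "__main__":
    toy = "ACGTACGGTA"  # arbitrary toy sequence, for illustration only
    for row in average_distance_matrix(toy):
        print(row)
    print(dna_profile(toy, max_power=3))
```

Power iteration is used here only to keep the sketch free of third-party dependencies; a standard symmetric eigenvalue solver from a numerical library would serve equally well for a 4 x 4 matrix.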