Correlations in DNA sequences across the three domains of life

Citation
S. Guharay et al., Correlations in DNA sequences across the three domains of life, PHYSICA D, 146(1-4), 2000, pp. 388-396
Number of citations
25
Subject categories
Physics
Journal title
PHYSICA D
Journal ISSN
0167-2789
Volume
146
Issue
1-4
Year of publication
2000
Pages
388 - 396
Database
ISI
SICI code
0167-2789(20001115)146:1-4<388:CIDSAT>2.0.ZU;2-J
Abstract
We report statistical studies of correlation properties of ~7500 gene sequences, covering coding (exon) and non-coding (intron) sequences for DNA and primary amino acid sequences for proteins, across all three domains of life, namely Eukaryotes (cells with nuclei), Prokaryotes (bacteria) and Archaea (archaebacteria). Mutual information function, power spectrum and Hölder exponent analyses show exons with somewhat greater correlation content than the introns studied. These results are further confirmed with hypothesis testing. While ~30% of the Eukaryote coding sequences show distinct correlations above noise threshold, this is true for only ~10% of the Prokaryote and Archaea coding sequences. For protein sequences, we observe correlation lengths similar to that of "random" sequences. (C) 2000 Elsevier Science B.V. All rights reserved.
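
Illustrative note: the abstract names the mutual information function as one of its correlation measures but does not give a formula. A minimal sketch of one common textbook definition is I(k) = sum over symbol pairs (a, b) of p_ab(k) * log2(p_ab(k) / (p_a * p_b)), where p_ab(k) is the probability of finding a and b separated by k positions. The function name `mutual_information` and the choice to estimate the marginals from the same pair sample are assumptions made here for illustration, not details taken from the paper.

```python
from collections import Counter
from math import log2


def mutual_information(seq, k):
    """Mutual information (in bits) between symbols k positions apart.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if not 0 < k < len(seq):
        raise ValueError("k must satisfy 0 < k < len(seq)")
    # All (symbol at i, symbol at i + k) pairs in the sequence.
    pairs = list(zip(seq, seq[k:]))
    n = len(pairs)
    joint = Counter(pairs)
    # Marginal counts taken from the same pair sample (an assumption).
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    # p_ab / (p_a * p_b) simplifies to c * n / (left[a] * right[b]).
    return sum(
        (c / n) * log2(c * n / (left[a] * right[b]))
        for (a, b), c in joint.items()
    )


if __name__ == "__main__":
    dna = "ACGT" * 50  # perfectly periodic toy sequence
    for k in (1, 2, 3, 4):
        print(k, round(mutual_information(dna, k), 3))
```

For a real sequence, I(k) would typically be compared against the values obtained from shuffled copies of the same sequence to judge whether a correlation rises above noise; that comparison is one common practice and is not necessarily the threshold procedure used in the paper.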