HIGH STATISTICS BLOCK ENTROPY MEASURES OF DNA-SEQUENCES

Citation
P. Lio et al., HIGH STATISTICS BLOCK ENTROPY MEASURES OF DNA-SEQUENCES, Journal of Theoretical Biology, 180(2), 1996, pp. 151-160
Number of citations
35
Subject Categories
Biology Miscellaneous
ISSN journal
0022-5193
Volume
180
Issue
2
Year of publication
1996
Pages
151 - 160
Database
ISI
SICI code
0022-5193(1996)180:2<151:HSBEMO>2.0.ZU;2-T
Abstract
We have used an improved block-entropy measure in order to gain some further insights into the short-range correlations present in whole chromosomes of S. cerevisiae, viruses and organelles and very large genomic regions of E. coli. Although DNA sequences are largely inhomogeneous and word frequencies are unevenly distributed, the comparison of entire chromosomes and large genomic regions shows a "bulk" composition homogeneity. This property suggests that biases in selection, directional mutational pressure and recombination processes act in homogenizing the base composition of the DNA molecules within a genome, but their mode of action, relative impact and direction may vary in different organisms. The most interesting results appear to be the differences between the SW (C,G/A,T) and RY (A,G/C,T) two-letter alphabet entropies. Deviations from randomness in E. coli and S. cerevisiae sequences particularly concern SW dinucleotide frequencies and RY tetranucleotide frequencies. (C) 1996 Academic Press Limited
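
Illustrative sketch (not part of the original record): the abstract compares block entropies of DNA recoded into the SW (C,G/A,T) and RY (A,G/C,T) two-letter alphabets. The Python below shows one way such a quantity could be computed. It uses the plain plug-in Shannon estimator over overlapping blocks, which is an assumption made here for illustration; the "improved" measure used by the authors is not specified in the abstract and is not reproduced. The names `recode` and `block_entropy` are hypothetical.

```python
import math
from collections import Counter

# Two-letter groupings named in the abstract.
SW = {"C": "S", "G": "S", "A": "W", "T": "W"}  # strong / weak pairing
RY = {"A": "R", "G": "R", "C": "Y", "T": "Y"}  # purine / pyrimidine


def recode(seq, mapping):
    """Map a DNA string onto a two-letter alphabet, skipping unknown symbols."""
    return "".join(mapping[base] for base in seq.upper() if base in mapping)


def block_entropy(seq, n):
    """Plug-in Shannon entropy (in bits) of overlapping blocks of length n.

    Assumption: this is the naive frequency-count estimator, which is biased
    for short sequences; it is not the improved estimator from the paper.
    """
    total = len(seq) - n + 1
    if total <= 0:
        raise ValueError("sequence is shorter than the block length")
    counts = Counter(seq[i:i + n] for i in range(total))
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    dna = "ACGTTGCAAGCTTAGC"  # made-up toy sequence, not real genomic data
    for name, mapping in (("SW", SW), ("RY", RY)):
        coded = recode(dna, mapping)
        for n in (2, 4):  # dinucleotide and tetranucleotide block sizes
            print(name, n, block_entropy(coded, n))
```

For a random two-letter sequence with equal symbol frequencies, the entropy of blocks of length n approaches n bits, so values noticeably below n on long sequences indicate deviation from randomness at that block size.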