Long- and short-range correlations in genome organization

Citation
Y. Almirantis and A. Provata, Long- and short-range correlations in genome organization, J STAT PHYS, 97(1-2), 1999, pp. 233-262
Number of citations
26
Subject categories
Physics
Journal title
JOURNAL OF STATISTICAL PHYSICS
Journal ISSN
0022-4715
Volume
97
Issue
1-2
Year of publication
1999
Pages
233 - 262
Database
ISI
SICI code
0022-4715(199910)97:1-2<233:LASCIG>2.0.ZU;2-W
Abstract
We study the size distribution of coding and non-coding regions in DNA sequences. For most organisms we observe that the size distribution $P_c(S)$ of the coding regions of size $S$ shows a short-range distribution, whereas the size distribution of the non-coding regions follows a power-law decay $P_{nc}(S) \sim S^{-1-\mu}$, with power exponents indicating clear long-range behavior. We argue, using the Generalized Central Limit Theorem, that the long-range distributions observed in the non-coding regions are related to the lower-level clustering of purines and pyrimidines (1D islands), which follow similar long-range laws. We also address the question of clustering of coding segments in the two complementary strands of DNA. We observe a short-range clustering of coding regions in both strands, expressed by an exponential decay in the clustering size distribution. The decay exponent expresses the degree of short-range correlations and the deviation from random clustering.
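The two decay laws named in the abstract can be told apart numerically. The following is a minimal sketch, not the authors' procedure: it assumes a continuous power-law tail $P(S) \sim S^{-1-\mu}$ above a cutoff and a pure exponential decay, and it estimates each parameter by maximum likelihood. The function names `hill_exponent` and `exponential_rate`, the cutoff value 10.0 and the synthetic samples are all illustrative assumptions. No genomic data are used.

```python
import math
import random


def hill_exponent(sizes, s_min):
    """Maximum-likelihood (Hill) estimate of mu for a continuous tail
    P(S) ~ S^(-1-mu), using only the sizes S >= s_min."""
    tail = [s for s in sizes if s >= s_min]
    log_sum = sum(math.log(s / s_min) for s in tail)
    if log_sum <= 0.0:
        raise ValueError("need at least one size strictly above s_min")
    return len(tail) / log_sum


def exponential_rate(sizes):
    """Maximum-likelihood estimate of lam for P(S) ~ exp(-lam * S):
    the inverse of the sample mean."""
    total = sum(sizes)
    if total <= 0.0:
        raise ValueError("sizes must have a positive sum")
    return len(sizes) / total


# Synthetic samples (assumption: stand-ins for measured segment sizes).
rng = random.Random(0)
true_mu = 1.5      # hypothetical power-law exponent
true_lam = 0.01    # hypothetical exponential decay rate
s_min = 10.0       # hypothetical lower cutoff of the power-law regime

# Inverse-transform sampling of a Pareto tail: S = s_min * U^(-1/mu),
# with U in (0, 1] so the power is always defined.
power_sample = [s_min * (1.0 - rng.random()) ** (-1.0 / true_mu)
                for _ in range(20000)]
exp_sample = [rng.expovariate(true_lam) for _ in range(20000)]

mu_hat = hill_exponent(power_sample, s_min)
lam_hat = exponential_rate(exp_sample)
print(f"estimated mu  = {mu_hat:.3f}")
print(f"estimated lam = {lam_hat:.5f}")
```

Real segment sizes are integers, and the power-law regime usually holds only above some cutoff, so on actual sequence data the cutoff would have to be chosen with care and a discrete estimator may be more appropriate.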