COMPOSITIONAL SEGMENTATION AND LONG-RANGE FRACTAL CORRELATIONS IN DNA-SEQUENCES

Citation
P. Bernaolagalvan et al., COMPOSITIONAL SEGMENTATION AND LONG-RANGE FRACTAL CORRELATIONS IN DNA-SEQUENCES, Physical review. E, Statistical physics, plasmas, fluids, and related interdisciplinary topics, 53(5), 1996, pp. 5181-5189
Number of citations
31
Subject Categories
Physics, Mathematical; Physics, Fluids & Plasmas
ISSN journal
1063651X
Volume
53
Issue
5
Year of publication
1996
Part
B
Pages
5181 - 5189
Database
ISI
SICI code
1063-651X(1996)53:5<5181:CSALFC>2.0.ZU;2-R
Abstract
A segmentation algorithm based on the Jensen-Shannon entropic divergence is used to decompose long-range correlated DNA sequences into statistically significant, compositionally homogeneous patches. By adequately setting the significance level for segmenting the sequence, the underlying power-law distribution of patch lengths can be revealed. Some of the identified DNA domains were uncorrelated, but most of them continued to display long-range correlations even after several steps of recursive segmentation, thus indicating a complex multi-length-scaled structure for the sequence. On the other hand, by separately shuffling each segment, or by randomly rearranging the order in which the different segments occur in the sequence, shuffled sequences preserving the original statistical distribution of patch lengths were generated. Both types of random sequences displayed the same correlation scaling exponents as the original DNA sequence, thus demonstrating that neither the internal structure of patches nor the order in which these are arranged in the sequence is critical; therefore, long-range correlations in nucleotide sequences seem to rely only on the power-law distribution of patch lengths.
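
Illustrative sketch
The abstract gives no formulas, so the following is only a minimal sketch of one segmentation step, assuming the common length-weighted form of the Jensen-Shannon divergence between the two sides of a cut. The function names (shannon_entropy, js_divergence_at, best_cut) are hypothetical, and the statistical significance test that decides whether a cut is accepted is not reproduced here.

```python
import math
from collections import Counter


def shannon_entropy(counts, total):
    """Shannon entropy (in bits) of a composition given as symbol counts."""
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts.values():
        if c > 0:
            p = c / total
            h -= p * math.log2(p)
    return h


def js_divergence_at(seq, i):
    """Length-weighted Jensen-Shannon divergence between seq[:i] and seq[i:].

    Assumed form (not taken from the record):
    D = H(whole) - (i/n) * H(left) - ((n-i)/n) * H(right)
    """
    n = len(seq)
    left, right = Counter(seq[:i]), Counter(seq[i:])
    whole = left + right
    return (
        shannon_entropy(whole, n)
        - (i / n) * shannon_entropy(left, i)
        - ((n - i) / n) * shannon_entropy(right, n - i)
    )


def best_cut(seq):
    """Return (position, divergence) of the cut with maximal divergence.

    Brute force for clarity: counts are recomputed at every position,
    so the cost is quadratic in the sequence length.
    """
    best_i, best_d = None, -1.0
    for i in range(1, len(seq)):
        d = js_divergence_at(seq, i)
        if d > best_d:
            best_i, best_d = i, d
    return best_i, best_d


if __name__ == "__main__":
    # Toy sequence with two compositionally distinct halves.
    print(best_cut("A" * 20 + "G" * 20))
```

In a recursive segmentation of the kind the abstract describes, each accepted cut would be followed by the same search on the two resulting segments, stopping when no cut reaches the chosen significance level.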