WAVELET-BASED FRACTAL ANALYSIS OF DNA-SEQUENCES

Citation
A. Arneodo et al., WAVELET-BASED FRACTAL ANALYSIS OF DNA-SEQUENCES, Physica. D, 96(1-4), 1996, pp. 291-320
Citations number
147
Subject Categories
Mathematical Method, Physical Science; Physics; Physics, Mathematical
Journal title
ISSN journal
01672789
Volume
96
Issue
1-4
Year of publication
1996
Pages
291 - 320
Database
ISI
SICI code
0167-2789(1996)96:1-4<291:WFAOD>2.0.ZU;2-2
Abstract
The fractal scaling properties of DNA sequences are analyzed using the wavelet transform. Mapping nucleotide sequences onto a "DNA walk" produces fractal landscapes that can be studied quantitatively by applying the so-called wavelet transform modulus maxima method. This method provides a natural generalization of the classical box-counting techniques to fractal signals, the wavelets playing the role of "generalized oscillating boxes". From the scaling behavior of partition functions that are defined from the wavelet transform modulus maxima, this method allows us to determine the singularity spectrum of the considered signal and thereby to achieve a complete multifractal analysis. Moreover, by considering analyzing wavelets that make the "wavelet transform microscope" blind to "patches" of different nucleotide composition that are observed in genomic sequences, we demonstrate and quantify the existence of long-range correlations in the noncoding regions. Although the fluctuations in the patchy landscape of the DNA walks reconstructed from both noncoding and (protein) coding regions are found homogeneous with Gaussian statistics, our wavelet-based analysis allows us to discriminate unambiguously between the fluctuations of the former, which behave like fractional Brownian motions, from those of the latter, which cannot be distinguished from uncorrelated random Brownian walks. We discuss the robustness of these results with respect to various legitimate codings of the DNA sequences. Finally, we comment about the possible understanding of the origin of the observed long-range correlations in noncoding DNA sequences in terms of the nonequilibrium dynamical processes that produce the "isochore structure of the genome".
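Illustrative sketch (not part of the original record): the "DNA walk" mapping mentioned in the abstract can be pictured with a few lines of Python. The coding rule below (purines step +1, pyrimidines step -1) is an assumption chosen for illustration; the abstract only says that several legitimate codings were examined. The exponent estimate uses a plain increment-fluctuation fit, which is a much simpler stand-in for the wavelet transform modulus maxima method and does not remove compositional patchiness. The function names `dna_walk` and `fluctuation_exponent` are hypothetical.

```python
import numpy as np

# Assumed coding rule (hypothetical, for illustration only):
# purines (A, G) step +1, pyrimidines (C, T) step -1.
STEP = {"A": 1, "G": 1, "C": -1, "T": -1}

def dna_walk(sequence):
    """Return the cumulative sum of +/-1 steps for a nucleotide string."""
    steps = np.array([STEP[base] for base in sequence.upper()], dtype=float)
    return np.cumsum(steps)

def fluctuation_exponent(walk, scales):
    """Fit F(l) ~ l**H, where F(l) is the std of walk increments at lag l.

    This is a simple increment-based estimate, NOT the wavelet-based
    method described in the abstract.
    """
    fluct = [np.std(walk[lag:] - walk[:-lag]) for lag in scales]
    slope, _intercept = np.polyfit(np.log(scales), np.log(fluct), 1)
    return slope

# Synthetic, uncorrelated sequence used as a reference case.
rng = np.random.default_rng(0)
random_seq = "".join(rng.choice(list("ACGT"), size=50000))
walk = dna_walk(random_seq)
H = fluctuation_exponent(walk, scales=[4, 8, 16, 32, 64, 128])
print(f"estimated exponent: {H:.2f}")
```

For an uncorrelated synthetic sequence such as this one, the fitted exponent should sit near 0.5, the value for an ordinary Brownian walk; an exponent clearly different from 0.5 on real data would be consistent with the fractional-Brownian behavior the abstract reports for noncoding regions, subject to the caveats above.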