PREDICTION OF PROBABLE GENES BY FOURIER-ANALYSIS OF GENOMIC SEQUENCES

Citation
S. Tiwari et al., PREDICTION OF PROBABLE GENES BY FOURIER-ANALYSIS OF GENOMIC SEQUENCES, Computer applications in the biosciences, 13(3), 1997, pp. 263-270
Number of citations
32
Subject Categories
Mathematical Methods, Biology & Medicine; Computer Sciences, Special Topics; Computer Science Interdisciplinary Applications; Biology Miscellaneous
ISSN journal
0266-7061
Volume
13
Issue
3
Year of publication
1997
Pages
263 - 270
Database
ISI
SICI code
0266-7061(1997)13:3<263:POPGBF>2.0.ZU;2-C
Abstract
Motivation: The major signal in coding regions of genomic sequences is a three-base periodicity. Our aim is to use Fourier techniques to analyse this periodicity, and thereby to develop a tool to recognize coding regions in genomic DNA. Result: The three-base periodicity in the nucleotide arrangement is evidenced as a sharp peak at frequency f = 1/3 in the Fourier (or power) spectrum. From extensive spectral analysis of DNA sequences of total length over 5.5 million base pairs from a wide variety of organisms (including the human genome), and by separately examining coding and non-coding sequences, we find that the relative height of the peak at f = 1/3 in the Fourier spectrum is a good discriminator of coding potential. We use this feature to detect probable coding regions in DNA sequences by examining the local signal-to-noise ratio of the peak within a sliding window. While the overall accuracy is comparable to that of other techniques currently in use, the proposed measure is independent of training sets or existing database information, and can thus find general application. Availability: A computer program, Gene Scan, which locates coding open reading frames and exonic regions in genomic sequences, has been developed and is available on request.
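
Illustration: The sliding-window measure described in the abstract can be sketched in a few lines of Python. This is not the Gene Scan program, and the details are assumptions made for illustration: each base is turned into a 0/1 indicator sequence, the four power spectra are summed, and "noise" is taken as the mean of that summed spectrum over all non-zero frequencies (obtained from Parseval's identity). The function names `peak_to_mean_ratio` and `scan`, and the default window and step sizes, are hypothetical.

```python
import cmath


def peak_to_mean_ratio(window: str) -> float:
    """Ratio of spectral power at f = 1/3 to the mean power at non-zero frequencies.

    Assumption: the spectrum is the sum of the power spectra of the four
    0/1 indicator sequences (one per base); the abstract does not give
    the exact normalization used by the authors.
    """
    n = len(window)
    if n == 0 or n % 3 != 0:
        raise ValueError("window length must be a positive multiple of 3")
    w = cmath.exp(-2j * cmath.pi / 3)
    phases = (1, w, w * w)  # exp(-2*pi*i*k/3) for k = 0, 1, 2
    peak = 0.0
    total = 0.0
    for base in "ACGT":
        amplitude = 0
        count = 0
        for i, ch in enumerate(window):
            if ch == base:
                amplitude += phases[i % 3]
                count += 1
        peak += abs(amplitude) ** 2
        # Parseval: power summed over all n frequencies is n * count;
        # subtracting the zero-frequency term count**2 leaves the rest.
        total += n * count - count * count
    mean = total / (n - 1) if n > 1 else 0.0
    return peak / mean if mean > 0 else 0.0


def scan(sequence: str, window: int = 99, step: int = 3):
    """Return (start, ratio) pairs for each window position.

    The default window and step are arbitrary illustrative values.
    """
    seq = sequence.upper()
    return [
        (start, peak_to_mean_ratio(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

A perfectly three-periodic toy sequence such as `"ATG" * 20` gives a large ratio, while a two-periodic one such as `"AT" * 30` gives a ratio near zero. Real coding regions fall in between, and any decision threshold would have to be chosen empirically.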