IDENTIFICATION OF A SET OF FREQUENT DECANUCLEOTIDES IN PLANTS AND IN ANIMALS

Citation
C. Scapoli et al., IDENTIFICATION OF A SET OF FREQUENT DECANUCLEOTIDES IN PLANTS AND IN ANIMALS, Computer applications in the biosciences, 10(5), 1994, pp. 465-470
Number of citations
17
Subject Categories
Mathematical Methods, Biology & Medicine; Computer Sciences, Special Topics; Computer Science Interdisciplinary Applications; Biology Miscellaneous
ISSN journal
0266-7061
Volume
10
Issue
5
Year of publication
1994
Pages
465 - 470
Database
ISI
SICI code
0266-7061(1994)10:5<465:IOASOF>2.0.ZU;2-S
Abstract
We studied the frequency distribution of 1 048 576 oligonucleotides 10 bp long in a sample of 1.961 Mbase of genes from plants, made up of 635 sequences extracted from GenBank 71.0, with the aim of detecting transcription control signals. Among all decamers, 3255, or 0.3%, had a frequency 10 times higher than the mean and were subjected to further statistical analysis. For each of the 3255 decamers (parents), we counted the individual frequencies of the 30 decamers (progeny) differing from the parent by one base mutation, and calculated two variance/mean chi-squares for the progeny, with and without the parent decamer. By studying the distribution of the ratio between the two chi-squares we observed that, out of the 3255 decamers >10 times more frequent than average, 432 had a chi-square ratio >1.9. In this residual set, which corresponds to <0.04 per cent of all possible decamers, only 15 known eukaryotic transcription control elements were found; on the other hand, it included 29 decanucleotides that matched decanucleotides of a set from Drosophila, 24 from a set from mammals, 13 from a set from yeast and four from a set from viruses, all sets having been identified with the statistical procedures described here. These decanucleotides are highly repetitive and seem to be present throughout all higher organisms, whereas they are uncommon in mammalian viruses.
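
The screening procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' program. The function names (`count_kmers`, `frequent_kmers`, `progeny`, `dispersion_chi2`, `chi2_ratio`) are hypothetical, and several details are assumptions that the abstract does not state: the mean is taken over all 4^10 possible decamers, the variance/mean chi-square is computed as the sum of squared deviations divided by the mean, the ratio is the chi-square with the parent divided by the chi-square without it, and windows containing bases other than A, C, G, T are skipped.

```python
from collections import Counter

BASES = "ACGT"


def count_kmers(sequences, k=10):
    """Count every overlapping k-mer in the sequences (assumed: skip ambiguous bases)."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set(BASES):
                counts[kmer] += 1
    return counts


def frequent_kmers(counts, k=10, fold=10):
    """Return k-mers whose count exceeds `fold` times the mean count.

    Assumption: the mean is taken over all 4**k possible k-mers.
    """
    mean = sum(counts.values()) / 4 ** k
    return [kmer for kmer, c in counts.items() if c > fold * mean]


def progeny(parent):
    """Return the 3 * len(parent) k-mers differing from the parent by one base."""
    return [
        parent[:i] + b + parent[i + 1:]
        for i, base in enumerate(parent)
        for b in BASES
        if b != base
    ]


def dispersion_chi2(values):
    """Variance/mean chi-square: sum((x - mean)**2) / mean (assumed form)."""
    mean = sum(values) / len(values)
    if mean == 0:
        return 0.0
    return sum((v - mean) ** 2 for v in values) / mean


def chi2_ratio(parent, counts):
    """Chi-square with the parent divided by chi-square without it (assumed direction)."""
    prog_counts = [counts[p] for p in progeny(parent)]
    without_parent = dispersion_chi2(prog_counts)
    with_parent = dispersion_chi2(prog_counts + [counts[parent]])
    if without_parent == 0:
        return float("inf") if with_parent > 0 else 0.0
    return with_parent / without_parent
```

Under these assumptions, a decamer would be retained when it appears in `frequent_kmers(counts)` and `chi2_ratio(decamer, counts)` exceeds 1.9, the threshold given in the abstract. A large ratio indicates that the parent stands out sharply from its one-mutation neighbours.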