A SET OF ALU-FREE FREQUENT DECAMERS FROM MAMMALIAN GENOMES ENRICHED IN TRANSCRIPTION FACTOR SIGNALS

Citation
R. Gambari et al., A SET OF ALU-FREE FREQUENT DECAMERS FROM MAMMALIAN GENOMES ENRICHED IN TRANSCRIPTION FACTOR SIGNALS, Computer applications in the biosciences, 10(5), 1994, pp. 501-508
Number of citations
13
Subject Categories
Mathematical Methods, Biology & Medicine; Computer Sciences, Special Topics; Computer Science Interdisciplinary Applications; Biology Miscellaneous
ISSN journal
0266-7061
Volume
10
Issue
5
Year of publication
1994
Pages
501 - 508
Database
ISI
SICI code
0266-7061(1994)10:5<501:ASOAFD>2.0.ZU;2-Q
Abstract
We have recently reported that the statistical analysis of the frequency distribution of short oligonucleotides within mammalian and viral genomes allows the production of sets of DNA sequences enriched in signals for transcription factors. Such statistical approaches could facilitate the identification of new promoter regions playing a role in the transcriptional regulation of gene expression. In the case of mammalian oligonucleotides, we found that the published set of frequent decamers enriched in transcriptional motifs is not suitable for studies on genes of Homo sapiens and evolutionarily related genomes, because it contains decameric sequences belonging to genomic repeats. We report here that most of the decameric sequences of DNA repeats belong to Alu repeats. Accordingly, we produced a subset of Alu-free frequent decamers. In addition, we eliminated from the subset of Alu-free frequent decamers those that are frequently present within other common human repeats, including (GT)(n), (AT)(n), (CA)(n), (ATT)(n), (CAA)(n), and (GTT)(n). The Alu-free (repeats-free) subset of frequent mammalian decamers is enriched in signals for transcription factors and allows the identification of putative signals in genes, such as those coding for plasminogen activator, adenosine deaminase and p53, that contain a large number of Alu-like repeats interspersed within their genomic sequences. The newly generated compilation of frequent decamers described here might be used to locate genomic regions playing functional roles in the expression of genes of Homo sapiens and related primates.
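The filtering idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names, the `min_count` threshold, and the rule "drop any decamer that occurs in a supplied repeat sequence on either strand" are assumptions made for illustration, and the abstract does not state the actual frequency cutoffs or the Alu sequences used. Only the simple-repeat units (GT, AT, CA, ATT, CAA, GTT) are taken from the text.

```python
from collections import Counter

# Simple-repeat units named in the abstract.
SIMPLE_REPEAT_UNITS = ["GT", "AT", "CA", "ATT", "CAA", "GTT"]

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq, k=10):
    """Yield every overlapping k-mer of seq (k=10 gives decamers)."""
    seq = seq.upper()
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def frequent_kmers(sequences, k=10, min_count=2):
    """k-mers seen at least min_count times (hypothetical threshold)."""
    counts = Counter()
    for s in sequences:
        counts.update(kmers(s, k))
    return {w for w, c in counts.items() if c >= min_count}


def repeat_kmers(repeat_seqs, k=10):
    """All k-mers found in the given repeat sequences, on both strands."""
    masked = set()
    for r in repeat_seqs:
        r = r.upper()
        masked.update(kmers(r, k))
        masked.update(kmers(reverse_complement(r), k))
    return masked


def simple_repeat_kmers(units=SIMPLE_REPEAT_UNITS, k=10):
    """k-mers drawn from pure tandem tracts of each unit, in every phase."""
    tracts = [u * (k // len(u) + 2) for u in units]  # long enough for all phases
    return repeat_kmers(tracts, k)


def repeat_free_frequent_kmers(sequences, repeat_seqs, k=10, min_count=2):
    """Frequent k-mers minus those found in repeat_seqs or in simple repeats."""
    keep = frequent_kmers(sequences, k, min_count)
    return keep - repeat_kmers(repeat_seqs, k) - simple_repeat_kmers(k=k)
```

In practice `repeat_seqs` would hold Alu consensus sequences obtained from a repeat database. Note that the abstract removes decamers that are "frequently present" in repeats, a frequency-based criterion that this sketch simplifies to plain membership.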