R. Gambari et al., A SET OF ALU-FREE FREQUENT DECAMERS FROM MAMMALIAN GENOMES ENRICHED IN TRANSCRIPTION FACTOR SIGNALS, Computer applications in the biosciences, 10(5), 1994, pp. 501-508
We have recently reported that statistical analysis of the frequency distribution of short oligonucleotides within mammalian and viral genomes allows the production of sets of DNA sequences enriched in signals for transcription factors. Such statistical approaches could facilitate the identification of new promoter regions that play a role in the transcriptional regulation of gene expression. In the case of mammalian oligonucleotides, we found that the published set of frequent decamers enriched in transcriptional motifs is not suitable for studies on genes of Homo sapiens and evolutionarily related genomes, because it contains decameric sequences belonging to genomic repeats. We report here that most of the decameric sequences derived from DNA repeats belong to Alu repeats. Accordingly, we produced a subset of Alu-free frequent decamers. In addition, we eliminated from this subset those decamers that are frequently present within other common human repeats, including (GT)n, (AT)n, (CA)n, (ATT)n, (CAA)n, and (GTT)n. The resulting Alu-free (repeat-free) subset of frequent mammalian decamers is enriched in signals for transcription factors and allows the identification of putative signals in genes, such as those coding for plasminogen activator, adenosine deaminase and p53, that contain a large number of Alu-like repeats interspersed within their genomic sequences. The newly generated compilation of frequent decamers described here might be used to locate genomic regions playing functional roles in the expression of genes of Homo sapiens and related primates.
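
The filtering idea summarized above (count decamers, keep the frequent ones, then discard those found in Alu or simple-sequence repeats) can be illustrated with a minimal Python sketch. It is not the authors' program: the function names, the `min_count` threshold, the handling of both strands, and the use of exact matching against user-supplied repeat sequences are all assumptions made for illustration, and the abstract does not state how "frequent" was defined.

```python
from collections import Counter

K = 10  # decamer length


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def kmers_of(seq, k=K):
    """Set of all k-mers occurring in a single sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def count_kmers(sequences, k=K):
    """Count every unambiguous k-mer across a collection of sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):  # skip windows containing N etc.
                counts[kmer] += 1
    return counts


def simple_repeat_kmers(units, k=K):
    """k-mers produced by tandem arrays of each unit, on both strands."""
    masked = set()
    for unit in units:
        # Long enough to contain every phase of the tandem repeat.
        tandem = unit.upper() * (k // len(unit) + 2)
        masked |= kmers_of(tandem, k) | kmers_of(revcomp(tandem), k)
    return masked


def repeat_free_frequent_kmers(sequences, repeat_seqs, units, min_count, k=K):
    """Frequent k-mers that occur in none of the supplied repeats.

    repeat_seqs: hypothetical interspersed-repeat sequences (e.g. an Alu
    consensus supplied by the user); units: simple-repeat units.
    """
    masked = simple_repeat_kmers(units, k)
    for rep in repeat_seqs:
        rep = rep.upper()
        masked |= kmers_of(rep, k) | kmers_of(revcomp(rep), k)
    counts = count_kmers(sequences, k)
    return {kmer: n for kmer, n in counts.items()
            if n >= min_count and kmer not in masked}


if __name__ == "__main__":
    # Toy input, invented for this example only.
    seqs = ["TTACGTTGCAACTT", "GGACGTTGCAACGG", "GTGTGTGTGTGTGT"]
    units = ["GT", "AT", "CA", "ATT", "CAA", "GTT"]
    print(repeat_free_frequent_kmers(seqs, [], units, min_count=2))
```

In this toy input the (GT)n decamers are frequent but are removed by the simple-repeat mask, while the decamer shared by the first two sequences is kept. A real analysis would supply actual repeat consensus sequences and would probably need to tolerate mismatches, which this exact-match sketch does not.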