J. Jurka and C. Pethiyagoda, Simple repetitive DNA sequences from primates: compilation and analysis, Journal of Molecular Evolution, 40(2), 1995, pp. 120-126
Simple repeats composed of tandemly repeated units 1-6 nucleotides (nt)
long have been extracted from a selected set of primate genomic DNA
sequences. Of the 501 theoretically possible different types of
repeats, only 67 were present in the analyzed database in at least two
different size ranges over 12 nt. They include all simple repeats known
to be polymorphic in the primate genome. A list of moderately expanding
and nonexpanding oligonucleotide patterns has also been included.
Furthermore, we have compiled statistical data with emphasis on the
overall variability of the 67 most abundant types of repeats. We have
demonstrated that the expandability of at least some simple repeats may
be affected by the overall base composition and by flanking sequences.
In particular, the occurrence of tandemly repeated CAG and GCC triplets
in exons positively correlates with their G+C content. We also noted
that, in the vicinity of Alu sequences, tetrameric repeats are more
abundant than in the total genomic DNA. This paper can be used as a
comprehensive guide to the identification of the most abundant and
potentially polymorphic simple repeats. It is also of broader
significance as a step toward understanding the contribution of
flanking sequences and the overall sequence composition to the
variability of simple repeats.
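
The abstract does not say how the figure of 501 possible repeat types is obtained. One reading, assumed here rather than taken from the paper, is that two repeat units count as the same type when one is a cyclic rotation of the other or of its reverse complement (so CAG, AGC and CTG are one type), and that units which are themselves repeats of a shorter unit (such as ACAC) are excluded. The sketch below enumerates units of 1-6 nt under those assumptions; the function names are hypothetical and chosen only for illustration.

```python
from itertools import product

# Translation table for complementing a DNA string (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_primitive(unit):
    """True if `unit` is not a tandem repeat of a shorter unit (e.g. ACAC)."""
    n = len(unit)
    return not any(
        n % d == 0 and unit == unit[:d] * (n // d)
        for d in range(1, n)
    )


def canonical(unit):
    """Pick one representative for a repeat type.

    Assumption: a type is the set of all rotations of the unit and of its
    reverse complement; the alphabetically smallest member represents it.
    """
    candidates = []
    for s in (unit, reverse_complement(unit)):
        for i in range(len(s)):
            candidates.append(s[i:] + s[:i])
    return min(candidates)


def repeat_types(max_unit=6):
    """Map each unit length 1..max_unit to its set of canonical types."""
    types = {}
    for k in range(1, max_unit + 1):
        types[k] = {
            canonical("".join(p))
            for p in product("ACGT", repeat=k)
            if is_primitive("".join(p))
        }
    return types


types = repeat_types()
counts = {k: len(v) for k, v in types.items()}
print(counts)                 # {1: 2, 2: 4, 3: 10, 4: 33, 5: 102, 6: 350}
print(sum(counts.values()))   # 501
```

Under these assumptions the per-length counts sum to 501, which is consistent with the number quoted in the abstract, although the authors' exact definition may differ in detail.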