B.C. Meyers et al., Abundance, distribution, and transcriptional activity of repetitive elements in the maize genome, GENOME RES, 11(10), 2001, pp. 1660-1676
Long terminal repeat (LTR) retrotransposons have been shown to make up much
of the maize genome. Although these elements are known to be prevalent in
plant genomes of a middle-to-large size, little information is available on
the relative proportions composed by specific families of elements in a
single genome. We sequenced a library of randomly sheared genomic DNA from
maize to characterize this genome. BLAST analysis of these sequences
demonstrated that the maize genome is composed of diverse sequences that
represent
numerous families of retrotransposons. The largest families contain the
previously described elements Huck, Ji, and Opie. Approximately 5% of the
sequences are predicted to encode proteins. The genomic abundance of 16
families of elements was estimated by hybridization to an array of 10,752
maize bacterial artificial chromosome (BAC) clones. Comparisons of the
number of elements present on individual BACs indicated that
retrotransposons are in general randomly distributed across the maize
genome. A second library was constructed that was selected to contain
sequences hypomethylated in the maize genome. Sequence analysis of this
library indicated that retroelements abundant in the genome are poorly
represented in hypomethylated regions. Fifty-six retroelement sequences
corresponding to the integrase and reverse transcriptase domains were
isolated from approximately 407,000 maize expressed sequence tags (ESTs).
Phylogenetic analysis of these and the genomic retroelement sequences
indicated that elements most abundant in the genome are less
abundant at the transcript level than are more rare retrotransposons.
Additional phylogenies also demonstrated that rice and maize retrotransposon
families are frequently more closely related to each other than to families
within the same species. An analysis of the GC content of the maize genomic
library and that of maize ESTs did not support recently published data that
the gene space in maize is found within a narrow GC range, but does
indicate that genic sequences have a higher GC content than intergenic
sequences
(52% vs. 47% GC).