Arabidopsis thaliana has a relatively small genome of approximately 130 Mb
containing about 10% repetitive DNA. Genome sequencing studies reveal a
gene-rich genome, predicted to contain approximately 25 000 genes spaced on
average every 4.5 kb. Between 10 and 20% of the predicted genes occur as
clusters of related genes, indicating that local sequence duplication and
subsequent divergence generate a significant proportion of gene families.
In addition to gene families, repetitive sequences comprise individual and small
clusters of two to three retroelements and other classes of smaller repeats.
The clustering of highly repetitive elements is a striking feature of the
A. thaliana genome emerging from sequence and other analyses. (C) 2000
Elsevier Science B.V. All rights reserved.