The Arabidopsis thaliana genome contains at least 29 active genes encoding SET domain proteins that can be assigned to four evolutionarily conserved classes

Citation
L.O. Baumbusch et al., The Arabidopsis thaliana genome contains at least 29 active genes encoding SET domain proteins that can be assigned to four evolutionarily conserved classes, NUCL ACID R, 29(21), 2001, pp. 4319-4333
Number of citations
62
Subject Categories
Biochemistry & Biophysics
Journal title
NUCLEIC ACIDS RESEARCH
ISSN journal
0305-1048
Volume
29
Issue
21
Year of publication
2001
Pages
4319 - 4333
Database
ISI
SICI code
0305-1048(20011101)29:21<4319:TATGCA>2.0.ZU;2-1
Abstract
SET domains are conserved amino acid motifs present in chromosomal proteins that function in epigenetic control of gene expression. These proteins can be divided into four classes as typified by their Drosophila members E(Z), TRX, ASH1 and SU(VAR)3-9. Homologs of all four classes have been identified in yeast and mammals, but not in plants. A BLASTP screening of the Arabidopsis genome identified 37 genes: three E(z) homologs, five trx homologs, four ash1 homologs and 15 genes similar to Su(var)3-9. Seven genes were assigned as trx-related and three as ash1-related. Only four genes have been described previously. Our classification is based on the characteristics of the SET domains, cysteine-rich regions and additional conserved domains, including a novel YGD domain. RT-PCR analysis, cDNA cloning and matching ESTs show that at least 29 of the genes are active in diverse tissues. The high number of SET domain genes, possibly involved in epigenetic control of gene activity during plant development, can partly be explained by extensive genome duplication in Arabidopsis. Additionally, the lack of introns in the coding region of eight SU(VAR)3-9 class genes indicates evolution of new genes by retrotransposition. The identification of putative nuclear localization signals and AT-hooks in many of the proteins supports an anticipated nuclear localization, which was demonstrated for selected proteins.