TRANSCRIPTION FACTOR-IID IN THE ARCHAEA - SEQUENCES IN THE THERMOCOCCUS CELER GENOME WOULD ENCODE A PRODUCT CLOSELY-RELATED TO THE TATA-BINDING PROTEIN OF EUKARYOTES
T.L. Marsh et al., TRANSCRIPTION FACTOR-IID IN THE ARCHAEA - SEQUENCES IN THE THERMOCOCCUS CELER GENOME WOULD ENCODE A PRODUCT CLOSELY-RELATED TO THE TATA-BINDING PROTEIN OF EUKARYOTES, Proceedings of the National Academy of Sciences of the United States of America, 91(10), 1994, pp. 4180-4184
The first step in transcription initiation in eukaryotes is mediated by the TATA-binding protein, a subunit of the transcription factor IID complex. We have cloned and sequenced the gene for a presumptive homolog of this eukaryotic protein from Thermococcus celer, a member of the Archaea (formerly archaebacteria). The protein encoded by the archaeal gene is a tandem repeat of a conserved domain, corresponding to the repeated domain in its eukaryotic counterparts. Molecular phylogenetic analyses of the two halves of the repeat are consistent with the duplication occurring before the divergence of the archaeal and eukaryotic domains. In conjunction with previous observations of similarity in RNA polymerase subunit composition and sequences and the finding of a transcription factor IIB-like sequence in Pyrococcus woesei (a relative of T. celer), it appears that major features of the eukaryotic transcription apparatus were well established before the origin of eukaryotic cellular organization. The divergence between the two halves of the archaeal protein is less than that between the halves of the individual eukaryotic sequences, indicating that the average rate of sequence change in the archaeal protein has been less than in its eukaryotic counterparts. To the extent that this lower rate applies to the genome as a whole, a clearer picture of the early genes (and gene families) that gave rise to present-day genomes is more apt to emerge from the study of sequences from the Archaea than from the corresponding sequences from eukaryotes.