S. Fukuchi et al., EVOLUTION OF GENETIC INFORMATION-FLOW FROM THE VIEWPOINT OF PROTEIN-SEQUENCE SIMILARITY, Journal of theoretical biology, 171(2), 1994, pp. 179-195
As a course of inquiry into the evolution of genetic information flow,
similarity relations of amino acid sequences between the proteins
involved in translation, transcription and replication are
investigated.
The sequence data of these proteins are mostly accumulated from
Escherichia coli, and the present investigation is carried out mainly
on this organism with the FASTP program. The result reveals an
interesting similarity linkage extending from ribosomal proteins to the
proteins participating in the translational elongation process and to
the proteins in
transcription and replication. Although the ribosomal proteins are
relatively short polypeptide chains, our systematic comparison between
these proteins finds many similarity relations, more than 100 in terms
of "overlap", reducing them to about 14 elementary ribosomal
proteins from which other ribosomal proteins would have diverged.
Moreover, the proteins involved in translation, transcription and
replication contain regions similar to the elementary ribosomal
proteins.
In particular, some initiation and elongation factors of the
translation process are found to be similar to the elementary ribosomal
proteins over almost their whole regions. The alpha and sigma(70)
subunits of RNA polymerase and primase also show similarity to one such
elongation factor, Tu, over wider regions than the individual ribosomal
proteins, and they are shown to be fundamental for the similarity
linkage extending to the other polypeptide chains involved in the
transcription and replication processes, although the latter
polypeptide chains contain regions not similar to any ribosomal
protein. This divergence pattern of
similarity relations strongly suggests that the proteins involved in
the contemporary genetic information flow DNA-->RNA-->protein have
evolved from some elementary ribosomal proteins, first by gene fusion
in a primitive organism of the RNA-protein world, and then by the
addition of the mechanism of domain shuffling from other genes in the
DNA-RNA-protein world.