S. Karlin and L. Brocchieri, HEAT-SHOCK-PROTEIN-70 FAMILY - MULTIPLE SEQUENCE COMPARISONS, FUNCTION, AND EVOLUTION, Journal of molecular evolution, 47(5), 1998, pp. 565-577
The heat shock protein 70 kDa sequences (HSP70) are of great importance as molecular chaperones in protein folding and transport. They are abundant under conditions of cellular stress. They are highly conserved in all domains of life: Archaea, eubacteria, eukaryotes, and organelles (mitochondria, chloroplasts). A multiple alignment of a large collection of these sequences was obtained employing our symmetric-iterative ITERALIGN program (Brocchieri and Karlin 1998). Assessments of conservation are interpreted in evolutionary terms and with respect to functional implications. Many archaeal sequences (methanogens and halophiles) tend to align best with the Gram-positive sequences. These two groups also lack a signature segment [about 25 amino acids (aa) long] present in all other HSP70 species (Gupta and Golding 1993). We observed a second signature sequence of about 4 aa absent from all eukaryotic homologues, significantly aligned in all prokaryotic sequences. Consensus sequences were developed for eight groups [Archaea, Gram-positive, proteobacterial Gram-negative, singular bacteria, mitochondria, plastids, eukaryotic endoplasmic reticulum (ER) isoforms, eukaryotic cytoplasmic isoforms]. All group consensus comparisons tend to summarize the alignments better than do the individual sequence comparisons. The global individual consensus "matches" 87% with the consensus of consensuses sequence. A functional analysis of the global consensus identifies a (new) highly significant mixed charge cluster proximal to the carboxyl terminus of the sequence, highlighting the hypercharge run EEDKKRRER (one-letter aa code used). The individual Archaea and Gram-positive sequences contain a corresponding significant mixed charge cluster in the location of the charge cluster of the consensus sequence. In contrast, the four Gram-negative proteobacterial sequences of the alignment do not have a charge cluster (even at the 5% significance level). All eukaryotic HSP70 sequences have the analogous charge cluster. Strikingly, several of the eukaryotic isoforms show multiple mixed charge clusters. These clusters were interpreted with supporting data related to HSP70 activity in facilitating chaperone, transport, and secretion function. We observed that the consensus contains only a single tryptophan residue and a single conserved cysteine. This is interpreted with respect to the target rule for disaggregating misfolded proteins. The mitochondrial HSP70 connections to bacterial HSP70 are analyzed, suggesting a polyphyletic split of Trypanosoma and Leishmania protist mitochondrial (Mt) homologues separated from Mt-animal/fungal/plant homologues. Moreover, the HSP70 sequences from the amitochondrial Entamoeba histolytica and Trichomonas vaginalis species were analyzed. The E. histolytica HSP70 is most similar to the higher eukaryotic cytoplasmic sequences, with significantly weaker alignments to ER sequences and much diminished matching to all eubacterial, mitochondrial, and chloroplast sequences. This appears to be at variance with the hypothesis that E. histolytica rather recently lost its mitochondrial organelle. T. vaginalis contains two HSP70 sequences, one Mt-like and the second similar to eukaryotic cytoplasmic sequences, suggesting two diverse origins.
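
The consensus matching and the charge content of the run EEDKKRRER described above can be illustrated with a minimal Python sketch. It rests on stated assumptions: the abstract does not say how the consensus sequences were built, so a simple majority rule per alignment column is used here; the function names and the toy aligned sequences are hypothetical and are not HSP70 data; and the charge count is only a residue tally (D, E acidic; K, R basic), not the statistical significance assessment applied to charge clusters in the paper.

```python
from collections import Counter

ACIDIC = set("DE")  # aspartate, glutamate
BASIC = set("KR")   # lysine, arginine


def column_consensus(aligned):
    """Majority-rule consensus of equal-length aligned sequences.

    Assumption: the most frequent symbol in each column wins; ties go
    to the symbol seen first in that column.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def match_fraction(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def charge_composition(segment):
    """Return (acidic, basic) residue counts for a one-letter aa segment."""
    acidic = sum(r in ACIDIC for r in segment)
    basic = sum(r in BASIC for r in segment)
    return acidic, basic


# Hypothetical toy alignment standing in for group consensus sequences.
toy_groups = ["MAKAIGIDLG", "MGKAIGIDLG", "MAKVIGIDLG"]
consensus_of_consensuses = column_consensus(toy_groups)
toy_match = match_fraction(consensus_of_consensuses, toy_groups[1])

# The hypercharge run quoted in the abstract.
acidic, basic = charge_composition("EEDKKRRER")
print(consensus_of_consensuses, toy_match, acidic, basic)
```

On the run quoted in the abstract, the tally gives four acidic and five basic residues, so all nine positions are charged and both signs are present, which is consistent with its description as a mixed charge run.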