S. N. Vinogradov et al., ADVENTITIOUS VARIABILITY - THE AMINO-ACID-SEQUENCES OF NONVERTEBRATE GLOBINS, Comparative biochemistry and physiology. B. Comparative biochemistry, 106(1), 1993, pp. 1-26
1. The more than 140 amino acid sequences of nonvertebrate hemoglobins (Hbs) and myoglobins (Mbs) that are known at present can be divided into several distinct groups: (1) single-chain globins, containing one heme-binding domain; (2) truncated, single-chain, one-domain globins; (3) chimeric, one-domain globins; (4) chimeric, two-domain globins; and (5) chimeric multi-domain globins.
2. The crystal structures of eight nonvertebrate Hbs and Mbs are known, all of them monomeric, one-domain globin chains. Although these molecules represent plants, prokaryotes and several metazoan groups, and although the inter-subunit interactions in the dimeric and tetrameric molecules differ from the ones observed in vertebrate Hbs, the secondary structures of all seven one-domain globins retain the characteristic vertebrate "myoglobin fold". No crystal structures of globins representing the other four groups have been determined.
3. Furthermore, a number of the one-, two- and multi-domain globin chains participate in a broad variety of quaternary
structures, ranging from homo- and heterodimers to highly complex, multisubunit aggregates with M(r) > 3000 kDa (S. N. Vinogradov, Comp. Biochem. Physiol. 82B, 1-15, 1985).
4. (1) The single-chain, single-domain globins are comparable in size to the vertebrate globins and exhibit the widest distribution. (A) Intracellular Hbs include: (i) the monomeric and polymeric Hbs of the polychaete Glycera; (ii) the tetrameric Hb of the echiuran Urechis; (iii) the dimeric Hbs of echinoderms such as Paracaudina and Caudina; and (iv) the dimeric and tetrameric Hbs
of molluscs, namely the bivalves Scapharca, Anadara, Barbatia and Calyptogena. (B) Extracellular Hbs include: (i) the multiple monomeric and dimeric Hbs of the larva of the insect Chironomus; (ii) the Hbs of nematodes such as Trichostrongylus and Caenorhabditis; (iii) the globin chains forming tetramers and dodecamers and comprising approximately 2/3 of the giant (approximately 3600 kDa), hexagonal bilayer (HBL) Hbs of annelids, e.g. the oligochaete Lumbricus and the polychaete Tylorrhynchus, and of the vestimentiferan Lamellibrachia; and (iv) the globin chains
comprising the ca 400 kDa Hbs of Lamellibrachia and the pogonophoran Oligobrachia. (C) Cytoplasmic Hbs include: (i) the Mbs of molluscs, namely the gastropods Aplysia, Bursatella, Cerithidea, Nassa and Dolabella and the chiton Liolophura; (ii) the three Hbs of the symbiont-harboring bivalve Lucina; (iii) the dimeric Hb of the bacterium Vitreoscilla; and (iv) plant Hbs, including the Hbs of symbiont-containing legumes (Lgbs),
the Hbs of symbiont-containing non-leguminous plants and the Hbs in the roots of symbiont-free plants.
5. (2) Truncated, single-chain, single-domain globins occur in: (i) the ciliated protozoa Paramecium and Tetrahymena, comprising 116 and 121 residues, respectively; (ii) the cyanobacterium Nostoc (118 residues); and (iii) the nemertean Cerebratulus (109 residues).
6. (3) Chimeric globins of ≥40 kDa include the cytoplasmic Hbs of: (i) bacteria such as E. coli, Rhizobium and Alcaligenes; and (ii) the yeasts Saccharomyces and
Candida. They have an N-terminal heme-binding domain attached to unrelated proteins with diverse functions and represent, according to Riggs, a previously unrecognized evolutionary pathway for hemoglobin. In the case of Rhizobium, the relationship of the heme-binding domain to other globins is tenuous. The cytoplasmic Mb of the archaeogastropod Sulculus has an internal heme-binding domain within a chain of 377 residues, whose sequence cannot be properly aligned with other globins. However, the overall primary structure has very substantial homology to human indoleamine 2,3-dioxygenase, suggesting that Sulculus Mb is a case of convergent evolution.
7. (4) Chimeric, approximately 40 kDa globins containing two covalently linked heme-binding domains comprise: (i) the extracellular, high-affinity, octameric (approximately 320 kDa) Hbs of parasitic nematodes such as Pseudoterranova and Ascaris; and (ii) the polymeric (ca 430 kDa) intracellular Hb of the clam Barbatia.
8. (5) Chimeric, linear, covalently linked multi-domain globin sequences are represented so far by the cDNA sequence of one of the two chains comprising the extracellular Hb (approximately 250 kDa) of a crustacean, the brine shrimp Artemia; it consists of a linear arrangement of nine heme-binding domains linked covalently by 10-20 residue sequences.
9. The giant, extracellular HBL Hbs of annelids and vestimentiferans appear to consist of large complexes of four chemically distinct, single-domain globins (ca 144 chains), linked together by at least ca 36 heme-deficient chains of 25-28 kDa. The known sequences of linker chains, two from Tylorrhynchus Hb and one each from Lumbricus and Lamellibrachia Hbs, cannot be properly aligned with the
known globin sequences. Furthermore, recent work by Suzuki and Riggs indicates that the gene of one of the Lumbricus linker chains is unrelated to globin genes.
10. In some cases nonvertebrate Hbs exhibit tissue and developmental stage specificity. In several instances, such as Chironomus, Glycera and Paramecium, the number of chemically distinct globin chains appears to be much greater than is usually observed among vertebrate Hbs. In the case of Chironomus, it appears that, despite the presence of normal upstream and downstream regulatory regions, only
a fraction of the large number (> 40) of putative globin genes is expressed at significant levels. Furthermore, although the majority of Chironomus globin genes are intronless, at least one group of its globin genes has introns.
11. The widespread, if episodic, occurrence of single-chain Hbs in very diverse groups of eukaryotes and prokaryotes suggests that the Hbs observed at present are likely to have descended from an ancient, monomeric, single-chain, single-domain globin, which existed prior to the time of divergence of prokaryotes and eukaryotes (1500-2000 Myr ago). This view is consonant with the possibility that globin genes may be ubiquitous though not always expressed (Riggs, Am. Zool. 31, 535-545, 1991; Vinogradov et al., Comp. Biochem. Physiol. 103B, 759-773, 1992).