The complete sequence of the Bombyx mori fibroin gene has been determined
by combining a shotgun sequencing strategy with physical map-based
sequencing procedures. It consists of two exons (67 and 15 750 bp,
respectively) and one intron (971 bp). The fibroin coding sequence presents
a spectacular organization, with a highly repetitive and G-rich (~45%)
core flanked by non-repetitive 5' and 3' ends. This repetitive core is
composed of alternating arrays of 12 repetitive and 11 amorphous domains.
The sequences of the amorphous domains are evolutionarily conserved, and
the repetitive domains differ from each other in length by a variety of
tandem repeats
of subdomains of ~208 bp, which are reminiscent of the repetitive
nucleosome organization. A typical subdomain consists of a cluster of
repetitive units, Ua, followed by a cluster of units, Ub (with a Ua:Ub
ratio of 2:1), flanked by conserved boundary elements at the 3' end.
Moreover, some repeats are also perfectly conserved at the peptide level,
indicating that the evolutionary pressure is not identical along the
sequence. A tentative model for the constitution and evolution of this
unusual gene is discussed.