The amino acid sequence of the heavy chain of Bombyx mori silk fibroin was
derived from the gene sequence. The 5,263-residue (391-kDa) polypeptide
chain comprises 12 low-complexity "crystalline" domains made up of Gly-X
repeats and covering 94% of the sequence; X is Ala in 65%, Ser in 23%, and
Tyr in 9% of the repeats. The remainder includes a nonrepetitive
151-residue header sequence, 11 nearly identical copies of a 43-residue
spacer sequence, and a 58-residue C-terminal sequence. The header sequence
is homologous to the N-terminal sequence of other fibroins with a
completely different crystalline region. In Bombyx mori, each crystalline
domain is made up of subdomains of approximately 70 residues, which in most
cases begin with repeats of the GAGAGS hexapeptide and terminate with the
GAAS tetrapeptide. Within the subdomains, the Gly-X alternation is strict,
which strongly supports the classic Pauling-Corey model, in which β-sheets
pack on each other in alternating layers of Gly/Gly and X/X contacts.
Fitting the actual sequence to that model, we propose that each subdomain
forms a β-strand and each crystalline domain a two-layered β-sandwich, and
we suggest that the β-sheets may be parallel, rather than antiparallel, as
has been assumed up to now. © 2001 Wiley-Liss, Inc.
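
The composition figures and the strict Gly-X alternation described above can be illustrated with a minimal sketch. It is not the authors' analysis: the function names `glyx_composition` and `is_strict_glyx` are hypothetical, and the input is a toy string assembled from the GAGAGS and GAAS motifs named in the abstract, not the real heavy-chain sequence.

```python
import re
from collections import Counter


def glyx_composition(seq):
    """Return the fraction of each residue X among Gly-X dipeptides.

    Dipeptides are taken as non-overlapping matches scanned left to right;
    X is any one-letter code other than G (an assumption of this sketch).
    """
    xs = re.findall(r"G([A-FH-Z])", seq.upper())
    counts = Counter(xs)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()} if total else {}


def is_strict_glyx(seq):
    """True if every odd-numbered residue (1st, 3rd, 5th, ...) is Gly."""
    return len(seq) > 0 and all(aa == "G" for aa in seq.upper()[::2])


# Toy sequence (assumption): three GAGAGS repeats closed by GAAS.
toy = "GAGAGS" * 3 + "GAAS"

if __name__ == "__main__":
    print(glyx_composition(toy))
    print(is_strict_glyx("GAGAGS" * 3))
    print(is_strict_glyx(toy))
```

In this toy case the GAGAGS repeats alone pass the alternation check, while appending GAAS breaks it, because the tetrapeptide places Ala where the alternation would require Gly. Applied to the real sequence, the same tally would be expected to approach the Ala, Ser, and Tyr proportions quoted in the abstract, although that is not verified here.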