U.T. Gerngross et al., SEQUENCING OF A CLOSTRIDIUM THERMOCELLUM GENE (CIPA) ENCODING THE CELLULOSOMAL S(L)-PROTEIN REVEALS AN UNUSUAL DEGREE OF INTERNAL HOMOLOGY, Molecular Microbiology, 8(2), 1993, pp. 325-334
It is known that two proteins of the cellulosomal complex of
Clostridium thermocellum (S(L) and S(s)) together degrade crystalline
cellulose. S(L) is a glycoprotein of 210 000 Da which enhances the
binding to cellulose and the activity of S(s), an endoglucanase of
83 000 Da. We have previously reported the cloning of a DNA fragment
encoding the N-terminal end of the S(L) protein using antibodies
raised against the native protein. A chromosomal walking approach
using an EcoRI and a BamHI-Sau3A gene library allowed us to isolate
the C-terminal end of the gene. Sequencing of both fragments revealed
the existence of a leader peptide, as has been found in cellulases of
the same organism. This leader sequence is followed by a stretch of
14 amino acids that is identical to the N-terminal amino acid sequence
of the native secreted protein. The open reading frame (ORF) of this
gene encodes a protein of 196 800 Da and is followed by a hairpin loop
that could be involved in transcription termination. Within the ORF,
we found nine internal repeated elements (IREs) of about 500
nucleotides each. Seven of these sequences displayed 98-100% homology
and were located adjacent to each other within the structural gene
without intervening regions. The remaining two, located on the
N-terminal end of the gene, showed a significantly lower homology.
Bearing in mind the inherent instability of reiterated regions, we
confirmed the authenticity of our clones by Southern blot analysis
using chromosomal C. thermocellum DNA and ruled out the possibility of
rearrangements during the cloning and sequencing process. The
sequenced gene is designated cipA and the encoded S(L) protein CipA.
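
The "98-100% homology" reported for the repeated elements is most naturally read as pairwise sequence identity. The following is a minimal sketch of how such a figure could be computed for pre-aligned repeats of equal length. It is not the authors' method: the function names and the short toy sequences are hypothetical and are not taken from the cipA sequence.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percentage of matching positions between two pre-aligned,
    equal-length sequences (case-insensitive)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def pairwise_identities(repeats: dict[str, str]) -> dict[tuple[str, str], float]:
    """Percent identity for every unordered pair of named repeats."""
    return {
        (n1, n2): percent_identity(repeats[n1], repeats[n2])
        for n1, n2 in combinations(repeats, 2)
    }


# Hypothetical 20-nt toy sequences standing in for ~500-nt repeats;
# repeat_c differs from the other two at a single position.
toy_repeats = {
    "repeat_a": "ATGGCTAAAGGTACCGTTGA",
    "repeat_b": "ATGGCTAAAGGTACCGTTGA",
    "repeat_c": "ATGGCTAAAGGTACCGTAGA",
}

if __name__ == "__main__":
    for pair, identity in pairwise_identities(toy_repeats).items():
        print(pair, f"{identity:.1f}%")
```

Real repeats would first need to be aligned, since insertions or deletions make a position-by-position comparison meaningless. The sketch assumes that step has already been done.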