The gene encoding Thermus filiformis (Tfi) DNA polymerase was cloned and its nucleotide sequence was determined. The primary structure of Tfi DNA polymerase was deduced from the nucleotide sequence. Tfi DNA polymerase consists of 833 amino acid residues, and its molecular mass was determined to be 93,890 Da. The deduced amino acid sequence showed high sequence homology to E. coli DNA polymerase I-like DNA polymerases: 78.5% homology to Taq DNA polymerase, 78.4% to Tca DNA polymerase, and 41.8% to E. coli DNA polymerase I. Extremely high sequence identity was observed in the region containing polymerase activity. The G+C content of the coding region of the Tfi DNA polymerase gene was 68.5%, higher than that of the chromosomal DNA (65%). The G+C contents at the first, second, and third positions of the codons used were 71.8%, 40.9%, and 92.7%, respectively. Codon usage in the Tfi DNA polymerase gene was heavily biased towards G or C in the third position. Rare codons with U or A as the third base were sometimes used to avoid GA(A/T)TC and TCGA sequences, which are the recognition sites for the restriction endonucleases TfiI and TaqI, respectively.
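
The position-specific G+C values and the restriction-site check described above can be illustrated with a minimal Python sketch. It is not the authors' analysis pipeline: the function names (gc_by_codon_position, find_sites) are hypothetical, and the short input sequence is an invented toy example rather than any part of the Tfi DNA polymerase gene. The sketch assumes an in-frame coding sequence whose length is a multiple of three.

```python
import re

def gc_by_codon_position(cds):
    """Return (overall G+C fraction, (G+C at codon positions 1, 2, 3))."""
    seq = cds.upper().replace("U", "T")  # accept RNA or DNA notation
    if len(seq) == 0 or len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a non-zero multiple of 3")

    def gc(s):
        # Fraction of bases in s that are G or C.
        return sum(base in "GC" for base in s) / len(s)

    # seq[i::3] collects every base at codon position i + 1.
    return gc(seq), tuple(gc(seq[i::3]) for i in range(3))

# Recognition sequences named in the text; (A/T) is written as a character class.
SITES = {"TfiI": "GA[AT]TC", "TaqI": "TCGA"}

def find_sites(seq):
    """Return 0-based start positions of each recognition site in seq."""
    seq = seq.upper().replace("U", "T")
    # A lookahead is used so that overlapping occurrences are also reported.
    return {
        name: [m.start() for m in re.finditer(f"(?=({pattern}))", seq)]
        for name, pattern in SITES.items()
    }

if __name__ == "__main__":
    toy = "ATGGCCGAATCGAAG"  # invented example, NOT taken from the Tfi gene
    overall, by_position = gc_by_codon_position(toy)
    print("overall G+C:", round(overall, 3))
    print("G+C by codon position:", by_position)
    print("restriction sites:", find_sites(toy))
```

Applied to a real coding sequence, the same two functions would report the per-position G+C fractions and any remaining GA(A/T)TC or TCGA sites in the gene.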