R. Aurora and G.D. Rose, SEEKING AN ANCIENT ENZYME IN METHANOCOCCUS-JANNASCHII USING ORF, A PROGRAM BASED ON PREDICTED SECONDARY STRUCTURE COMPARISONS, Proceedings of the National Academy of Sciences of the United States of America, 95(6), 1998, pp. 2818-2823
We have developed a simple procedure to identify protein homologs in genomic databases. The program, called ORF, is based on comparisons of predicted secondary structure. Protein structure is far better conserved than amino acid sequence, and structure-based methods have been effective in exploiting this fact to find homologs, even among proteins with scant sequence identity. ORF is a secondary structure-based method that operates solely on predictions from sequence and requires no experimentally determined information about the structure. The approach is illustrated by an example: Thymidylate synthase, a highly conserved enzyme essential to thymidine biosynthesis in both prokaryotes and eukaryotes, is thought to be used by Archaea, but a corresponding gene has yet to be identified. Here, a candidate thymidylate synthase is identified as a previously unassigned open reading frame from the genome of Methanococcus jannaschii, viz., MJ0757. Using primary structure information alone, the optimally aligned sequence identity between MJ0757 and Escherichia coli thymidylate synthase is 7%, well below the threshold of sensitivity for detection by sequence-based methods.
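
The idea of comparing strings of predicted secondary structure, and of reporting percent identity over an alignment, can be illustrated with a minimal Python sketch. This is not the ORF program and makes no claim about its scoring or alignment procedure, which the abstract does not describe. It assumes a three-letter secondary structure alphabet (H, E, C), a plain Needleman-Wunsch global alignment with arbitrary illustrative scores, and percent identity counted over gap-free aligned positions only (other conventions for the denominator exist). The function names `align` and `percent_identity` are hypothetical.

```python
def align(a, b, match=1, mismatch=-1, gap=-1):
    """Global (Needleman-Wunsch) alignment of two strings.

    Scores are illustrative assumptions, not values from the paper.
    Returns the two gapped strings, with '-' marking a gap.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append('-'); i -= 1
        else:
            out_a.append('-'); out_b.append(b[j - 1]); j -= 1
    return ''.join(reversed(out_a)), ''.join(reversed(out_b))


def percent_identity(x, y):
    """Percent of gap-free aligned positions at which x and y agree."""
    pairs = [(p, q) for p, q in zip(x, y) if p != '-' and q != '-']
    if not pairs:
        return 0.0
    return 100.0 * sum(p == q for p, q in pairs) / len(pairs)


# Made-up secondary structure strings (H = helix, E = strand, C = coil).
ss_query, ss_target = align("HHHCEEE", "HHHEEE")
print(ss_query, ss_target, percent_identity(ss_query, ss_target))
```

The same two functions apply unchanged to amino acid strings, which is how a figure such as the 7% sequence identity quoted above would be computed once an alignment is fixed.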