J.R. Cort et al., A phylogenetic approach to target selection for structural genomics: solution structure of YciH, NUCL ACID R, 27(20), 1999, pp. 4018-4027
Structural genomics presents an enormous challenge with up to 100 000
protein targets in the human genome alone. At current rates of structure
determination, judicious selection of targets is necessary. Here, a phylogenetic
approach to target selection is described which makes use of the National
Center for Biotechnology Information database of Clusters of Orthologous
Groups (COGs). The strategy is designed so that each new protein structure is
likely to provide novel sequence-fold information. To demonstrate this
approach, the NMR solution structure of YciH (COG0023), a putative translation
initiation factor from Escherichia coli, has been determined and its fold
classified. YciH is an ortholog of eIF-1/SUI1, an integral component of the
translation initiation complex in eukaryotes. The structure consists of two
antiparallel alpha-helices packed against the same side of a five-stranded
beta-sheet. The first 31 residues of the 11.5 kDa protein are unstructured
in solution. Comparative analysis indicates that the folded portion of
YciH resembles a number of structures with the alpha-beta plait topology,
though its sequence is not homologous to any of them. Thus, the phylogenetic
approach to target selection described here was used successfully to identify
a new homologous superfamily within this topology.
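
The abstract does not spell out the selection procedure, so the following is only an illustrative sketch of how a COG-based filter might look. It assumes, hypothetically, that each COG is available as a list of member proteins annotated with a species and a flag for whether a structure is already known, and that candidate COGs are ranked by the number of distinct species they span. The Protein class, the select_targets function, the ranking rule, and all identifiers in the toy data are assumptions for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protein:
    name: str
    species: str
    has_known_structure: bool  # assumed annotation, not from the paper


def select_targets(cogs):
    """Return ids of COGs with no structurally characterized member.

    Candidates are ordered by phylogenetic breadth (distinct species),
    broadest first; ties are broken by COG id. The ranking rule is an
    assumption made for this sketch.
    """
    candidates = []
    for cog_id, members in cogs.items():
        # Skip any COG that already has a structural representative.
        if any(p.has_known_structure for p in members):
            continue
        breadth = len({p.species for p in members})
        candidates.append((breadth, cog_id))
    candidates.sort(key=lambda item: (-item[0], item[1]))
    return [cog_id for _, cog_id in candidates]


# Toy, made-up data: identifiers are placeholders, not real COG entries.
toy_cogs = {
    "COG_A": [
        Protein("a1", "species1", False),
        Protein("a2", "species2", False),
        Protein("a3", "species3", False),
    ],
    "COG_B": [
        Protein("b1", "species1", True),
        Protein("b2", "species2", False),
    ],
    "COG_C": [
        Protein("c1", "species1", False),
        Protein("c2", "species1", False),
    ],
}

print(select_targets(toy_cogs))
```

In this toy input, COG_B is excluded because one member already has a known structure, and COG_A ranks ahead of COG_C because it spans more species.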