IDENTIFICATION AND CLASSIFICATION OF PROTEIN FOLD FAMILIES

Authors

ORENGO CA; FLORES TP; TAYLOR WR; THORNTON JM

Citation

C.A. Orengo et al., IDENTIFICATION AND CLASSIFICATION OF PROTEIN FOLD FAMILIES, Protein engineering, 6(5), 1993, pp. 485-500

Subject categories

Biology

Journal title

Protein engineering

ISSN journal

02692139

Volume

6

Issue

5

Year of publication

1993

Pages

485 - 500

Database

ISI

SICI code

0269-2139(1993)6:5<485:IACOPF>2.0.ZU;2-B

Abstract

We have developed a method for identifying fold families in the protein structure data bank. Pairwise sequence alignments are first performed to extract families of homologous proteins having 35% or more sequence identity. Representatives are selected with the best resolution and R-factor to give a nonhomologous data set. Subsequent structure comparisons between all members of this set detect homologous folds with low sequence identity but highly conserved structures. By softening the requirement on structural similarity, families of analogous proteins are obtained that have related folds but more diverse structures. Representatives are selected to give a non-analogous data set. Starting with 1410 chains from the Brookhaven Data Bank, we generate a set of 150 nonhomologous folds and a set of 112 non-analogous folds. Analysis of sequence and structure conservation within the larger families shows the globins to be the most highly conserved family and the TIM barrels the most weakly conserved.
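
The first stage described in the abstract (grouping chains at 35% or more sequence identity, then keeping one representative per family by resolution and R-factor) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not state the clustering rule, so single-linkage grouping is assumed here, as is the tie-breaking order (resolution first, then R-factor). The `Chain` class, both function names, and the toy identity values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chain:
    name: str
    resolution: float  # in angstroms; lower is better
    r_factor: float    # lower is better


def cluster_families(chains, identity, threshold=35.0):
    """Group chains by single linkage (an assumption): two chains share a
    family if a path of pairwise identities >= threshold connects them."""
    parent = {c.name: c.name for c in chains}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), pct in identity.items():
        if pct >= threshold:
            parent[find(a)] = find(b)

    families = {}
    for c in chains:
        families.setdefault(find(c.name), []).append(c)
    return list(families.values())


def pick_representatives(families):
    """Keep one chain per family: best resolution, then best R-factor."""
    return [min(f, key=lambda c: (c.resolution, c.r_factor)) for f in families]


# Toy data, invented for illustration only.
chains = [
    Chain("A", 1.8, 0.18),
    Chain("B", 2.5, 0.20),
    Chain("C", 2.0, 0.19),
    Chain("D", 1.5, 0.17),
]
identity = {
    ("A", "B"): 60.0,
    ("B", "C"): 40.0,
    ("A", "C"): 20.0,
    ("C", "D"): 10.0,
}

families = cluster_families(chains, identity)
representatives = pick_representatives(families)
print(sorted(c.name for c in representatives))
```

In this toy input, A and C fall in the same family through B even though their direct identity is below the threshold. That is a consequence of the assumed single-linkage rule; a stricter rule would split them.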