S. Jones et al., DOMAIN ASSIGNMENT FOR PROTEIN STRUCTURES USING A CONSENSUS APPROACH - CHARACTERIZATION AND ANALYSIS, Protein Science, 7(2), 1998, pp. 233-242
A consensus approach for the assignment of structural domains in
proteins is presented. The approach combines a number of previously
published algorithms, and takes advantage of the elevated accuracy
obtained when assignments from the individual algorithms are in
agreement. The consensus approach is tested on a data set of 55
protein chains, for which domain assignments from four automated
methods were known, and for which crystallographers' assignments had
been reported in the literature. Accuracy was found to increase in
this test from 72% using individual algorithms to 100% when all four
methods were in agreement. However, a consensus prediction using all
four methods was only possible for 52% of the data set. The consensus
approach, using three publicly available domain assignment algorithms
(PUU, DETECTIVE, DOMAK), was then used to make domain assignments for
a data set of 787 protein chains from the Protein Data Bank. Analysis
of the assignments showed 55.7% of assignments could be made
automatically, and of these, 13.5% were multi-domain proteins. Of the
remaining 44.3% that could not be assigned by the consensus procedure,
90.4% had their domain boundaries assigned correctly by at least one
of the algorithms. Once identified, these domains were analyzed for
trends in their size and secondary structure class. In addition, the
discontinuity of each domain along the protein chain was considered.
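
The idea of accepting an assignment only when the individual algorithms agree can be illustrated with a minimal Python sketch. The abstract does not state the agreement criterion, so everything specific here is an assumption made for illustration: each assignment is reduced to a list of domain boundary positions, two assignments "agree" when they have the same number of boundaries and corresponding boundaries lie within a hypothetical `tolerance` of residues, and the consensus boundary is taken as the integer mean. The function names `agree` and `consensus` are likewise hypothetical, and discontinuous domains are not modelled.

```python
from itertools import combinations
from typing import Dict, List, Optional

# Assumed representation: the sorted residue positions at which one domain
# ends and the next begins. An empty list means a single-domain chain.
Boundaries = List[int]


def agree(a: Boundaries, b: Boundaries, tolerance: int) -> bool:
    """Return True if both assignments have the same number of boundaries
    and corresponding boundaries differ by at most `tolerance` residues."""
    if len(a) != len(b):
        return False
    return all(abs(x - y) <= tolerance for x, y in zip(sorted(a), sorted(b)))


def consensus(assignments: Dict[str, Boundaries],
              tolerance: int = 10) -> Optional[Boundaries]:
    """Return consensus boundaries if every pair of methods agrees,
    otherwise None (the chain is left for manual inspection)."""
    results = [sorted(b) for b in assignments.values()]
    if not results:
        return None
    # Require agreement between every pair of methods, not just with the first.
    for a, b in combinations(results, 2):
        if not agree(a, b, tolerance):
            return None
    # Consensus boundary: integer mean of the corresponding boundaries.
    return [sum(column) // len(column) for column in zip(*results)]


if __name__ == "__main__":
    # Illustrative, made-up boundary positions; not data from the paper.
    print(consensus({"PUU": [120], "DETECTIVE": [124], "DOMAK": [118]}))  # [120]
    print(consensus({"PUU": [120], "DETECTIVE": [], "DOMAK": [118]}))     # None
```

In this sketch a chain for which `consensus` returns `None` corresponds to the cases described above that could not be assigned automatically; the tolerance value of 10 residues is arbitrary and would need to be chosen against a reference set.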