Evaluation and improvement of multiple sequence methods for protein secondary structure prediction

Authors

Cuff, JA; Barton, GJ

Citation

J.A. Cuff and G.J. Barton, Evaluation and improvement of multiple sequence methods for protein secondary structure prediction, PROTEINS, 34(4), 1999, pp. 508-519

Subject categories

Biochemistry & Biophysics

Journal title

PROTEINS-STRUCTURE FUNCTION AND GENETICS

ISSN journal

0887-3585

Volume

34

Issue

4

Year of publication

1999

Pages

508 - 519

Database

ISI

SICI code

0887-3585(19990301)34:4<508:EAIOMS>2.0.ZU;2-U

Abstract

A new dataset of 396 protein domains is developed and used to evaluate the performance of the protein secondary structure prediction algorithms DSC, PHD, NNSSP, and PREDATOR. The maximum theoretical Q3 accuracy for combination of these methods is shown to be 78%. A simple consensus prediction on the 396 domains, with automatically generated multiple sequence alignments, gives an average Q3 prediction accuracy of 72.9%. This is a 1% improvement over PHD, which was the best single method evaluated. Segment Overlap Accuracy (SOV) is 75.4% for the consensus method on the 396-protein set. The secondary structure definition method DSSP defines 8 states, but these are reduced by most authors to 3 for prediction. Application of the different published 8- to 3-state reduction methods shows variation of over 3% on apparent prediction accuracy. This suggests that care should be taken to compare methods by the same reduction method. Two new sequence datasets (CB513 and CB251) are derived which are suitable for cross-validation of secondary structure prediction methods without artifacts due to internal homology. A fully automatic World Wide Web service that predicts protein secondary structure by a combination of methods is available via http://barton.ebi.ac.uk/. Proteins 1999;34:508-519. (C) 1999 Wiley-Liss, Inc.
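
The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The two reduction mappings are commonly used 8- to 3-state conventions chosen here as assumptions, the function names (reduce_states, q3, consensus) are hypothetical, and the tie-breaking rule in the vote is an arbitrary choice, since the abstract does not describe one.

```python
from collections import Counter

# Two illustrative 8- to 3-state reductions of DSSP codes.
# Assumption: these are common conventions, not necessarily those compared in the paper.
REDUCTION_A = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}  # all other codes -> C
REDUCTION_B = {"H": "H", "E": "E"}                                # all other codes -> C


def reduce_states(dssp, mapping):
    """Map an 8-state DSSP string to the 3 states H (helix), E (strand), C (coil)."""
    return "".join(mapping.get(state, "C") for state in dssp)


def q3(predicted, observed):
    """Percentage of residues whose predicted 3-state label equals the observed one."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


def consensus(predictions):
    """Per-residue majority vote over several 3-state predictions.

    Ties go to the tied state that appears first in method order
    (an arbitrary rule assumed for this sketch).
    """
    result = []
    for column in zip(*predictions):
        counts = Counter(column)
        top = max(counts.values())
        result.append(next(s for s in column if counts[s] == top))
    return "".join(result)


# Hypothetical 12-residue example: the same prediction scores differently
# depending on which reduction is applied to the observed DSSP string.
dssp = "HHHGGGEEEBTS"
predicted = "HHHHHHEEEECC"
score_a = q3(predicted, reduce_states(dssp, REDUCTION_A))
score_b = q3(predicted, reduce_states(dssp, REDUCTION_B))
voted = consensus(["HHEC", "HEEC", "HHCC"])
```

In this toy case the prediction matches every residue under the first reduction and only 8 of 12 under the second, which is the kind of reduction-dependent shift in apparent accuracy the abstract warns about.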