Motivation: Evaluating the accuracy of predicted models is critical for assessing structure prediction methods. Because this problem is not trivial, a large number of different assessment measures have been proposed by various authors, and it has already become an active subfield of research (Moult et al., 1999). The CASP (Moult et al., 1997, 1999) and CAFASP (Fischer et al., 1999) prediction experiments have demonstrated that it is difficult to choose one single, 'best' method to be used in the evaluation. Consequently, the CASP3 evaluation was carried out using an extensive set of especially developed numerical measures, coupled with human-expert intervention. As part of our efforts towards a higher level of automation in the structure prediction field, here we investigate the suitability of a fully automated, simple, objective, quantitative and reproducible method that can be used in the automatic assessment of models in the upcoming CAFASP2 experiment. Such a method should (a) produce one single number that measures the quality of a predicted model and (b) perform similarly to human-expert evaluations.
Results: MaxSub is a new and independently developed method that further builds on and extends some of the evaluation methods introduced at CASP3. MaxSub aims at identifying the largest subset of C-alpha atoms of a model that superimpose 'well' over the experimental structure, and produces a single normalized score that represents the quality of the model. Because there exists no evaluation method for assessment measures of predicted models, it is not easy to evaluate how good our new measure is. Even though an exact comparison of MaxSub and the CASP3 assessment is not straightforward, here we use a test bed extracted from the CASP3 fold-recognition models. A rough qualitative comparison of the performance of MaxSub vis-a-vis the human-expert assessment carried out at CASP3 shows that there is good agreement for the more accurate models and for the better predicting groups. As expected, some differences were observed among the medium to poor models and groups. Overall, the top six predicting groups ranked using the fully automated MaxSub are also the top six groups ranked at CASP3. We conclude that MaxSub is a suitable method for the automatic evaluation of models.