Y. Guermeur et al., Improved performance in protein secondary structure prediction by inhomogeneous score combination, BIOINFORMAT, 15(5), 1999, pp. 413-421
Motivation: In many fields of pattern recognition, combining predictors has
proved effective at improving on the generalization performance of individual
prediction methods. Numerous systems, based on different principles, have been
developed for protein secondary structure prediction. Finding better ensemble
methods for this task may thus become crucial. Furthermore, efforts are needed
to help the biologist post-process the outputs.
Results: An ensemble method has been designed to post-process the outputs of
discriminant models, in order to improve prediction accuracy while generating
class posterior probability estimates. Experimental results establish that it
can increase the recognition rate of protein secondary structure prediction
methods that provide inhomogeneous scores, even though their individual
prediction successes differ widely. This combination thus helps the biologist,
who can use it confidently on top of any set of prediction methods. Moreover,
the resulting estimates can be used in various ways, for instance to determine
which areas in the sequence are predicted with a given level of reliability.
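
To make the two ideas in the Results paragraph concrete, the following is a minimal sketch and not the authors' combiner, which the abstract does not specify. It assumes three classes (helix, strand, coil), maps each method's raw scores onto a common scale with a softmax, and averages them with fixed, hypothetical weights. It then lists the runs of residues whose top estimate reaches a chosen threshold. The names `combine` and `reliable_segments`, the weights, and the threshold are all illustrative assumptions.

```python
import math

# Assumed three-state alphabet: helix, strand, coil.
CLASSES = ("H", "E", "C")


def softmax(scores):
    """Map raw scores of any scale to positive values that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def combine(per_method_scores, weights):
    """Combine the scores that several methods give to ONE residue.

    per_method_scores: one [score_H, score_E, score_C] list per method.
    weights: one non-negative weight per method (hypothetical values; a
    real combiner would fit them on training data).
    Returns a weighted average of the softmax-normalized scores.
    """
    if len(per_method_scores) != len(weights):
        raise ValueError("need exactly one weight per method")
    weight_sum = sum(weights)
    combined = [0.0] * len(CLASSES)
    for scores, weight in zip(per_method_scores, weights):
        for k, p in enumerate(softmax(scores)):
            combined[k] += (weight / weight_sum) * p
    return combined


def reliable_segments(posteriors, threshold):
    """Return inclusive (start, end) index pairs for runs of residues whose
    largest class estimate is at least `threshold`."""
    segments = []
    start = None
    for i, estimate in enumerate(posteriors):
        confident = max(estimate) >= threshold
        if confident and start is None:
            start = i
        elif not confident and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(posteriors) - 1))
    return segments


if __name__ == "__main__":
    # Two hypothetical methods scoring one residue on different scales.
    residue = combine([[2.0, 0.5, 0.1], [40.0, 35.0, 10.0]], weights=[1.0, 2.0])
    print(dict(zip(CLASSES, residue)))
    # Per-residue estimates for a short, made-up sequence.
    estimates = [[0.9, 0.05, 0.05], [0.8, 0.1, 0.1],
                 [0.4, 0.3, 0.3], [0.1, 0.85, 0.05]]
    print(reliable_segments(estimates, threshold=0.75))
```

The softmax step is only one way to put inhomogeneous scores on a shared scale, and a weighted average of normalized scores is not guaranteed to be a calibrated posterior estimate; both choices stand in for whatever trained combiner is actually used.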